University of Virginia Library

IV

In order to explain the remaining editorial problems that arise when the poetry of Emily Dickinson is finally confronted by the machine, it is necessary to outline the processes by which a computer can be used to make a concordance. After the text has been edited for the computer, the first step in transmitting it to the machine is to have the poetry punched, a line at a time, on IBM cards. Also punched with each line are identifying poem and line numbers. These cards are the basic data of the concordance, each card with its line providing the context for any indexed word within it. Because the concordance will be only as accurate as the cards, the punching of Emily Dickinson's poetry was, in effect, done twice. The first operator of the machine that punches the cards reproduces the text by "typing" it, but instead of printing the letters, the machine punches holes in the IBM cards. Another operator then repeats this procedure with the same cards on what is called a "verifier". The verifier checks electromagnetically that the holes already punched in a given card match the keys struck by its operator. Any discrepancy between the two operators' work is caught by the verifier and has to be corrected, or the card marked as defective, before the operator of the verifier can proceed. The only errors, then, that can survive this operation are identical mistakes made by the two operators. In order to check on these, all the cards were run through the printer (a machine that prints out the contents of the cards) and proofread against the text. Some idea of the accuracy obtainable through this method of punching and verifying, a method much more accurate than the ordinary routines of printing and proofreading, can be seen in the fact that only six errors were discovered in a text of over 100,000 words. Three of these mistakes, incidentally, were due to the editor's handwriting; the other three were identical mistakes made by the two operators. The total amount of time needed for punching and verifying was about 200 hours.
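
The double-keying check described above can be illustrated with a minimal sketch. It is not the verifier's actual mechanism, only a software analogue: the `Card` record, the `verify` function, and the sample lines are all hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    poem: int   # poem number punched with the line
    line: int   # line number within the poem
    text: str   # the line of poetry as keyed by an operator


def verify(first_pass, second_pass):
    """Return the (poem, line) ids of cards on which the two keyings disagree."""
    second = {(c.poem, c.line): c.text for c in second_pass}
    return [
        (c.poem, c.line)
        for c in first_pass
        if second.get((c.poem, c.line)) != c.text
    ]


# Invented sample lines, not quotations from the poems.
punched = [Card(1, 1, "FIRST SAMPLE LINE"), Card(1, 2, "SECOND SAMPEL LINE")]
rekeyed = [Card(1, 1, "FIRST SAMPLE LINE"), Card(1, 2, "SECOND SAMPLE LINE")]

mismatches = verify(punched, rekeyed)      # the second card is flagged
undetected = verify(punched, punched)      # identical keyings flag nothing
```

The second call shows why a final proofreading was still needed: when both operators make the same mistake, the comparison reports no discrepancy.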



After the complete text of Emily Dickinson's poems was punched, an additional 1,775 cards consisting of the shortened first-line titles of her poems were punched. Each of these cards was placed before the cards containing the lines of the poem for which it was the identifying title, and then the entire deck of cards—some 25,000 of them—was transferred onto magnetic tape. Emily Dickinson was now ready for the machine.

The following summary of what happens when the text is fed into the computer is based on the explanation given by the concordance's programmer, Mr. James A. Painter, in his preface to the Yeats concordance.[35] The program (the instructions written by the programmer to tell the machine what to do and when to do it) consists of four phases: input, sorting, output, and correction. During the input phase of the program, the computer first scans the magnetic tape and breaks each line of poetry into its component words, attaching to each word the complete line from which it comes as well as the line's identifying poem title and poem and line numbers. These words are then checked against a list of words to be omitted that was previously stored in the magnetic-core "memory" of the computer. The machine then transfers the words to be indexed, together with their lines, titles, and numbers, to a new series of tapes called "word blocks". In the sorting phase of the program each word block is alphabetized,[36] and then the individual word blocks are merged and alphabetized with one another. The amount of time taken by the input and sorting parts of the program was around ten hours: six for indexing and four for checking. In the third, or output, phase of the program, the contents of the final tape, which now contained all the words together with their contexts in alphabetical order, are listed by an IBM printer. Such features of the format as indentation of the entries after an index word or the use of spaced dots to separate these entries from their titles were also automatically done by the printer. The amount of time needed to print out the whole concordance was approximately ten hours, printers being among the slowest of a computer's auxiliary equipment.
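
The input, sorting, and output phases summarized above can be sketched in a few lines. This is a modern illustration under stated assumptions, not Painter's program: the stop list, the block size, the sample cards, and the exact printed layout are all hypothetical, and in-memory lists stand in for the tape "word blocks".

```python
import heapq
import re

STOP_WORDS = {"a", "the", "of", "and"}   # hypothetical omission list


def index_entries(cards):
    """Input phase: split each line into words and attach its context."""
    for title, poem, line_no, text in cards:
        for word in re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()):
            if word not in STOP_WORDS:
                yield (word, poem, line_no, title, text)


def build_concordance(cards, block_size=4):
    """Sorting phase: alphabetize each block, then merge the sorted blocks."""
    entries = list(index_entries(cards))
    blocks = [
        sorted(entries[i:i + block_size])
        for i in range(0, len(entries), block_size)
    ]
    return list(heapq.merge(*blocks))


def render(entries):
    """Output phase: index word flush left, its entries indented beneath it."""
    out, current = [], None
    for word, poem, line_no, title, text in entries:
        if word != current:
            out.append(word.upper())
            current = word
        # Spaced dots separate the line from its title (assumed layout).
        out.append(f"    {text} . . . {title} {poem}.{line_no}")
    return "\n".join(out)


# Invented sample cards: (title, poem number, line number, line text).
cards = [
    ("SAMPLE TITLE ONE", 1, 1, "The bird and the bee"),
    ("SAMPLE TITLE ONE", 1, 2, "A bee of the field"),
    ("SAMPLE TITLE TWO", 2, 1, "The field and a bird"),
]
concordance = build_concordance(cards)
listing = render(concordance)
```

Sorting small blocks and then merging them mirrors the constraint the article describes: the full set of entries did not fit in core memory, so it was alphabetized piecewise on tape and then merged.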



In the final phase of the program, editorial corrections involving the addition, replacement, or deletion of words and lines were made from this printing out of the tape. But before they could be made, two final editorial problems had to be met: the discrimination of homographs and the use of cross-references. Because the user of a concordance can distinguish homographs for himself from the contexts of their lines, the only homographs that had to be separated were those involving omitted words. The nouns "art", "will", "might", and "may", the verb "wilt", and the adjective "wont" turned out to be the only homographs included with omitted words; they were all separated from the non-essential verbs of the same spelling and retained in the concordance. The problem of cross-references is particularly troublesome with Emily Dickinson's poetry because of her erratic spelling. Painter had worked out for the other computer-concordances a program for automatically cross-referencing hyphenated compounds, but this was of little help because Emily Dickinson rarely used a hyphen. Her non-hyphenated compounds were cross-referenced by the editor when the compounds were unusual enough that a user in search of all occurrences of a word could not be expected to anticipate them. "Ashine", for example, was cross-referenced from "shine", but there is no reference from "stir" to "astir". Misspellings and uncommon variant spellings were cross-referenced, but only when they were not either alphabetically adjacent to each other or separated merely by different forms of the same word. Thus "woe", which comes right after "wo" in the concordance, was not cross-referenced, but "conceive" and "concieve" were. Much of the cross-referencing was only one way, from the usual to the unusual; "eye" is cross-referenced to "e'e", but not vice versa. In cases where more than one spelling appears in the text, the cross-referencing was done both ways, lest the user suppose Emily Dickinson was a consistent misspeller. There is, of course, no end to cross-referencing, but for the user who wants to be certain he has all the forms of a given word, there is a record of the poet's vocabulary included as an appendix of word-frequencies at the end of the concordance. Here all the words in the concordance are listed according to their frequencies.[37]
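
The adjacency rule for variant spellings lends itself to a small illustration. In the concordance this was an editorial judgment made by hand; the sketch below only restates the stated rule in code. The function name, the `same_word_forms` parameter, and the headword lists are hypothetical ("concern" and "wolf" are included merely as plausible intervening or neighbouring headwords, and the "color" group is invented).

```python
def needs_cross_reference(usual, variant, headwords, same_word_forms=frozenset()):
    """True when an unrelated headword falls between the two spellings.

    Adjacent spellings, or spellings separated only by other forms of the
    same word, are assumed to be easy for a reader to find unaided.
    """
    ordered = sorted(headwords)
    i, j = sorted((ordered.index(usual), ordered.index(variant)))
    between = ordered[i + 1:j]
    return any(word not in same_word_forms for word in between)


# Illustrative headword list, not the concordance's actual index.
headwords = ["conceit", "conceive", "concern", "concieve", "wo", "woe", "wolf"]

adjacent = needs_cross_reference("woe", "wo", headwords)
separated = needs_cross_reference("conceive", "concieve", headwords)

# Invented example of spellings separated only by a form of the same word.
forms_only = needs_cross_reference(
    "color", "colour", ["color", "colors", "colour"], same_word_forms={"colors"}
)
```

Here `adjacent` is false because nothing stands between "wo" and "woe", while `separated` is true because an unrelated headword intervenes between "conceive" and "concieve".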

After the corrections and the cross references have been decided upon, they are punched onto cards, transferred to tape, and then fed into the computer to modify the final tape of the concordance. The results of the final tape are listed by the printer and its special print wheels. Ideally, the total time involved from the first punching of cards to this printing is about a month of eight-hour days. Unfortunately, however, more time is needed in the still unautomated part of the procedure that involves cutting the pages of the concordance apart and pasting page-numbers on each sheet. But once this throwback to the scriptorium is overcome, the pages can be delivered to the press which publishes them by photo-offset, reducing the size of the original IBM type in order to fit the concordance into a manageable volume.