University of Virginia Library

Emily Dickinson and the Machine
by
S. P. Rosenbaum [*]

The use of a high-speed, electronic, data-processing machine—more commonly known as a computer—to make a concordance was described by Professor Stephen M. Parrish, the general editor of The Cornell Concordances, in the 1962 volume of Studies in Bibliography.[1] Drawing on his pioneering work in preparing computer concordances to the poetry of Matthew Arnold and William Butler Yeats, Parrish lucidly set forth the particular uses of such concordances as well as the processes involved in making them. He also anticipated and, it is to be hoped, relieved the fears of literary scholars who tend to confuse electronic means and humanistic ends. The third poet scheduled for publication in The Cornell Concordances is Emily Dickinson. Deo in machina volente, a concordance of her complete poems will appear in 1964. In preparing this concordance I have encountered a number of editorial problems, some of which did not arise with the Arnold and Yeats concordances; these problems have to do with the variorum nature of the definitive text, the punctuation and spelling in the poems, the absence of titles, the limitations of IBM type and formats, and the selection of "nonsignificant" words to be omitted from the concordance. The purpose of this paper is to discuss these problems along with the procedures of preparing the poetry for the computer in the hope of shedding some light on the editing of future computer concordances and on the uses of such a concordance for the study of Emily Dickinson's art. But before confronting Emily Dickinson with the machine, it is necessary to explain the background and special features of the definitive text of her poetry on which the concordance was based.



Unless the maker of a concordance attempts the extraordinary task of re-editing a text in and through his concordance, his work will be only as good as the editions on which it is based. The history of Emily Dickinson's poetry is a case in point. The state of her manuscripts, the "creative editing" of her first editors, and the feud between the poet's editors and relatives combined to produce a remarkable chaos in the editions of her poetry. Despite this chaos, a combination word-index and concordance[2] to Emily Dickinson's poetry was done by Louise Kline Kelly as a doctoral dissertation in 1951.[3] In a listing confined to nouns, verbs, adjectives, and adverbs, Dr. Kelly gives the line-contexts of only those words occurring fewer than ten times in Emily Dickinson's poetry; words occurring more frequently are simply given a list of references to where the word may be found. Because practically none of Emily Dickinson's poems have titles, these references had to be to the page and line numbers of particular volumes rather than to the poems themselves. And of the five volumes of Emily Dickinson's poetry, only the last—Bolts of Melody—adhered with fidelity to the original manuscripts.

Dr. Kelly's dissertation was used by Thomas H. Johnson in preparing the three-volume variorum text, The Poems of Emily Dickinson,[4] which finally ordered the manuscripts and editions of her poetry. But with its addition of forty-one new poems, its arrangement of the poems, and its inclusion of their numerous authorial variants, Johnson's edition rendered Dr. Kelly's concordance obsolete; it can be used only with the older inaccurate editions, just one of which attempts to record important manuscript variants. Yet as various acknowledgments testify, Dr. Kelly's work has been of invaluable aid to scholars and critics, and she has put users of Johnson's edition and of the concordance based on it considerably in her debt.



The nature of Johnson's The Poems of Emily Dickinson and the problems it poses for a concordance can best be appreciated by reviewing the states of the manuscripts from which the edition was constructed. Holographs exist for all but 119 of the 1775 poems in the edition, and these manuscripts according to Johnson's analysis exist in one or more of three stages of composition: there are fair copies which Emily Dickinson appears to have finished, there are semifinal drafts which also appear to be finished except for alternative choices of words written between the lines or at the sides or bottoms of the manuscripts, and finally there are worksheet drafts which range from rough jottings to elaborately reworked drafts. Many of Emily Dickinson's poems are to be found in more than one manuscript state, and a number of them exist in slightly different fair copies. The problem of arranging all these manuscripts, grouping versions of the same poem together, and then selecting the main text for each poem was perhaps the editor's most challenging task. In his introduction to The Poems of Emily Dickinson Johnson wrote that the purpose of the edition was "to establish an accurate text of the poems and to give them as far as possible a chronology."[5] As a basis for an accurate chronology Johnson used Theodora Van Wagenen Ward's analysis of Emily Dickinson's changing handwriting.[6] Once a chronology was established and the manuscripts of each poem grouped together, the poems were assigned numbers according to their chronological sequence. The major problem in grouping and numbering the poems was the selection, from among the poems that existed in more than one manuscript, of texts to be given what Johnson calls "principal representation"[7] in large type under the given poem numbers. In order to maintain the chronological order of the poems, Johnson chose, wherever possible, the earliest fair copy of each poem;[8] other versions were given in smaller type below the main text. 
This decision has resulted in some misunderstanding and misuse of Johnson's edition because the text selected for principal representation is not necessarily the best version of the poem. A later fair copy or an earlier semifinal draft may contain variants that are poetically better than the readings in the earliest fair copy; or the alternative words written at the bottom of a semifinal draft may be preferable to those in the body of the poem.[9] Subsequent users of Johnson's edition—particularly the editors of anthologies—have too often used a poetically inferior text, given principal representation in the definitive edition, when there was no chronological reason for doing so.[10] At times it appears that the size of the type alone confers some special authority or poetic quality on the main texts.

Next to the ordering of manuscripts, the most difficult problem in editing Emily Dickinson's poetry must have been the transcribing of her manuscripts into print. When Emily Dickinson's first editors transcribed—perhaps translated is the better term here—the notation of her manuscripts into forms that could be found in a printer's font, the poet's sister Lavinia Dickinson remarked on the result that "The rules of printing are new to me & seem in many cases to destroy the grace of the thought but of course this can't be helped, I suppose."[11] Although Lavinia seems not to have been the reader that her sister was, this comment might well have come from Emily Dickinson herself, had she permitted the printing of her work. But because she remained essentially a private poet, she could indulge her whim—and this she did, particularly in capitalization and punctuation. She not only capitalized with Germanic abandon, she also used several sizes of letters in between capitals and small letters. In punctuation her favorite mark was, of course, the dash, and she used it in a variety of places and in a variety of ways that ranged from slightly elongated commas to what can only be called short lines. Critics who have speculated on the function of these marks usually agree about their importance but disagree about their purpose. It was inevitable that Johnson's transcriptions of Emily Dickinson's capitalization and punctuation would be criticized,[12] but apart from minor errors it seems clear that the only alternative to the impressive accuracy of his transcriptions would be a facsimile edition. Apart from the small detail that it would be impossible to base a concordance on such an edition, a facsimile edition would merely postpone
the problem of presenting Emily Dickinson's poetry to any other readers than a small circle of Emily Dickinson experts, bibliographers, and perhaps a cryptographer or two.[13]

II

Emily Dickinson's capitalization proved to be no problem at all in editing her poetry for a computer because IBM type is all upper case, and her punctuation provided only minor difficulties. The special print wheels that were purchased for The Cornell Concordances equipped the IBM printing machine that would produce the final pages of a concordance with all the punctuation marks in Johnson's edition.[14] Three minor changes in the definitive text were required in fitting Emily Dickinson's punctuation to the machine, however. First, the brackets that Johnson occasionally used to indicate his insertion of a letter or reconstruction of a word in a torn manuscript were silently dropped because brackets were needed for other purposes in the concordance. Secondly, the numbers that Emily Dickinson used twice in her poetry had to be changed from figures to words because the punctuation in her poetry had to be coded by numbers in order to prepare it for the computer. Finally, the eight occasions on which Emily Dickinson used single quotation marks—none of which followed double quotation marks—had to be changed to the double quotation marks usually used in her poetry; this was necessary because the computer's processes of alphabetizing treated single quotation marks as if they were apostrophes and apostrophes as if they were letters. Thus the computer would have alphabetized separately any word preceded by a single quotation mark.
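
The effect of that alphabetizing rule can be shown in a brief modern sketch. It is an illustration only, not the program used for The Cornell Concordances: the function name `sort_key`, the set of edge punctuation, and the sample tokens are all assumptions made here, and only the rule that an apostrophe counts as a letter comes from the description above.

```python
import string

# Punctuation stripped from the edges of a token before alphabetizing.
# The apostrophe is deliberately left out: per the text, the machine
# treated apostrophes (and hence single quotation marks) as letters.
EDGE_PUNCTUATION = string.punctuation.replace("'", "")


def sort_key(token: str) -> str:
    """Return the heading under which a token would be alphabetized."""
    return token.strip(EDGE_PUNCTUATION).upper()


# Hypothetical tokens: the same word in double quotes, in single quotes,
# followed by a comma, and a word with a genuine apostrophe.
tokens = ['"HOME"', "'HOME'", "HOME,", "IT'S"]
headings = sorted({sort_key(t) for t in tokens})
print(headings)
```

The singly quoted token keeps its quotation marks and so receives a heading of its own, apart from the plain word, which is the separation the change to double quotation marks was meant to prevent.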

If Emily Dickinson's punctuation offered no interestingly difficult problems, the variorum nature of The Poems of Emily Dickinson did. Johnson's edition included not only variants inserted in the manuscripts by the poet, but also variants to be found in differing manuscript versions of the same poem, variants in transcriptions of the poems
made by friends, relatives, and editors, and finally variants in the published versions of her poems. The extent of these variants ranges from single words or phrases through lines and stanzas to what can only be called variant versions of complete poems. The inclusion of all of them is the most valuable feature of Johnson's edition, and a concordance that was to utilize them would have to be a variorum concordance. It is important to remember that Emily Dickinson was a private poet, that her poetry was not "finished" and the final choices between variants not made. To ignore the variants and make a concordance only of the earliest versions of the poems would seriously misrepresent the text of the poems that Johnson established and severely limit the value of the concordance. Yet not all kinds of the variants given in The Poems of Emily Dickinson were worth including in the concordance. Since the concordance is to the poet's words, variants only in spelling,[15] punctuation, and word-order were excluded; but when any of these types of variants accompanied a variant wording, they were retained in the concordance. Variants in line and stanza order were omitted because there was no way of representing them in the single lines that are the units of a concordance. Also excluded were published variants in poems for which there were manuscripts or reliable transcripts. Johnson's inclusion of all published variants makes a fascinating record of editorial corruption, but there was no point in perpetuating these corruptions in a concordance.[16] Variants deriving from transcripts of poems which also exist in holograph manuscripts yielded a thornier group of problems. The sources of these transcripts and their varying degrees of reliability are matters too involved to summarize here. Suffice it to say that variants in transcripts noted by Johnson as probably deriving from manuscripts now lost were included.

Determining what variants to include was of little help in the much more difficult problem of deciding how they were to be included. In The Poems of Emily Dickinson the variants are given either in the separate versions[17] listed after the principally represented text or at the
end of the poems in which variants appear as alternative choices in manuscript. The principal editorial problem was how to present these separately noted variants in the individual lines of poetry. The simplest solution would have been to treat all variants as if they were variant lines. This could be done by filling out the variant word or phrase with the parts of the line that were invariant and then marking the line with the letter "V", next to the line number, as an indication of a variant line. This procedure, while obviously the way to handle variants that were complete lines, had serious disadvantages when used with parts of lines. To produce a variant line when there was in fact only a variant word is a somewhat misleading representation of the text that would increase the automatic word-frequency counts to be made by the computer and included as an appendix to the concordance. Given the number of Emily Dickinson's variants, the changes in frequencies could be extensive. Such a solution would also fail to show a very important aspect of the variant's context: the word or words for which the variant was introduced. A single method for handling all variants was abandoned, therefore, and the kinds of variants were treated in different ways according to whether they were single words, phrases, stanzas, or versions of an entire poem.[18]

Variant words, then, were included in the concordance by enclosing them in brackets and inserting them into the line of the principal text after the word for which they were variant. When more than one variant was given for a word, the alternative variants were separated from each other within the brackets by a slash mark. Thus Emily Dickinson's description of despair in the next to last line of poem #640 ("I cannot live with You—"), together with the two variants written in the manuscript for the last word in the line, appears as follows in the concordance:

AND THAT WHITE SUSTENANCE—[EXERCISE / PRIVILEGE]
Here as elsewhere in the concordance the word or words that the variants replace can usually be determined simply by noting the number of syllables in the variant and in the words preceding the bracketed insertion. By including variant words in this manner it was possible for the concordance to juxtapose different variants with the words for which they were variants and—most important—to index the variant as well as the invariant words.[19]
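
The bracket-and-slash notation lends itself to mechanical treatment, as the following sketch suggests. It is a present-day illustration under stated assumptions, not a reconstruction of the original program: the names `index_words` and `variants`, the choice of which punctuation to strip, and the treatment of the dash as a word separator are all hypothetical.

```python
import re

# Editorial markers (brackets, slashes) and the long dash, written here as
# the escape \u2014, separate words but are not themselves indexed.
MARKERS = re.compile(r"[\[\]/\u2014]")
BRACKET = re.compile(r"\[([^\]]*)\]")
EDGE = '.,;:!?"-'  # punctuation trimmed from the edges of a word


def index_words(line: str) -> list[str]:
    """Return every indexable word in an edited line, main text and variants alike."""
    words = (w.strip(EDGE) for w in MARKERS.sub(" ", line).split())
    return [w for w in words if w]


def variants(line: str) -> list[list[str]]:
    """Return the alternative readings inside each bracketed group of a line."""
    return [
        [alt.strip() for alt in group.split("/")]
        for group in BRACKET.findall(line)
    ]


line = "AS IF A DRUM [THE DRUMS] WENT ON AND ON."
print(index_words(line))
print(variants(line))
```

Because the variant sits inside the line rather than in a line of its own, the invariant words are counted once only, while the variant words are still indexed.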

Editing variant phrases for inclusion within the line, alongside the phrases for which they were variants, proved to be the most difficult problem in preparing Emily Dickinson for the machine. Variant phrases cannot be treated merely as variant words because frequently the phrases cannot be matched, word for word, with the phrases in the main text. Even if they could be always matched, breaking up the phrases would mean ignoring the unity—and hence often the meaning—of the phrases. Sometimes it was impossible to do otherwise. But in many instances the words of a variant phrase could be kept together. The variants, for example, to the phrase "so eminent a sight" (poem #1265, line 4) in a worksheet draft are "Another such a might", "So adequate", and "So competent a sight". To treat these phrases merely as variant words would obscure the relationships between the adjectives and nouns—and thus partly defeat one of the principal purposes of a concordance which is to provide the contexts of the indexed words. By keeping the words of the phrases together and by separating alternate variants with slash marks, it is possible to indicate which adjectives go with which nouns. The line, then, as it was edited for the concordance, appears as follows:

I FAMISH [PERISH] TO BEHOLD SO EMINENT A SIGHT [ANOTHER SUCH A MIGHT / SO DELICATE A MIGHT / SO ADEQUATE A MIGHT / SO COMPETENT A SIGHT][20]
In this line, as in most of the others where variant phrases were worked in, it is again possible to see how far back in the line the phrase refers by counting syllables. When the variant phrase differs in the number of syllables from the phrase in the main text, the sense of the phrase usually makes clear what it is a variant of. When on occasion even this does not occur, the user of the concordance must have recourse to Johnson's edition which, in any case, should never be very far away from the concordance.

Sometimes the editing of variant phrases for insertion into the lines of a principal text involved the deletion of words in the main text that were repeated in the variant, or the repetition of words in the main text that were omitted from the variant phrase, as in #1265. In poem #1448, for example, the two variant phrases for "Intent upon it's[21] own career" are "Intent upon it's mission quaint" and "circuit quaint". In fitting these words into the line, the first three words of the first variant phrase were not repeated in the variant after appearing in the main text. In the concordance the line reads as follows:

INTENT UPON IT'S OWN CAREER [MISSION QUAINT / CIRCUIT QUAINT]
And in poem #1343, line 2, Emily Dickinson wrote as variants for the phrase "Was all that saved", first "alone sustained—" and then simply "upheld—". Johnson, to make the variant clearer, uses a bracketed "Alone" with "upheld"; here as elsewhere this clarification of variants was followed, and the line appears in the concordance as,
WAS ALL THAT SAVED [ALONE SUSTAINED— / ALONE UPHELD—] A BEE
It should be stressed, however, that this editing procedure does not involve adding or removing words from Emily Dickinson's poetry, but simply filling out elliptical phrases with words from the main text or removing repetitions that were used to indicate the place of a variant phrase in a line. Even with these procedures, it was not always possible to avoid all repetition, as is seen in the example from poem #1343. Nor was it always possible to keep the different words of a variant phrase together. Sometimes the various combinations requiring insertion were too complex to be fitted together as one or more variant phrases; in these cases the phrases were treated as separate words or
combined into separate variant lines. The texts that involved this kind of editing were almost always worksheet drafts whose definitive reconstruction is impossible for an editor, let alone a concordance-maker.[22] Sometimes the inclusion of a variant phrase would have meant the repeating of nearly an entire line in order to keep the words of a variant phrase together. In these cases it seemed advisable to separate the words rather than swell the bulk of the concordance and the totals of the word-frequencies. When a variant phrase was in effect a variant line—when, in other words, it lacked only a word or two of being a completely different line—it was treated as an independent variant line and the missing words were supplied from the original line for which the phrase was a variant.

In addition, then, to bracketed variant words and variant phrases, a variorum concordance to Johnson's edition had to make use of variant lines. A given variant was handled as a variant line when all, or nearly all, the words in the line differed from the corresponding line in the principally represented text—or when complicated variant phrases could not be fitted into the lines of the main text. Two different kinds of variant line were used: numbered and unnumbered. Numbered variant lines consist of lines clearly variant to a given line in the main text; the number of the variant line is the same as that of the line for which it was variant, the only difference in the identification of the two lines being a "V" alongside the number of the variant line. Unnumbered variant lines—lines marked with a "V" but given no line number—were used for unplaced variant lines to a poem and for lines and stanzas of poems that were not included in the text chosen for principal representation by Johnson. In poem #1393 ("Lay this Laurel on the One"), for instance, a draft of the poem has an opening four-line stanza that Emily Dickinson omitted from the fair copy that was sent in a letter. In the concordance each line of the omitted stanza is given as an unnumbered variant line. The distinction, in short, between numbered and unnumbered variant lines in the concordance is the difference between alternative and additional lines of a poem.
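
The three kinds of line reference just distinguished can be modelled in a few lines. The sketch below is an assumption-laden illustration, not the concordance's actual record layout: the class `LineRef`, its field names, and the words "main", "alternative", and "additional" returned by `kind` are all introduced here for the purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LineRef:
    poem: int               # Johnson's poem number
    line: Optional[int]     # None for an unnumbered variant line
    variant: bool = False   # True when the line carries a "V"

    def __post_init__(self) -> None:
        # A line without a number is by definition a variant line.
        if self.line is None and not self.variant:
            raise ValueError("only variant lines may be unnumbered")

    def label(self) -> str:
        """Render the line column: a number, 'V' plus a number, or a bare 'V'."""
        if self.line is None:
            return "V"
        return f"V{self.line}" if self.variant else str(self.line)

    def kind(self) -> str:
        """Classify the line as main text, alternative line, or additional line."""
        if not self.variant:
            return "main"
        return "alternative" if self.line is not None else "additional"


for ref in (LineRef(582, 15), LineRef(582, 15, variant=True), LineRef(1393, None, variant=True)):
    print(ref.poem, ref.label(), ref.kind())
```

A numbered variant shares its number with the line it replaces, so the two sort together under the poem; an unnumbered one carries only the poem number.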

The Poems of Emily Dickinson includes not only variant words, phrases, lines, and stanzas, but also what amount to variant poems. For ten poems in Johnson's edition the main texts are given as double versions of the same poem; since these constitute two poems under one number, it was necessary to combine them in some way for the concordance.[23] This was done either by listing certain words of one version as variants to the other—as in poems #494, #1213, and #1282 where the differences are quite minor—or through the use of numbered and unnumbered variant lines. How this was done can be seen in the way the most famous double-version poem—#216 ("Safe in their Alabaster Chambers—")—was handled in the concordance. The 1859 version of this poem was taken as the main text in the concordance, and the lines of both the 1861 fair copy and its worksheet draft were treated as variants to the 1859 version. The additional lines and stanzas of the 1861 manuscripts were clearly not variant to specific lines of the 1859 manuscript, hence they had to be listed as unnumbered additional variant lines. And just as the arrangement in Johnson's edition does not imply any evaluation of the quality or authority of the two versions, so in the concordance the unnumbered variant lines are not any less significant or valuable than the numbered ones.

Other editorial problems involved in making a variorum concordance to Johnson's edition centered around the punctuation of variant words and phrases. When the manuscript punctuation of single-word variants differs from the main text, the difference is usually a dash following the variant. These were not reproduced in the concordance because Emily Dickinson seems to have used the mark mainly to separate alternative variants. The punctuation of variant phrases was followed exactly in the concordance because of the greater potential significance of punctuation in the meaning of phrases. When the final punctuation of the phrase was identical with that in the main text, it is given after the bracketed variant phrase or phrases—thus indicating that the punctuation is the same for both readings. When the terminal punctuation of variant phrases differed from that in the main text, the variant punctuation follows the variant phrase in the brackets, and the
final punctuation of the principal-text phrase precedes the bracketed insertion of variants. Two types of variant "punctuation" had, nevertheless, to be omitted from the concordance. Even with the special print wheels made for the Cornell concordances, it was impractical if not impossible to write a program and set up an IBM printer to print lines either under or through words. Thus all cancels and italics in the manuscripts had to be dropped. This is not a particularly serious sacrifice with the cancels because they are so infrequent in Emily Dickinson's manuscripts. Less than one per cent of the more than 100,000 words in her manuscripts are crossed out. There are even fewer instances of underlined variants, yet these have greater significance because Emily Dickinson appears to have indicated to herself—at the time of composition or revision—the alternative choices she preferred by underlining them. Yet Johnson notes instances where later fair copies of poems did not adopt the underlined variants to be found in earlier semifinal or worksheet drafts of a poem, and he concludes that "the mood of the moment played its part."[24] The inability of the computer and its peripheral equipment to convey the results of these moods is one of the small but unfortunate sacrifices involved in combining Emily Dickinson and the machine.

III

Before the variants were edited into the lines of the main texts in Johnson's edition, certain corrections had to be made in the edition itself. As Jay Leyda notes in his valuable review,[25] the thoroughness of the edition throws the smallest errors into sharp relief. Because these errors and other new discoveries affecting Johnson's edition are not widely known, it is perhaps worthwhile to detail here the modifications made in The Poems of Emily Dickinson for the concordance. The edition actually used for the concordance was not the 1955 first printing of the first edition, but the 1958 second printing. The differences—with one important exception—are to be found in such minor alterations as the change of "Appendixes" to the less grammatically controversial "Appendix" and the printing right side up of a line of type that appears upside down in the first printing. The important exception is the list entitled "Corrections" that is hidden away on the verso of the appendix title page of the second printing. The substantive changes in the poetry itself to be found in this list are the corrections of "teases"
to "teazes" (#319, l. 6), "has" to "had" (#1254, l. 1 of the worksheet draft), "revelry" to "revery" (#1526, l. 12 of the Todd transcript), and "the" to "a" (#87, l. 2).[26] Also adopted in the concordance were corrections of errors noted by Charles R. Anderson; these include the change of "the" to "this" (#1068, l. 11 of the copy sent to Niles), the addition of a dash at the end of a line (#1271, l. 7), and the addition of "swift" to the list of variants for the phrase "sudden legacy" in the worksheet draft of #1333.[27] Corrections I have made while preparing Johnson's text for the computer include changes of "unknow" to "unknown" (#78, l. 8 of the pencilled copy), "Feet" to "Fete" (#794, variant note to l. 16), "world" to "would" (#1133, variant note to l. 8), "he" to "her" (#1496, variant note to l. 11), and "departure" to "departing" (#1773, variant note to l. 3).[28]

In his review of Johnson's edition, Leyda also noted that the number of poems in Emily Dickinson's canon was less than the 1775 given by Johnson because in three instances poems numbered separately are actually versions of other poems.[29] After writing his review Leyda discovered that the last poem in the canon was a stanza from a variant version of #1068 ("Further in Summer than the Birds").[30] These important modifications of Johnson's edition are noted in the preface to the concordance, but Johnson's original numbering was not changed because of the possible confusions that would result for the reader if all the numbers after #331 were changed. The principle followed in emending Johnson's text was to adopt substantive corrections involving the addition or deletion of words in the poems, but not to include corrections of the ordering of words or the numbering of poems.

Another editorial problem related to Johnson's numbering of the poems was the difficulty of identifying the poems merely by their
numbers. Only twenty-six[31] of the poems in Emily Dickinson's canon have titles supplied by the poet herself. Contrary to the prevalent thinking in telephone companies and the Post Office, numbers alone are not particularly easy things to remember. Not very many readers know the Psalms by their numbers and even fewer can identify Shakespeare's sonnets by theirs; it did not seem reasonable, therefore, that there would be any more readers familiar enough with Emily Dickinson's poems to identify them simply by their numbers. And to identify the lines of a poem only by number would render the concordance practically useless for any one using it with an edition or an anthology that does not give Johnson's numbering of the poems. The alternative for the editor of a concordance of making up titles for 1,749 poems was also unattractive—particularly since this had been tried with rather horrible results in the first editions of Emily Dickinson's poems. The problem of identifying her poems in the concordance was solved by using as a "title" as much of the first line of each poem as the format of the concordance allowed. This turned out to be twenty-four spaces. In these were put all the complete words of the first line that would fit. In several cases this solution meant that two poems had similar titles,—as, for example, with "The Butterfly's" (#1244) and "The Butterfly's Numidian" (#1387). Another possible solution to the problem of titles was to use parts of words in handling similar first lines, but this was abandoned; besides creating ugly and non-existent words, this method had rather ambiguous possibilities when applied to the shortening of, for example, the third word in "The Butterfly's Assumption". The twenty-six titles that Emily Dickinson herself used were treated as lines in the concordance and indexed along with the lines of the poems. 
The only difference here was that in place of a line number the symbol "T" was used to indicate the "line" in question was actually the title of a poem.
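
The rule for forming these first-line "titles" is simple enough to state as a short routine. What follows is a sketch of the rule as described above, not the procedure actually run on the machine; the function name `first_line_title` and the constant `TITLE_WIDTH` are hypothetical, while the figure of twenty-four spaces and the restriction to complete words come from the text.

```python
TITLE_WIDTH = 24  # spaces available for the first-line "title", per the text


def first_line_title(first_line: str, width: int = TITLE_WIDTH) -> str:
    """Keep as many complete words of the first line as fit in `width` spaces."""
    title = ""
    for word in first_line.upper().split():
        candidate = word if not title else f"{title} {word}"
        if len(candidate) > width:
            break  # the next word would overflow, so stop at the last whole word
        title = candidate
    return title


print(first_line_title("The Butterfly's Numidian"))
print(first_line_title("The Butterfly's Assumption"))
```

"The Butterfly's Numidian" fills the twenty-four spaces exactly, whereas "Assumption" would overflow them and is dropped whole, which is how the two poems come to have similar titles.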

The following excerpt from the concordance illustrates the manner in which the shortened first lines were used to identify the lines of the poems. It also shows how brackets and the "V" were used to handle variants in the concordance.



                                   
INDEX WORD  TEXT  FIRST LINE  POEM  LINE 
DRUM 
AND BRING THE FIFE, AND TRUMPET, AND BEAT UPON THE DRUM--  AWAKE YE MUSES NINE,  39 
A SERVICE, LIKE A DRUM--..............  I FELT A FUNERAL, IN MY  280 
FIRM TO THE DRUM--..................  UNTO LIKE STORY—TROUBLE  295  18 
THE EARTH HAS SEEMED TO ME A DRUM,....  WHEN I HAVE SEEN THE SUN  888 
SUBSEQUENT A DRUM—.................  THE POPULAR HEART IS A  1226 
BEFORE THE QUICK [RIPE / PEAL / DRUM / DRUMS / BELLS / BOMB /...  ONE JOY OF SO MUCH  1420 
AS IF A DRUM [THE DRUMS] WENT ON AND ON.  THE PANG IS MORE  1530 
DRUMMER 
THAT LIT THE DRUMMER FROM THE CAMP...  GOOD NIGHT! WHICH PUT  259  11 
DRUMS 
IT IS AS IF A HUNDRED DRUMS............  I HAVE A KING, WHO DOES  103 
OF THEIR UNTHINKING DRUMS--..........  I DREADED THAT FIRST  348  28 
DRUMS OFF THE PHANTOM BATTLEMENTS...  OVER AND OVER, LIKE A  367 
ARE DRUMS TOO NEAR—.................  INCONCEIVABLY SOLEMN!  582  15 
THE DRUMS TO HEAR--.................  INCONCEIVABLY SOLEMN!  582  V15 
AS COOL [DISTINCT] AS SATYR'S DRUMS—.....  DID YOU EVER STAND IN A  590  14 
This excerpt also indicates how the format of the concordance takes care of lines longer than forty-six spaces by doubling them back and omitting the spaced dots that link a single line with its first-line title. Lines longer than sixty-nine spaces—all that an eighty-space IBM card could consistently handle in addition to poem and line numbers—had to be divided. The shortness of Emily Dickinson's lines, even when extended with inserted variants, seldom made this procedure necessary—and then only when variants had been inserted into the lines. Line 6 of #1420, for example, appears with all its variants as follows:
BEFORE THE QUICK [RIPE / PEAL / DRUM / DRUMS / BELLS / BOMB / BURST / FLAGS / STEP / TICK / SHOUTS / PINK / RED / BLADE] OF DAY
The first part of this line, followed by an ellipsis, is given in the excerpt from the concordance. The second part of the line would be preceded by an ellipsis in the concordance; in this particular example, however, the line had to be divided into three parts, and the middle section of the line is both preceded and followed by ellipses.
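The division described above can be sketched in a few lines of Python. This is only an illustration: the article does not say where the program chose to break a line, so the greedy break at word boundaries, the function name `divide_line`, and the three-period mark are all assumptions; only the sixty-nine-space limit and the placing of the ellipses come from the text.

```python
def divide_line(line: str, limit: int = 69, mark: str = "...") -> list[str]:
    """Split a line into parts of at most `limit` characters.

    Every part but the last ends with `mark`; every part but the first
    begins with it. Breaking greedily at word boundaries is an assumption
    of this sketch, not the documented behaviour of the program.
    """
    line = " ".join(line.split())  # normalise spacing before measuring
    if len(line) <= limit:
        return [line]
    parts: list[str] = []
    current = ""
    for word in line.split():
        lead = mark if parts else ""  # later parts open with the mark
        candidate = f"{current} {word}" if current else word
        # Reserve room for a closing mark in case this part is not the last.
        if current and len(lead) + len(candidate) + len(mark) > limit:
            parts.append(lead + current + mark)
            current = word
        else:
            current = candidate
    parts.append(mark + current)
    return parts


LINE_1420 = ("BEFORE THE QUICK [RIPE / PEAL / DRUM / DRUMS / BELLS / BOMB / "
             "BURST / FLAGS / STEP / TICK / SHOUTS / PINK / RED / BLADE] OF DAY")
for part in divide_line(LINE_1420):
    print(part)
```

Because the sketch always leaves room for a closing ellipsis, it can strand a short final part; a program tuned for appearance would presumably balance the parts differently.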

One very important feature of the format of the concordance is the arrangement of lines under an index word. As the excerpt reveals, the lines are arranged according to Johnson's numbering of the poems, and this means that they appear in approximate chronological order. A table keying the poem numbers to their assigned dates of composition in Johnson's edition will appear in the preface to the concordance, thus permitting the user to examine at a glance the chronological use of any indexed word in Emily Dickinson's poetry.[32]

The last major editorial decision that had to be faced before giving Emily Dickinson to the machine concerned the kinds of words to be omitted from the concordance. For reasons primarily of cost it is not feasible to index every occurrence of non-essential words such as "a" and "the". To list every occurrence of these two particular words in Emily Dickinson's poetry would involve the addition of 2,680 and 6,134 lines respectively to the concordance—an increase of approximately ten per cent in the bulk of the concordance. The number of these kinds of words omitted from the concordance is quite small, compared to the number customarily omitted from manual concordances; the only consideration here was space, whereas the sheer labor involved in a hand concordance makes it desirable to omit as many words as possible. The following is a list of the principal words omitted from the indexing of the concordance; in addition, all forms of these words—plurals, contractions, etc.—were also omitted:

a although an and another at both but can could each either for from here how however if in into it itself must no nor not now of on or other should so than that the their them then there therefore these they this those though through thus to too upon what when where whether which who why would
Also deleted were all forms of the verbs "do" and "have", and all forms of the verb "to be" except "be" itself, which was kept because of Emily Dickinson's unusual and extensive use of its subjunctive form. "Like" and "as" were retained to provide lists of Emily Dickinson's similes. Also kept in were pronouns that are almost always omitted from concordances, manual or machine. All occurrences of "I", "we", "you", "he", and "she" together with their other forms will appear in the concordance because of their relevance to biographical and "persona" studies.

Some recent studies have shown, however, that these so-called "non-significant" words omitted from the concordance are actually very important in analyses of style.[33] But for those who need these words in studying Emily Dickinson, all is far from lost. It is possible to retrieve them from the complete tape of the concordance which will be stored at Cornell University and available to anyone who wishes to analyze Emily Dickinson's poetry in ways beyond what the concordance allows.[34]
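One further analysis of the kind envisaged in note 34, a listing of variant readings, can be sketched as follows. The layout of the tape itself is not described here, so the sketch works instead on lines written in the bracket notation of the printed excerpt; the function name `variants` and the regular expression are assumptions made for illustration.

```python
import re

# A word followed by a bracketed group of readings separated by " / ".
VARIANT = re.compile(r"(\S+)\s*\[([^\]]*)\]")


def variants(line: str) -> list[tuple[str, list[str]]]:
    """Return (anchor word, variant readings) pairs found in one line.

    The brackets follow the reading they vary, but the notation does not
    mark how many words a variant replaces, so only the word immediately
    before the bracket is reported as the anchor.
    """
    return [
        (match.group(1), [reading.strip() for reading in match.group(2).split("/")])
        for match in VARIANT.finditer(line)
    ]


print(variants("AS COOL [DISTINCT] AS SATYR'S DRUMS--"))
print(variants("AS IF A DRUM [THE DRUMS] WENT ON AND ON."))
```

Paired with the poem numbers, and so with Johnson's approximate dates, such a listing would give the raw material for a study of the poet's revisions.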

IV

In order to explain the remaining editorial problems that arise when the poetry of Emily Dickinson is finally confronted by the machine, it is necessary to outline the processes by which a computer can be used to make a concordance. After the text has been edited for the computer, the first step in transmitting it to the machine is to have the poetry punched, a line at a time, on IBM cards. Also punched with each line are identifying poem and line numbers. These cards are the basic data of the concordance, each card with its line providing the context for any indexed word within it. Because the concordance will be only as accurate as the cards, the punching of Emily Dickinson's poetry was, in effect, done twice. The first operator of the machine that punches the cards reproduces the text by "typing" it, but instead of printing the letters, the machine punches holes in the IBM cards. Another operator then repeats this procedure with the same cards on what is called a "verifier". The verifier checks electromagnetically to see that the holes already punched in a given card are the same as the operator of the verifier indicates through his keyboard that there should be. Any discrepancy between the two operators' work is caught by the verifier and has to be corrected—or the card marked as defective—before the operator of the verifier can proceed. The only errors, then, that can survive this operation are identical mistakes made by the two operators. In order to check on these, all the cards were run through the printer—a machine that prints out the contents of the cards—and proofread against the text. Some idea of the accuracy obtainable through this method of punching and verifying—a method much more accurate than the ordinary routines of printing and proofreading—can be seen in the fact that only six errors were discovered in a text of over 100,000 words. 
Three of these mistakes, incidentally, were due to the editor's handwriting; the other three were identical mistakes made by the two operators. The total amount of time needed for punching and verifying was about 200 hours.
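The logic of punching and verifying can be illustrated with a short Python sketch. It is an analogy only: the verifier was an electromechanical device working on the cards themselves, and the function name `verify` and the representation of each card as a string are assumptions.

```python
def verify(first_keying: list[str], second_keying: list[str]) -> list[tuple[int, int]]:
    """Compare two keyings of the same deck, card by card.

    Returns (card number, column) for the first disagreement on each card
    that differs, both counted from 1.
    """
    if len(first_keying) != len(second_keying):
        raise ValueError("the two keyings must cover the same number of cards")
    discrepancies = []
    for card_no, (first, second) in enumerate(zip(first_keying, second_keying), start=1):
        if first == second:
            continue
        shorter = min(len(first), len(second))
        # First column where the two operators disagree; if one card is
        # merely longer, the disagreement begins just past the shorter one.
        column = next(
            (i + 1 for i in range(shorter) if first[i] != second[i]),
            shorter + 1,
        )
        discrepancies.append((card_no, column))
    return discrepancies


punched = ["FIRM TO THE DRUM--", "ARE DRUMS TOO NEAR--"]
rekeyed = ["FIRM TO THE DRUM--", "ARE DRUMS TOO HEAR--"]
print(verify(punched, rekeyed))
```

When both operators make the same mistake, the two keyings agree and nothing is reported, which is why the cards still had to be printed out and proofread against the text.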



After the complete text of Emily Dickinson's poems was punched, an additional 1,775 cards consisting of the shortened first-line titles of her poems were punched. Each of these cards was placed before the cards containing the lines of the poem for which it was the identifying title, and then the entire deck of cards—some 25,000 of them—was transferred onto magnetic tape. Emily Dickinson was now ready for the machine.

The following summary of what happens when the text is fed into the computer is based on the explanation given by the concordance's programmer, Mr. James A. Painter, in his preface to the Yeats concordance.[35] The program—the instructions written by the programmer to tell the machine what to do and when to do it—consists of four phases: input, sorting, output, and correction. During the input phase of the program, the computer first scans the magnetic tape and breaks each line of poetry into its component words, attaching to each word the complete line from which it comes as well as the line's identifying poem title and poem and line numbers. These words are then checked against a list of words to be omitted that was previously stored in the magnetic-core "memory" of the computer. The machine then transfers the words to be indexed, together with their lines, titles, and numbers, to a new series of tapes called "word blocks". In the sorting phase of the program each word block is alphabetized,[36] and then the individual word blocks are merged and alphabetized with one another. The amount of time taken by the input and sorting parts of the program was around ten hours: six for indexing and four for checking. In the third, or output, phase of the program, the results of the final tape—which now contained all the words together with their contexts in alphabetical order—are listed by an IBM printer. Such features of the format as indentation of the entries after an index word or the use of spaced dots to separate these entries from their titles were also automatically done by the printer. The amount of time needed to print out the whole concordance was approximately ten hours, printers being among the slowest of a computer's auxiliary equipment.
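The first three phases can be made concrete with a small Python sketch. It is a modern illustration under stated assumptions, not a reconstruction of Painter's program: the names `Card`, `OMITTED`, `build_entries`, and `render` are invented here, the list of omitted words is a small subset, each word is indexed once per line, and everything is held in memory rather than merged from tape to tape.

```python
import re
from typing import NamedTuple


class Card(NamedTuple):
    title: str  # shortened first line identifying the poem
    poem: int   # Johnson's poem number
    line: str   # line number; a leading "V" marks a variant line
    text: str   # the line of verse, in capitals as on the card


# A small illustrative subset of the omitted words, not the full list.
OMITTED = {"A", "AN", "AND", "ARE", "AT", "IN", "IS", "OF", "THE", "THEIR", "TO", "TOO"}

WORD = re.compile(r"[A-Z]+(?:'[A-Z]+)*")


def build_entries(cards: list[Card]) -> list[tuple[str, Card]]:
    """Input and sorting phases: one (word, card) entry per indexed word."""
    entries = []
    for card in cards:
        # dict.fromkeys drops repeats within a line (an assumption of this sketch).
        for word in dict.fromkeys(WORD.findall(card.text.upper())):
            if word not in OMITTED:
                entries.append((word, card))
    # Comparing strings compares character codes, much as note 36 describes.
    # The sort is stable, so cards of one poem keep their order in the deck.
    entries.sort(key=lambda entry: (entry[0], entry[1].poem))
    return entries


def render(entries: list[tuple[str, Card]], width: int = 46) -> list[str]:
    """Output phase: index words flush left, contexts indented beneath them."""
    lines, current = [], None
    for word, card in entries:
        if word != current:
            lines.append(word)
            current = word
        context = card.text.ljust(width, ".")  # dots link the line to its title
        lines.append(f"  {context}  {card.title}  {card.poem}  {card.line}")
    return lines


deck = [
    Card("INCONCEIVABLY SOLEMN!", 582, "15", "ARE DRUMS TOO NEAR--"),
    Card("INCONCEIVABLY SOLEMN!", 582, "V15", "THE DRUMS TO HEAR--"),
    Card("I DREADED THAT FIRST", 348, "28", "OF THEIR UNTHINKING DRUMS--"),
    Card("UNTO LIKE STORY--TROUBLE", 295, "18", "FIRM TO THE DRUM--"),
]
print("\n".join(render(build_entries(deck))))
```

The fourth phase, correction, would amount to editing the list of entries before it is rendered again.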



In the final phase of the program, editorial corrections involving the addition, replacement, or deletion of words and lines were made from this printing out of the tape. But before they could be made, two final editorial problems had to be met: the discrimination of homographs and the use of cross-references. Because the user of a concordance can distinguish homographs for himself from the contexts of their lines, the only homographs that had to be separated were those involving omitted words. The nouns "art", "will", "might", and "may", the verb "wilt" and the adjective "wont" turned out to be the only homographs included with omitted words; they were all separated from the non-essential verbs of the same spelling and retained in the concordance. The problem of cross references is particularly troublesome with Emily Dickinson's poetry because of her erratic spelling. Painter had worked out for the other computer-concordances a program for automatically cross referencing hyphenated compounds, but this was of little help because Emily Dickinson rarely used a hyphen. Her non-hyphenated compounds were cross-referenced by the editor when the compounds were unusual enough so that a user in search of all occurrences of a word could not be expected to anticipate it. "Ashine", for example, was cross-referenced from "shine" but there is no reference from "stir" to "astir". Misspellings and uncommon variant spellings were cross referenced, but only when they were not either alphabetically adjacent to each other or separated merely by different forms of the same word. Thus "woe", which comes right after "wo" in the concordance, was not cross referenced, but "conceive" and "concieve" were. Much of the cross referencing was only one way, from the usual to the unusual; "eye" is cross referenced to "e'e", but not vice versa.
In cases where more than one spelling appears in the text, the cross referencing was done both ways, lest the user suppose Emily Dickinson was a consistent misspeller. There is, of course, no end to cross referencing, but for the user who wants to be certain he has all the forms of a given word, there is a record of the poet's vocabulary included as an appendix of word-frequencies at the end of the concordance. Here all the words in the concordance are listed according to their frequencies.[37]
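A frequency list of the kind described for the appendix is simple to derive once the indexed words are in hand, as the following Python sketch suggests. The function name `frequency_list` is an assumption, as is the alphabetical ordering of words that share a frequency; the article says only that the words are listed according to their frequencies.

```python
from collections import Counter


def frequency_list(words: list[str]) -> list[tuple[str, int]]:
    """Return (word, count) pairs, most frequent first.

    Words with equal counts are put in alphabetical order, which is an
    assumption of this sketch rather than a stated feature of the appendix.
    """
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


indexed = ["DRUM", "DRUMS", "DRUMS", "FIRM", "DRUMS", "DRUM"]
for word, count in frequency_list(indexed):
    print(f"{count:5d}  {word}")
```

A user checking for variant spellings could scan such a list for neighbouring forms, such as "wo" and "woe", that the cross references deliberately leave unlinked.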

After the corrections and the cross references have been decided upon, they are punched onto cards, transferred to tape, and then fed into the computer to modify the final tape of the concordance. The results of the final tape are listed by the printer and its special print wheels. Ideally, the total time involved from the first punching of cards to this printing is about a month of eight-hour days. Unfortunately, however, more time is needed in the still unautomated part of the procedure that involves cutting the pages of the concordance apart and pasting page-numbers on each sheet. But once this throwback to the scriptorium is overcome, the pages can be delivered to the press which publishes them by photo-offset, reducing the size of the original IBM type in order to fit the concordance into a manageable volume.

V

If, as Dr. Johnson maintained with unimpeachable authority, the making of dictionaries is "dull work," it would seem to follow that the making of concordances is deadly dull work. With an electronic computer, much of the dullness can be left to the machine to endure. The problems remaining for the maker of the concordance, while numerous, complex, and often undeniably irritating, are really not dull. For some, however, they may appear to be trivial problems; whether or not they are depends largely upon the ends that their solutions serve—upon, in other words, the uses of a computer concordance to the works of Emily Dickinson. Most people who know what a concordance is assume that its primary function is to locate poems or parts of poems that the user has forgotten. In the case of Emily Dickinson's poetry this service is not quite as insignificant as it might be for other poets because of the history of the publication of her manuscripts. As a tool, then, for restoring lines and stanzas, appearing elsewhere as complete poems, to the poems from which they were originally taken, the concordance is useful. Yet this function hardly justifies the human and inhuman labor and expense involved in preparing a computer concordance. A much more important value appears in the fact that a concordance to Emily Dickinson's poems is an index to the words, and consequently the images and ideas, of her art.[38] And unless the words that present these images and ideas are given in their contexts—contexts which include the poet's alternative choices for these words—they remain, as one critic has put it, "inert".[39] Through the inclusion of variants and through the chronological order of the entries under a word, the concordance could be indispensable for studies of Emily Dickinson's poetic development. The concordance is also potentially valuable, if used in the right ways, for biographical and canonical studies, and the preservation of the tape of the concordance makes it possible to go on to metrical and variorum studies. The usefulness of the concordance extends even beyond the poetry of Emily Dickinson, for the uniform features of the Cornell Concordances of Matthew Arnold and W. B. Yeats invite a variety of detailed comparison studies—studies which will be extended with the subsequent appearance of concordances to such poets as William Blake, Ben Jonson, and Andrew Marvell.

It must be admitted, however, that one use of a concordance has been inevitably lost by collaborating with a computer. This function was eloquently described by F. S. Ellis in the preface to his lexical concordance of Shelley published in 1892:

To those who would induce time to spread his wings, and who would drive from their spirits the cloud of minor vexations with which life is beset, I can heartily recommend the making of a concordance. The day when I began this work, six years since, seems but yesterday.
It is, alas, no longer feasible to ignore six years of minor vexations by making a concordance.

Notes

 
[*]

This paper is based on the preface to my forthcoming A Concordance to the Poems of Emily Dickinson (Ithaca, N.Y.: Cornell University Press).

[1]

"Problems in the Making of Computer Concordances," XV, 1-14. See also the next paper in this volume (pp. 15-31) "Electronic Computers and Elizabethan Texts" by Ephim G. Fogel.

[2]

The distinction between a word-index and a concordance is often blurred—as in the definition of a concordance given by Webster's Third New International Dictionary: "an alphabetical verbal index showing the places in the text of a book or in the works of an author where each principal word may be found often with its immediate context. . . ." The last part of the definition equivocates on the key issues; the definition also fails to note the existence of concordances to languages. For a clear distinction between word-indexes and concordances see Roberto Busa's introduction to Varia Specimina Concordantiarum of Aquinas's liturgical hymns (Milan, 1951). According to Father Busa a word-index is "a list which gives for each entry the numerical listing of quotations only" whereas "when under each word all the lines that contain this word are transcribed, one by one, we have the 'concordance'" (p. 8).

[3]

A Concordance of Emily Dickinson's Poems, The Pennsylvania State College.

[4]

Cambridge, Mass., 1955.

[5]

Poems, I, lxi.

[6]

Poems, I, xlix-lix.

[7]

Poems, I, lxi.

[8]

For the sequence of selection when no fair copy existed, see Poems, I, lxi-lxii.

[9]

The most famous example of this is to be found in "I taste a liquor never brewed—" (poem #214) in which the variant last line, "Leaning against the—Sun—" adds a perfectly appropriate concluding image that is absent from the original line, "From Manzanilla come!" which, in effect, adds only a rum name.

[10]

Johnson's own one-volume edition, The Complete Poems of Emily Dickinson (1960), unfortunately also does this. In his preface Johnson says he has adopted only those variants underlined by the poet, but his practice here is inconsistent.

[11]

Quoted in Millicent Todd Bingham's Ancestors' Brocades (1945), p. 60.

[12]

Edith Perry Stamm, for example, in her article "Emily Dickinson: Poetry and Punctuation," Saturday Review, March 30, 1963, pp. 26-27, 74, criticizes Johnson for not recognizing that the dashes are elocution marks to guide the reciter of the poems. Her theory is easily disposed of simply by trying to read facsimiles of the poems according to these supposed marks.

[13]

As a supplement to Johnson's edition a facsimile edition of at least the best poems would be welcome. With the current techniques of electroprinting, however, the individual student of Emily Dickinson can make a good beginning for himself by reproducing the facsimiles in Johnson's edition, in Bolts of Melody, and in Charles R. Anderson's Emily Dickinson's Poetry (N.Y. 1960).

[14]

The ordinary print wheels used by the IBM "printer" consist only of periods, parentheses, hyphens, and commas for punctuation marks. The Arnold concordance, which was printed before the costly print wheels could be purchased, was done without punctuation, but the Yeats concordance is fully punctuated.

[15]

But variant spellings that perhaps indicated different words—"straight" and "strait", for example—were included.

[16]

Three exceptions were made in applying this principle. Published variants for poems #59 and #160 were included because they appear to derive from manuscripts now lost. And the published second stanza of #57 was retained because it poetically complements the stanza of a poem whose manuscript had been torn off at the top. Poems in Emily Dickinson's canon that exist only in their published forms were, of course, included as such.

[17]

Johnson often notes in his edition the nature and location of variants in another version of the poem—but not always. Consequently it was necessary in preparing the concordance to collate all variant versions with the main text.

[18]

This classification of variants had to ignore whether the variants were from single worksheet or semifinal drafts; from worksheet, semifinal, and fair copies of the same poem; or from differing fair copies, semifinal drafts, and worksheet drafts. To indicate the provenance of each variant in the concordance would have entailed a bewildering system of signs; the only sensible alternative seemed to be to combine them, regardless of their sources, and to refer the user of the concordance to Johnson's edition for their origins.

[19]

One result of combining variants was that words repeated in different manuscripts were not repeated as such in the concordance. The concordance could not represent within the limits of a single line the stages of composition beyond the including of different variants after the word used in the earliest finished version. Thus, for example, in poem #329 ("So glad we are—a Stranger'd deem") there are two manuscripts, the fair copy represented in large type and a semifinal draft from which it was redacted. In the semifinal draft the last line of the poem, "Could not decide between—", is given as "Could not discern between—" and then variant choices for "discern" are listed as "conclude" and "decide". In giving this line in the concordance as COULD NOT DECIDE [DISCERN / CONCLUDE] BETWEEN it seemed unnecessary to repeat "decide" in the brackets to indicate that the word had also been a variant choice at one stage of the poem's composition.

[20]

An example of a variant in lining not handled in the concordance also occurs in this line—or in these lines, to be exact. In the worksheet draft of the poem the line is given as two lines: "I perish to behold" and "Another such a might". If the lining of the worksheet draft were kept in the fair copy, the variants would have been handled as variant lines. But as the concordance has to follow the lining of the main texts in Johnson's edition, what was a variant line in an earlier draft is a variant phrase in the fair copy and in the concordance.

[21]

The eccentricities of Emily Dickinson's punctuation were, of course, followed in the concordance.

[22]

See, for example, poem #533 ("Two Butterflies went out at Noon—") where a fair copy was later turned into a worksheet draft. Johnson's own ordering and lining of the variants in this manuscript have been questioned by William H. Matchett, PMLA, LXXVII (1962), 436-441. See also poem #1591, a worksheet draft so rough that Johnson attempted no reconstruction; in preparing it for the concordance, all that could be done was to group together the phrases that appeared to be variants of one another.

[23]

Nine of these poems—216, 433, 494, 824, 1213, 1282, 1357, 1358, and 1627—are labelled as versions I and II or earlier and later versions; poem #148 simply has an "or" separating the two versions given in a single manuscript. Johnson's editorial practice in giving double versions of these poems seems inconsistent, for in some of them he seems to have abandoned his procedure of giving principal representation to the earliest version of the best manuscript. In other cases, the minor differences between the double versions hardly make the distinction worthwhile, unless it was also to be applied to more radical differences between two or more versions of a poem given only a single representation in the edition. There was, at any rate, no way of including the double versions as such in the concordance without modifying in some way Johnson's numbering of the poems in the canon.

[24]

Poems, I, xxxv.

[25]

New England Quarterly, XXIX (1956), 239-245.

[26]

The list of corrections also adds dashes at the end of #290, l. 4 and #299, l. 4 of the copy sent to Susan Dickinson.

[27]

Emily Dickinson's Poetry, pp. 312, 321, 324, and 325.

[28]

There are also some very minor errors in the line numbering used to key variants to poems. These errors and their corrections are as follows:

  • #532, for 20] read 19]
  • #577, for 25] read 26]
  • #577, for 26] read 27]
  • #577, for 27] read 28]
  • #577, for 28] read 29]
  • #1479, for 7-8] read 7-8
  • #1508, for 15] read 11]
  • #1646, for 6. read 5.

[29]

#331 is an early variant of #342, #992 is an adaptation of #937, and #1616 is a slight adaptation of #1525. See NEQ, pp. 242-243.

[30]

The version found by Leyda is given in Anderson's Emily Dickinson's Poetry, pp. 324-325; it has one new variant—"candles" for the "candle" given in the Norcross transcript—and this new variant was included in the concordance.

[31]

Johnson's list of these in Poems, III, 1206, gives only twenty-four, but his edition shows additional titles for #1 ("Valentine Week") and #1545 ("Diagnosis of the Bible, by a Boy—").

[32]

The table will also incorporate the changes in order and date made after the text of the poems went to press and given by Johnson in his preface, Poems, I, lxv.

[33]

See, for example, Frederick Mosteller's and David L. Wallace's use of "while", "whilst", "upon", and "enough" in analyzing the authorship of the Federalist papers, "Notes on an Authorship Problem," Symposium of Digital Computers, Annals of the Computation Laboratory of Harvard University, XXXI (1962), 163-197.

[34]

It should be possible, for instance, to instruct a computer to search the tape and print out all of Emily Dickinson's variant words and lines together with their poem numbers and consequently their approximate dates. Such information could be of considerable value in studying the composition of her poems.

[35]

"Programmer's Preface," A Concordance to the Poems of W. B. Yeats (1963), pp. xxix-xxxvii.

[36]

The computer is really working with numbers in binary arithmetic rather than letters, and the words are alphabetized by comparing the numbers of the letters of a word with another word, arranging the two words in numerical order, and then proceeding to another word. If this operation suggests the way an idiot might alphabetize something, it is important to remember that the computer is a high-speed idiot, capable of comparing some 42,000 numbers a second.

[37]

Frequencies of all the omitted words are given in the preface.

[38]

Even used merely as a word-index, the value of the concordance extends far beyond the limited but useful "subject" index at the end of Johnson's edition. There are ways, however, in which Johnson's index complements the concordance.

[39]

Anderson, p. 318.

