Ralph Hanna [*]

Nobis et ratio et res ipsa centum codicibus potiores sunt.

—Richard Bentley

There are editors destitute of this discriminating faculty, so destitute that they cannot even conceive it to exist; and these are engaged in a task for which nature has neglected to equip them. What are they to do? Set to and try to learn their trade? that is forbidden by sloth. Stand back and leave room for their superiors? that is forbidden by vanity. They must have a rule, a machine to do their thinking for them. If the rule is true, so much the better; if false, that cannot be helped: but one thing is necessary, a rule.

—A. E. Housman [1]

I've enjoyed putting this paper together, for it's allowed me to play my way through the large data sample assembled on The Canterbury Tales Project CD-ROM.[2] I want to examine a single reading and passage from the Project's `The Wife of Bath's Prologue'. From the CD-ROM images, I have assembled data and checked it off against explanatory materials published on the disk and/or published in the Project's Occasional Papers. A long series of human experiences has taught me that I can take this reading and passage as exemplary (i.e., other passages I could have chosen would produce the same results, if not perhaps the same clarity of presentation).

I begin with a single lection, the Wife's question at D 115-117—in the form of the Project's base for collation:

Telle me also to what conclusioun

Were membres maad of generacioun

And of so parfit wys a wight ywroght?

From the CD-ROM, I've collected the ten most important manuscripts in the textual tradition, the four great early independent copies Hg El Gg Ha4, the exemplary representatives of Manly and Rickert's four large constant groups Dd (spot-checked against En1) Ne Cp Pw, and the two important later independent copies Ad3 Ch.[3] These read:

Hg  And of so parfit wys a wight ywroght [= Ad3 Ch] 
El  And for what profit was a wight ywroght 
Gg  And for what profyt was a wyf Iwrouʒt 
Ha4   And in what wise was a wight ywrouʒt 
Dd  And of so parfyt wyse a wyght Iwrought [= En1]
Ne  And of so parfit vice a wiʒt ywrouʒt 
Cp  And of so parfyt wise and why ywrought 
Pw  And of so parfit wise and whi ywrouʒt 

Leaving aside some other, mostly explicable variation, these suggest (especially in the context of the Project editors' outspoken veneration of the Hengwrt manuscript Hg)[4] that the Project's edited text of the line should be identical with Hg. Here I am particularly concerned with one word in this textus receptus, the penultimate `wight'.

`Wight' appears unexceptionable, the reading of thirty-one witnesses in all. Next most widely dispersed is the peculiar (and peculiarly anacoluthonic) Cp Pw variant `and why'; it is found in sixteen witnesses. Elsewhere one finds only: thyng Py; (that) were Bw, were Nl; and how Sl1; wright Ld2 Ln Ry2; the phrase is omitted in Ii, and the whole line in En2 He. The majority of all copies and the overwhelming evidence of those most usually trusted emphatically support `wight' as Chaucer's reading.

Unfortunately, there are two difficulties with this analysis, ably signalled years ago by E. Talbot Donaldson.[5] In his classic discussion of the line, which I here summarise, Donaldson demonstrated that:

(a) The word `wight' is not grammatical in this utterance; it cannot, in Middle English, refer to the deity, who is not a `thing'. To Donaldson's showing, one might add a reference to the modern word `not', low-stress variant of `nought', a derivative of the Old English compound `nā-wiht' `nothing'.

(b) The Wife happens at this point, as frequently, to be translating literally, in this case Jerome's rhetorical question from Adversus Iovinianum 1.36: `Et cur, inquies, creata sunt genitalia, et sic a conditore sapientissimo fabricati sumus...?'[6] Latin `conditor', regularly used in British Latin as a term for `God the creator', originally means `the builder/founder of a city', i.e., with the reinforcement of Jerome's `fabricati sumus', a carpenter or `wright' (cf. the modern surname `Wright', equivalent to `Carpenter', or such related onomastic forms as `Cartwright' and `Wheelwright').


What's going on here? The Canterbury Tales Project editors have done, by their lights, everything right and, as a consequence, have it all wrong. There is no family-tree principle, equitably and rationally applied, which can construe the reading of the isolated Ld2 Ln Ry2 as the reading of O, the copy provided the archetypal scribe(s) of Chaucer's `Wife of Bath's Prologue'. And yet that reading is, on the basis of all the evidence, most assuredly what Chaucer did write. One does not learn that from a machine or rule, but precisely on the basis of what Bentley calls `ratio' and Housman `discriminating faculty'—in this case Donaldson's thought, that of a distinguished philological scholar who knew his languages and was expert in applying them to a text and textual situation.

Moreover, an analysis of this variation should further undermine one's confidence that a `machine' might generate anything like `the truth' here—or anywhere else in `The Wife of Bath's Prologue' or The Canterbury Tales. Consider, that is meditate upon, think about, the various possible explanations for this situation. I present these as two opposed logical possibilities, but the bottom line, if you adopt either view (and I think you have to choose one), will be the same—`The Wife of Bath's Prologue' (and by extension, The Canterbury Tales) is not an appropriate text to treat by mechanistic processes of textual criticism.

(a) My first possibility I introduce by directing your attention to Peter Robinson's cladistic diagramme of the manuscript relations at this point in the text.[7] This shows, a finding I'd be well disposed to accept, that at least eight copyings must separate the common source Robinson postulates for the three manuscripts Ld2 Ln Ry2 from Chaucerian draft O. That is to say, the three MSS are related, their archetype was a copy extremely advanced in the chain of transmission, and thus there is every reason to expect that the archetype from which it received its text resembled the majority at this point, that it might have contained the majority reading I have already identified, following Donaldson, as erroneous.

In this scenario, whatever the scribe of the Ld2 Ln Ry2 exemplar did, he did not follow the reading of his exemplar. Rather, he imported a reading from elsewhere, either by consultation of another manuscript or by a lucky guess—an act which he performed because he recognized what is abundantly clear from nearly half the tradition, the twenty-five copies that do not transmit the reading `wight', that the text did not make sense in Middle English. (While I am absolutely certain that a number of the scribal teams behind several of the ten manuscripts I have cited in full knew Jerome, Chaucer's source-text, I am equally certain that none of them ever used it to spot-check the Wife's Prologue, a behaviour to be contrasted with several scribes in Manly-Rickert's b tradition who did in fact collate and `correct' `Chaucer's Tale of Melibee' against copies of its French source.)[8] The technical term for any of these activities is `conflation', and its discovery, on such a random basis anywhere in a variant sample, indicates that no single reading anywhere in the tradition can be taken as evidence for genetic descent on any basis other than raw faith.

(b) As a logical alternative, I examine another scenario, that the three scribes are in fact reporting precisely what their exemplar—and its exemplar—read. In this regard, one might consider the quality of variation which typifies the group of three manuscripts with `wright' and assess the possibility that they (or their exemplars) would have made a brilliant guess or sought aid by consultation of multiple copies. For this purpose, I've simply sampled their readings in sixteen lines surrounding the line under discussion (D 110-125). On this basis, I see no reason to dissent from Robinson's published views on this flavour of the text; he comments generally, `the character of these variants suggests scribal carelessness', and concludes by saying `th[is] version of the text... is the result of scribal carelessness and tinkering'.[9] In other words, these scribes (and their forebears) weren't brilliant and are most likely to have copied what they saw, or to have trivialised, not improved it. (I note in passing that the judgement that the scribes in this tradition were `careless' or `tinkerers' speaks only to the general quality of variation visible in their copies and does not preclude their having, in any individual instance, transmitted an accurate text—as they indeed do in the majority of lemmata.)

To follow out this logic, one must consider the thirty-one souls who wrote `wight' in their manuscripts. The Canterbury Tales Project will print this reading because it is so plainly not genetically bound: in my citations above, it appears as the reading in three of the four great early independent copies, in the Manly-Rickert constant groups a and b, and in two later independent copies of value. Excepting Cp and Pw (to which I will return), every one of these copies appears in the Canterbury Tales Project stemma (indeed, in any stemma one would construct, should one want to) closer to Chaucerian materials than the exemplar of Ld2 Ln Ry2. Hg was potentially derived directly from a version of O (although not, I would think, the same O available elsewhere in the tradition);[10] Ch and Ad3, here following exemplars related to Hg, second and third generation copies; Dd and El fifth generation.

Quite simply, if `wright' was licitly, i.e. by vertical descent through exemplars, available in the eighth generation of copying, it was also available somewhere in every previous copying generation. Thus, the exemplars of some substantial number of the thirty-one fellows who wrote `wight' in D 117 must also have read in their copy `wright', and they must have failed to communicate the reading that was before them in their copy.

One can meditate, think, a little as to why this might be so, and the answers will have very little to do with any outright scribal culpability. Of course, it is possible that many scribes simply saw `wright', didn't get it, and chose a word of similar shape they hoped would be communicative in context. This is, of course, `the substitution of homologous similars' (as opposed to that `of semantic similars').

But equally, inadvertence and accident might have had their role. No one reads whole words; we all make approximative guesses on the basis of a general graphemic shape. A good many scribes may have seen `w' as the front, `ght' at the end, identified the word as `wight', and copied it as such. Or alternatively, euphony may have been an issue. Scribes take up a unit to copy visually, repeat it to themselves as they transcribe, and write letters as they hear themselves speak them. In such a process `wright iwrought' may have been a bit of a tongue-twister and just accidentally have got heard (and copied) in dissimilated form, `wight iwrought'. Or finally, memorial contamination might be at issue. Given the `bespoke' conditions under which English books were produced until the 1460s, no copyist of this work was ever a virgin; all of them knew the text already, and the text they knew may have overridden in their copying whatever they saw before them in their exemplar.

As a consequence, the weight of evidence here means very little. In this second argument, there is every possibility that majority attestation for `wight' may be thoroughly accidental. Given the variety of ways in which this error might be made, it might be made over and over again, in various circumstances, with the effect that readings everywhere in the tradition converge in error. But these are not errors genetically transmitted; they are rather perfectly independent innovations. If this is the case, apparent genetic dispersion of a reading may mean nothing other than that, when faced with the same reading, scribes are thoroughly capable of multiplying the same error.
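The logic of independent innovation can be made concrete with a toy model. Everything in it is an assumption made for illustration and none of it comes from the Project's materials: a perfectly regular copying tree, a single per-copying chance (`p_error`, a hypothetical figure) that a scribe facing `wright' writes `wight', and no restoration of `wright' once it is lost. The sketch counts how many separate copyings introduce the error, as distinct from how many final copies carry it.

```python
import random


def simulate_tradition(p_error, generations, branching, seed=0):
    """Copy one reading down a regular tree (a hypothetical toy model).

    Returns (leaves, innovations): the readings of the final-generation
    copies, and the number of copyings at which `wright' was independently
    replaced by `wight'.
    """
    rng = random.Random(seed)
    current = ["wright"]  # generation 0: the archetype O
    innovations = 0
    for _ in range(generations):
        next_generation = []
        for exemplar_reading in current:
            for _ in range(branching):
                reading = exemplar_reading
                # Only a scribe facing `wright' can make this error; the
                # model never restores `wright' once it has been lost.
                if reading == "wright" and rng.random() < p_error:
                    reading = "wight"
                    innovations += 1
                next_generation.append(reading)
        current = next_generation
    return current, innovations


# Illustrative run only: 0.3 is an invented probability, not an estimate.
leaves, innovations = simulate_tradition(p_error=0.3, generations=8, branching=2)
print(leaves.count("wight"), "of", len(leaves), "final copies read wight;",
      innovations, "separate copyings introduced it")
```

Whenever the second figure exceeds one, the shared reading in the final copies marks repeated coincidence and not a single line of descent, which is the situation a branching-tree method has no means of representing.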

No branching tree model of transmission is prepared to deal with `convergence', for it is not the vertical procedure branching trees outline. Finding it potentially prevalent anywhere in a textual tradition renders that tradition not susceptible to genetic treatment. This then is another situation which undermines the possibility of Housman's `machine', a genetic analysis of the variant sample. In short, Peter Robinson deserves congratulations for a stunning and irrefutable discovery: in the effort to edit `The Wife of Bath's Prologue' genetically, through cladistic analysis, he has shown the futility of the `machine'-task he has constructed. For one or the other of my logical propositions must be true—either the archetype of Ld2 Ln Ry2 included the reading `wright' transmitted by these manuscripts licitly, by genetic descent, or it did not. If the second is the case, stemmatic procedures are inapplicable because of conflation; if the first, because of convergence.


Indeed the cladistic diagramme Robinson has prepared for this portion of `The Wife of Bath's Prologue' enshrines convergent variation as the normal state of affairs. Here I return to `wright' and its variation; although thirty-one copies read `wight', sixteen have instead `and why', the two I have already cited (Cp Pw), as well as Dl Fi Gl Ha2 La Lc Ld1 Mg Mm Ph3 Ry1 Se Sl2 To. In Robinson's diagramme, which indicates general agreement throughout the first 390-400 lines of the text, these sixteen copies cluster; fifteen constitute the majority of a group of twenty, all ultimately dependent on a single ninth-generation copying of `The Wife of Bath's Prologue'.

This is, in itself, not particularly surprising, for this is simply Robinson's rendition of Manly and Rickert's constant group cd*. But surprising in the diagramme is the position of Cp and Pw. This group began, as I have noted, with a ninth-generation copy; Robinson presents Cp as fifteenth-generation and Pw as thirteenth. In other words, they are well down the stemmatic tree—in fact at its very foot—among the most subject to variation and most deviant of all copies.

Philological knowledge made Donaldson think; codicological knowledge occasionally stimulates me. From any codicological point of view, this is a surprising finding. The overwhelming majority of the copies here grouped are manuscripts of c. 1450-80; the exceptions are precisely Cp and Pw. The first is among the very oldest copies of the poem, probably predating 1410, written by a prolific and important individual, `early London scribe D'. He seems to have had, unusually for early Canterbury Tales scribes, a full and assembled copy of the poem to work from (Manly-Rickert's c). This set of archetypes remained as an organised unit for perhaps a decade, passing on to Pw, who edited portions of them (presumably only those for which he could find a second copy, not the Wife of Bath's performance).[11] The result of this activity, Manly-Rickert's d, was a textual source widely used in the later fifteenth century, in fact the source that most of the other fourteen scribes who wrote `and why' are drawing on in D 117. From an historical perspective, this portion of Robinson's diagramme is `upsodoun'; the antecedent sources of the group appear at its foot.

But this state of affairs only occurs, as I have suggested above, because of the laws that govern `machines'. More deviant copies must appear as more advanced in transmission than less deviant ones. And through the sample, the source texts Cp Pw present the poem more variously than those copies which are actually their descendants. One should again think about why this might be the case: quite simply the fourteen manuscripts which appear to precede Cp Pw here do so because they have removed from their texts a number of the distinctive cd readings they received from Cp Pw. They have imported the lections of the tradition at large, and reveal a behaviour common in late forms of textual traditions, what one may call `regression to the commonplace'.[12] But in any event, their superior position in Robinson's tree indicates these copies' failure to rely on readings genetically transmitted; the texts they transmit must, on a general basis, be the result of those activities I have earlier outlined in discussing Ld2 Ln Ry2's `wright'—either conflation (consultation of multiple copies) or accidental convergence with readings elsewhere in the tradition. In either scenario, the `machine' has broken down.

One final corollary to this argument—that attestation of no single reading among the manuscripts offers evidence about its authenticity. The thoughtless reversal of this argument—the claim that attestation overrides all other concerns—has been used to `deauthenticate' five scattered passages in `The Wife of Bath's Prologue'.[13] And what is traditionally called `internal evidence', baldly thematic and uncritical readings (as opposed to the `external evidence' that Jerome did write `conditore' and Chaucer translated it aptly), cannot offer any purchase on the question of authenticity. Quite simply, there is no responsible evidence available from the manuscripts to indicate that such spottily attested passages are not Chaucer's, and Housman's `discriminating faculty' might have suggested that they were Chaucerian on their face.

I want to shift to a slightly broader look at `The Wife of Bath's Prologue' now—and perhaps a broader point about thinking and machines. I earlier mentioned that I had validated Peter Robinson's views about the quality of text communicated by Ld2 Ln Ry2 (as well as by their congener Bw) through an examination of their variation in a sixteen-line passage. The evidence which I amassed in my survey and by studying relevant portions of Peter Robinson's report on the stemmatic relations of the manuscripts in `The Wife of Bath's Prologue' will offer broader perspectives on the enterprise.[14]

First I must sadly report a technological glitch, strange in this `machine'-driven context, a case of `reinventing the wheel' which compromises Robinson's whole endeavour. He formulates a family F comprising two pairs [(Bw Ln) (Ld2 Ry2)]. Regrettably, whatever one thinks of his results, this formulation renders a good deal of his information on the four manuscripts valueless.

For there are not two pairs here. On the basis of any number of features, Ld2 is that rare Middle English manuscript which can comfortably be described as a `codex eliminaturus', a manuscript whose readings are of no evidentiary value whatever, because it has in fact been directly copied from a surviving manuscript, in this case from Ry2.[15] Of course the two manuscripts will group as a pair; the only thing which can distinguish their readings is the extra errors which got into the subsequent copying of Ld2 (and the scribe's innovations which happily corresponded to a right reading somewhere else in the tradition, yet more convergence). But as a result, any judgement Robinson makes predicated on Ry2 and Ld2 as separate copies will be invalid.

In general, Robinson's showing is predicated on seeking a certain kind of agreement in readings, that is agreement which will testify to grouping as two pairs, apart from all other copies. But he never thinks of submitting the group to the actual test of genetic relationship, that the grouping postulated will explain the preponderance of the evidence. As a consequence, Robinson never gives the full evidence and never cites counterinstances.

In the passage I have surveyed, D 110-125, Bw Ln Ld2 Ry2 agree in six variations which suggest they might be related genetically (I'm retaining Robinson's identification of Ld2 as an independent copy).[16] Two of these variants only show Ry2 readings passed on in Ld2 (112 `Lordes' for `lordynges', 114 `in' om.). Robinson strangely does not cite one of a further two readings which join all four manuscripts uniquely (in 112, they all omit `And'). (The other, the omission of the phrase 113 `the flour of', appears in his account, but is misreported.) Two other variants, 117 `wright' (again misreported) and 120 `maked' (which does not even appear in the Project's CD-ROM collations but is common to three of the four copies and, strangely, recorded in only two other manuscripts in the whole tradition), are very likely correct readings. They are thus not properly genetic information, although they speak to the isolation of the group of four.

Against these two (conceivably four) variants speaking to a genetic relationship among the four manuscripts, I find in my sample eight counterexamples, two of them mentioned in the Occasional Papers. Robinson rejects the omission of 124 `wel' from his sample on the grounds that it is likely convergent (a capricious case of petitio principii—one undertakes a genetic analysis precisely to measure the degree of convergence among copies—but a pleasant sign that Robinson knows he could edit the text by thinking about it). The second reading, 121 `other', is included as support for the genetic relationship but can only serve this function because Robinson reports its distribution inaccurately—the reading `other' appears in sixteen manuscripts additional to our four and provides no evidence for the group. In the other six instances, one of two situations obtains:

(a) the four manuscripts vary in the form `A B // A C', frequently in agreements with a number of other texts, and no genetic statement is possible, e.g. 117 of (in Ld2)] in Bw, of as interlined correction Ln, om. Ry2. Ln's correction is unique, but six further witnesses, including Ry2, share its original reading and simply omit. Note that this lection provides yet another example of convergence, the derivative Ld2 having made the same obvious correction as the Ln corrector and brought his text into line with the tradition at large;

or (b) the reading is so widespread as to provide no genetic information whatever, e.g. 110 foore] lore Ld2 Ln Ry2, as well as thirty-two other copies.

So Robinson presents two (maybe four) readings which support a genetic relationship, while offering no comment on eight readings which don't. In other words, as an explanatory tool, in my sample Robinson's reconstruction of the textual relations handles 20%-33% of the variant information available. On this showing, convergent variation constitutes the most normal state of affairs in this textual tradition, and no defensible or usable stemma of `The Wife of Bath's Prologue' can be forthcoming. That is not a showing which ought to instill confidence in the Canterbury Tales Project as an editorial `machine', since I would think one needs to explain at least 75% of the variants genetically, i.e. three for every one you cannot explain, to be convinced you have a stemma useful for editing a text.
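The arithmetic behind the 20%-33% range and the 75% threshold can be set out as a short sketch. It assumes, as the figures themselves imply, that coverage is reckoned as supporting readings divided by supporting readings plus counterexamples; the function and variable names are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def genetic_coverage(supporting, counterexamples):
    """Share of the surveyed variants that the proposed grouping explains."""
    return Fraction(supporting, supporting + counterexamples)


# "Three for every one you cannot explain" is three quarters of the sample.
THRESHOLD = Fraction(3, 4)

low = genetic_coverage(2, 8)   # only the two readings that join all four copies
high = genetic_coverage(4, 8)  # adding the two probably correct readings

for label, share in (("low", low), ("high", high)):
    print(f"{label}: {float(share):.0%}, meets threshold: {share >= THRESHOLD}")
```

On either count the sample falls well short of the three-to-one ratio; with eight counterexamples fixed, the grouping would need twenty-four supporting readings to reach it.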

That the evolutionary biologists from whom Robinson derived his methodology for grouping manuscripts, cladistic analysis, find a 40% showing provocative does not address this querulousness. I am initially bemused by Robinson's choice of tool, a striking form of cyclical `historicism' whereby the post-modern `machine' replicates its archaic origins. For Robinson implicitly appeals to analogy—the development of manuscript traditions should be like the development of species—and in so doing, thoughtlessly replays the same Late Romantic fascinations with origins and organic growth which produced this kind of thinking in the first place. (Stemma/Stamm/stefn does, after all, mean `tree trunk'.)

Simultaneously, I would think the differences between biological cladistics and manuscript traditions more telling than their resemblances. In a general way, the biologists deal with a nonreversible procedure (once a mutation occurs, it doesn't get undone); in contrast, Robinson's whole interest in the branching tree diagramme is predicated on its reversibility, on being able to trace variations backward into the whole unity of origin. But there seem to me more practical difficulties. The biologists are dealing with samples in which variation is random, with in fact a designated measure of randomness (the standard constant of mutation). Moreover, in their samples numerals on the order of 10²⁰ are small numbers. With a literary text, variation has not been generated randomly, but by motivated human activity (which can be comprehended as such by thinking, re-enacting the procedure followed by the human agents). And the order of magnitude, an important consideration in any statistical procedure, for a text the size of `The Wife of Bath's Prologue' is scarcely comparable to that generated in studying variation of amino acids in a protein string or genes on a chromosome—I would estimate for agreements in erroneous readings an order of magnitude of only 10³ or 10⁴.

As Housman points out, thinking is hard and unpleasant. And machines are utterly fascinating and may hold out the hope of some post-modern `scientific objectivity' in the arts. Probably the Wife of Bath should get the last word, in lines Chaucer most assuredly wrote, perhaps as a last word late in the game, perhaps eliminated from much of the manuscript tradition by `notional homeoteleuthon':

Diverse scoles maken parfyt clerkes

And diverse practyk in many sondry werkes

Maketh the werkman parfit sekirly. (D 44c-e)

I would suspect that she and Chaucer—echoing the perfectly thoughtful wisdom of the creator in his manufacture of genitalia—meant that, in addition to counting, thinking both while one in-puts and while one reads printout is a very good thing.


I am grateful to Peter Robinson and the Centre for Technology and the Arts for the invitation to read an earlier version of this paper at De Montfort University, Leicester, 26 April 2000.


