University of Virginia Library

TWO NEW PAMPHLETS BY WILLIAM GODWIN: A CASE OF COMPUTER-ASSISTED AUTHORSHIP ATTRIBUTION
by
Pamela Clemit and David Woolls [*]

Introduction

This article describes and illustrates the use of computer-assisted methods of textual analysis to facilitate author identification. The task was to examine two anonymous pamphlets on the Regency Crisis, prompted by the temporary mental derangement of George III from November 1788 to February 1789, and to give an opinion on the probability that both works were written by the future philosophical anarchist William Godwin. If they could be attributed to Godwin, this would not only add two significant items to the canon of his works, but also shed light on the development of his thought during a period of his career about which little is otherwise known. Since there was some external evidence for Godwin's authorship of one pamphlet and a possible external indicator for the other, the thrust of the examination was to assess whether or not supporting internal evidence could be found. The research dealt with five collections of material: the two Regency pamphlets in question; known political pamphlets by Godwin; other writings drawn from various points in Godwin's career; other known-author pamphlets on the Regency Crisis; and, finally, a selection of writings by Godwin's contemporaries, Thomas Paine, Richard Price, Joseph Priestley, Joseph Towers, and Mary Wollstonecraft, together with one example of work by his daughter, Mary Wollstonecraft Shelley.

Both the quantity of texts examined and the length of the titles have required us to adopt a coding system to assist the reader in identifying the material under discussion. For example, "TP 15=Reason2.9" identifies the author as Thomas Paine, shows that this is the fifteenth example of his work, numbered in chronological sequence, abbreviates the title, The Age of Reason: Part the Second (1795), and shows that we are using the ninth chapter of that work. Codes which do not end with numbers indicate that we have used the whole text, while codes with a single number at the end indicate whole chapters from a book. In the charts, only the initial chronological sequence code is used for clarity of presentation, while Godwin's texts are marked by squares, the two pamphlets in question by circles, and all other texts by triangles. In the text, the first reference to each main work is given in full and subsequent references are by shortened titles. The two pamphlets under discussion, once identified in full, are referred to simply as Law and Reflexions, respectively, to distinguish them from the texts of known authorship.


Full bibliographical details of all the texts used, together with abbreviations adopted in the charts, are provided in the list of references, and a table listing the number of words in each text is given in an appendix.

Context

The first pamphlet under consideration, The Law of Parliament in the Present Situation of Great Britain Considered, was published by J. Debrett at the start of December 1788, prior to the parliamentary debates on the Regency question, and went into a second edition early the following year. Though the English Short Title Catalogue does not make an authorship attribution, nor record any other copies attributed to Godwin, his authorship is suggested by an anonymous contemporary manuscript ascription, "By Mr. Godwin," on the copy held in the University of Durham Routh Collection.[1] This attribution is strongly supported by evidence in Godwin's unpublished diary, where, according to his customary practice of recording the publication of his own works, he noted, "Law of Parliament published," on 1 December 1788.[2] The second pamphlet, Reflexions on the Consequences of His Majesty's Recovery from His Late Indisposition. In a Letter to the People of England, published by G. G. J. and J. Robinson, Godwin's then employers, was internally dated 16 February 1789, the day of the debate on the Regency Bill in the House of Lords (Derry 187), but did not appear until around a month later.[3] Again, no authorship attribution is made in ESTC, but a copy of Reflexions in the Routh Collection bears a manuscript ascription to Godwin in
the same hand as that of Law. [4] However, in this case there is insufficient external evidence to support a definite attribution to Godwin. In a diary entry for 25 February 1789, he noted, "Write to the P. of E."—an abbreviation consonant with other contractions used elsewhere—but there is no entry for 16 February, nor for most other days in that month, and he did not mention publication of Reflexions. [5] The apparent discrepancy in the date of composition may be explained by the fact that Reflexions was overtaken by the events it sought to influence. Written in response to news of the king's partial recovery from 10 to 14 February, it warned of the dangers of an immediate restoration of royal authority; but the announcement on 26 February of the "entire cessation" of the king's illness, followed by his speedy resumption of full powers, made such an argument redundant (Macalpine and Hunter 81, 86). In addition, the month-long delay in publication of Reflexions may account for Godwin's omission to record the date in his diary. Yet these explanations for the lack of firm external evidence for Godwin's authorship of Reflexions remain conjectural.

Nevertheless, a reading of Law and Reflexions supports the view that Godwin wrote both of them. The style of both pamphlets is adapted according to the different occasions and audiences for which they were intended. Law, written in the interval between the meeting of Parliament on 20 November at which the king's indisposition was announced, and its reconvening on 4 December to discuss the establishment of a Regency, appears to have been designed to influence the debate among the Whigs concerning the best means of achieving government office (Derry 50, Mitchell 122-126). Accordingly, it is written in a measured, logical style, for the most part, and includes detailed discussion of historical precedents. Reflexions, as indicated by its subtitle, "In a Letter to the People of England," was addressed to a much wider audience, the politically aware, middle section of society which had a voice in public affairs, and its style is more informal and personal, though it too includes much historical analysis. Despite these stylistic differences, there are obvious parallels of theme and technique between the two pamphlets in question and Godwin's known works from 1783 to 1791, which include political journalism, historical writing, and occasional pamphlets in support of the Foxite Whig cause.[6] For example, the authorial stance of philosophical impartiality found in both pamphlets is characteristic of Godwin's early political writings, in which, in keeping with his Dissenting upbringing and education, he sought to forge an identity as an independent social and political commentator who had "nothing to do with administrations" (Law 53). Again, the method of both pamphlets is to provide a mixture of historical analysis and discussion of abstract principles, closely resembling the pattern of Godwin's writings in the second half of the decade. Typical of Godwin, too, is the claim to give equal attention to both sides of the question, while employing
rhetorical devices to whip up the fears of readers—notably, the invocation of the spectre of civil disturbance—and win their assent to his arguments. Finally, the pamphlets share with Godwin's writings an emphasis on "awaken[ing] the true principles of understanding in others" (Reflexions 60) rather than specifying firm conclusions. Such resemblances to Godwin's known works, while not providing conclusive evidence, reinforce the likelihood that he was the author of both pamphlets.

In order to confirm Godwin as the author of the first pamphlet and establish him as the author of the second, it was decided to employ additional computer-assisted methods of textual analysis. In reviewing the options, some consideration was initially given to the cusum technique, which has been widely used in cases of literary attribution and in British courts of law; but this method was not adopted because its reliability has been questioned from a variety of angles.[7] Instead it was decided that the breadth of techniques used by forensic linguists would allow the most thorough investigation of the texts in question, and that those of a computational forensic linguist in particular would allow the processing of an appropriately large range of textual material. A forensic linguist is normally engaged where a trial, appeal, or disciplinary procedure requires an opinion on the authenticity or authorship of short texts, or on whether there is supportable evidence of plagiarism in a text. David Woolls is a computational forensic linguist who builds and uses computer programs as additional means to this end. These programs allow large numbers of texts of any length to be analyzed very rapidly once they are in electronic form. The data provided by such an investigation can then be used in forming an opinion. These programs are designed to work with whole texts or discrete chapters, as in this study. The branch of forensic linguistics represented here always treats the individual and collective measurements as indicative, rather than attributive, the emphasis being on the textual evidence which causes the plotting of the data points in graphical form to show a particular pattern, if any such pattern emerges. This textual emphasis enables us to discuss our findings in terms of literary style as well as usage.

Computer Assistance

The analysis employed a number of computer programs to quantify membership of word lists, to provide vocabulary listings, and to identify a number of different phrasal patterns. Two word lists were used, the "closed set" and the "core vocabulary." The closed set comprises words which are functional, irrespective of their relative frequency in texts. This list includes some 400 words headed by the very frequently occurring words "the," "of," "to," "a," and "that." It also includes numbers, which can cover several different functions but are normally reliant on context for their meaning. Occurrences of words on this list account for 60% of all the words in the texts examined in this study. If words are not on this list they are treated as "lexical," carrying the content of the texts. The core vocabulary is a list of the 600 or so lexical words which occur most frequently in twentieth-century children's literature, and which form a quantifiable subset of the 40% lexical component of the texts under examination. (Its usage is explained below.) Two other specialist terms are used throughout the text. Words which occur only once in a given text are referred to as hapax legomena, and those occurring twice as hapax dislegomena.

The assessment employed computer programs developed from authorship attribution studies, but designed specifically for forensic linguistic purposes (Woolls and Coulthard 49-56). Three measurements of the vocabulary are used:

  1. "Lexical richness" is a calculation based on the number of lexical hapax legomena used by the author of a given text. The score is arrived at by dividing the logarithm of the full text length by the proportion of the vocabulary which is used more than once. This gives a higher score when the divisor is low (i.e. fewer words used more than once), hence the concept of "richness." Transforming the text length to its logarithmic value gives comparable scores for texts of very different lengths, reflecting the fact that hapax legomena are usually spread fairly evenly throughout a text of any length, but obviously represent a smaller proportion of longer texts. The term "lexical richness" refers to a writer's use of the vocabulary in a given text rather than the size of his/her vocabulary. (As a rough guide, a richness level of 700 to 800 is found in the news items of the press and a level of 1200 or more is found in poetry.)
  2. The "lexical hapax dislegomena percentage" is simply the number of lexical words which occur twice divided by the total vocabulary.
  3. The "core vocabulary percentage" is the total occurrence of words which are on the core vocabulary list divided by the total lexical usage.
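
The three measurements listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: the two word lists are tiny hypothetical stand-ins for the real closed set and core vocabulary, the logarithm is taken to base 10, and the result is multiplied by an assumed scaling constant of 100 (the description gives neither the base nor any constant). All function and variable names are invented for the example.

```python
import math
import re
from collections import Counter

# Toy word lists for illustration only (hypothetical): the study's closed set
# has about 400 entries and its core vocabulary about 600; neither is
# reproduced here.
CLOSED_SET = frozenset({"the", "of", "to", "a", "that", "and", "in"})
CORE_VOCAB = frozenset({"king", "day"})


def tokenize(text):
    """Crude tokenizer: lower-case the text and keep runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def vocabulary_measures(text, closed_set=CLOSED_SET, core_vocab=CORE_VOCAB,
                        scale=100.0):
    """Return the three vocabulary scores for one text.

    Assumptions: base-10 logarithm, scaling constant `scale`, and all
    vocabulary counts taken over lexical (non-closed-set) words only.
    """
    tokens = tokenize(text)
    lexical = [t for t in tokens if t not in closed_set]
    if not lexical:
        raise ValueError("text contains no lexical words")

    counts = Counter(lexical)
    vocab_size = len(counts)                                  # distinct lexical words
    hapax = sum(1 for c in counts.values() if c == 1)         # used once
    dislegomena = sum(1 for c in counts.values() if c == 2)   # used twice

    # Proportion of the lexical vocabulary used more than once (the divisor).
    repeated_share = (vocab_size - hapax) / vocab_size
    richness = (scale * math.log10(len(tokens)) / repeated_share
                if repeated_share else math.inf)

    core_hits = sum(c for w, c in counts.items() if w in core_vocab)
    return {
        "lexical_richness": richness,
        "hapax_dislegomena_pct": 100.0 * dislegomena / vocab_size,
        "core_vocabulary_pct": 100.0 * core_hits / len(lexical),
    }
```

Calling `vocabulary_measures(text)` on each whole text or chapter yields one point per text, which can then be plotted or compared across authors. Absolute values depend on the assumed logarithm base, scaling constant, and word lists, so they are not directly comparable with the scores reported in this study.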

All three measurements reflect how writers have used their vocabulary, consciously or unconsciously, with no regard to structure or meaning but simply by levels of occurrence. These features are common to all texts and observations of the quantities of each may provide an objective indicator of similarity or difference, which should then also be observable in structure and meaning. All three measurements have been shown to discriminate between texts at the 5% significance level, which is the statistical limit usually taken to indicate that the results are not obtained by chance but reflect distinct differences.

The first two measurements relate to research reported in Holmes (259-268). The theory underlying them is that comparison of the scores produced by the writings of two authors will allow them to be seen in different places on a scattergraph, when the lexical richness scores are plotted against the hapax dislegomena percentage in two-dimensional space. One author will appear on the left and another on the right, or one at the top and one at the bottom, depending on which discriminator is the stronger. When more than two authors are being examined, they should each appear in different segments. This is not always a clear-cut division, because there is no suggestion that authors always write in exactly the same way, but the theory maintains that the measurements used are likely to reflect general habits, of which an author tends to be unaware during writing, and that observation of these results over a range of texts will reveal a tendency in one direction or another. The third measurement used, the core vocabulary, arose from an earlier use of the programs by Woolls in relation to the development of writing skills in children. This set has also been found to be present in substantial quantities, between 25% and 38% of all lexical occurrences in texts, in eighteenth- and nineteenth-century writing, and measuring the occurrence of the set has proved equally valid as a discriminator in writing for adults from the eighteenth century to the present day.[8]

The closed set, though used initially only to identify the lexical words by their absence from the set, is not discounted in the analysis, since it can provide indications of authorship habits, and indeed is used by authorship attribution analysts where large amounts of data are available for precisely that reason. In forensic linguistics, the text lengths are normally very much shorter than any of the texts examined here, so separation allows closer analysis of the interactions to be undertaken. In addition, this separation reveals that in texts of greatly varying lengths the actual number of closed set hapax legomena and hapax dislegomena is remarkably stable. The texts examined range from 776 words to 13,074 words in length, but most have around 50 hapax legomena and around 24 hapax dislegomena from the closed set. This is why, in this study, all scores are calculated on lexical quantities, to eliminate potential distortions caused by the higher proportionate representation of the closed set in the shorter texts.

The other feature of the lexical hapax legomena in particular is that they are usually spread throughout the text, not always evenly but in substantial quantity wherever an examination is made. The degree of regularity may be examined for any given text, to ensure that the discrimination between authors indicated by the lexical richness scores is in fact based on texts which broadly conform to the expected distribution. Where texts manifest different patterns, further investigation may be required. Such an investigation forms the core of this essay, following the initial stage of analysis.

Analysis

It was first necessary to ascertain three things: whether the known Godwin material and the pamphlets in question fell within similar boundaries; where both the known pamphlets by Godwin and the ones in question fitted in with Godwin's other writings; and whether any other pamphlet writers of the time offered alternative authorship possibilities. To begin with, a selection of known-author pamphlets on the Regency Crisis, by William Cuninghame of Enterkine, Capel Lofft, Jean Louis de Lolme, and Sir James Mackintosh, were found, scanned, and analyzed.[9] In addition, a sermon by Towers, a Dissenting minister and author of political tracts, who was, like Lofft, an acquaintance of Godwin's in the late 1780s,[10] was included in the study. Then, during an Internet search for materials suitable for comparison, full electronic versions of Godwin's A Defence of the Rockingham Party (1783) and Instructions to a Statesman (1784) were discovered, together with an extensive quantity of his other writings. The processing of all this material allowed the data comparison shown in figure 1.

This graph shows a division point of around 900 in lexical richness scores between Godwin's known writings and the works of other pamphlet writers in the lower left quadrant. That is, Godwin consistently uses more words only once in a text than any of the other writers. It also reveals a division between Godwin's earlier material at the top and his later material at the bottom. In addition, the longer works tend to form groups, with a separation between the essays from Thoughts on Man (1831) on the lower right and the chapters from An Enquiry concerning Political Justice (3rd ed., 1798) in the middle. The six books of Imogen: A Pastoral Romance (1784) occupy the upper right, though with a similar spread to the chapters from Enquiry. The known Godwin pamphlets are to the right and left of Imogen, with the two directly addressing constitutional or legal matters, Defence and Cursory Strictures (1794), sharing lexical richness values similar to the chapters from Enquiry.

Although the other Regency pamphlet authors cluster together, this graph tells us nothing about how they normally write, since we have only one or at most three samples of each. What we can see is that Godwin's texts, which cover a wide span of his writing career, do not overlap or cluster with the other candidates for authorship. We can also see that none of the other writers immediately appear as plausible candidates for the authorship of either Law or Reflexions. Only Letters 1 and 2 of Lofft's pamphlet exceed Law in lexical richness, and both Law and, especially, Reflexions stand at some distance from what is otherwise a distinctive group. (Both Cuninghame and Mackintosh argue against the case for Parliament's right to appoint a Regent proposed in Law, so are unlikely alternative authors on those grounds as well.) Towers and De Lolme stand furthest from any known Godwin material. Lofft's pamphlet is later than Law, comprising three letters dated 8, 13, and 14 December 1788, each reacting to the latest developments in the parliamentary debates and all supporting the underlying position set out in Law and Reflexions. But, as already noted, the Lofft scores have more in common with the other Regency pamphlet writers than with known Godwin material of the time. Lofft's pamphlet also displays a distinctive feature in the way that he refers to kings and dukes: he adopts abbreviations, such as "H.III," "R.II," and "D. of York;" but these are entirely absent from Law. Perhaps most significantly, Lofft adds in a footnote to a passage cited from the parliamentary debates, "I substitute only the word KING instead of MONARCH: as an English constitutional word, more parliamentary, and more consonant to the occasion" (Letters2 32); whereas the word "monarch" appears without comment ten times in Reflexions and twice in Law.

As for the relationship between Law and Reflexions and Godwin's known writings, Reflexions sits firmly within the Imogen books and just to the right of Defence and Strictures. Law is within the hapax dislegomena boundaries of the Godwin material closest in chronology, but has the lowest lexical richness score. From this analysis, it appeared that there was a higher probability that Reflexions could be attributed to Godwin than Law. This finding prompted a further stage of comparison between Law and Reflexions, Godwin's known writings, and texts by other authors writing on political and philosophical subjects in the same period.

Comparison of Godwin's texts with a selection of writings by Paine, Price, Priestley, and Wollstonecraft, together with one text by Mary Shelley, not only allowed a confirmation of the distinction between Godwin and the others at the lexical richness/hapax dislegomena levels, but also revealed a definite feature in respect of Godwin's individual use, or rather lack of use, of the core vocabulary, as shown in figure 2.

The great majority of Godwin's texts fall into the lower half of the graph, showing that, in all but one of his texts, less than 23% of the lexical content is formed by the core vocabulary and most contain less than 20%, and that this is very much a feature of his writing throughout his career. The majority of the other authors collect at the top, inhabiting the 22% to 33% range. The only author with a higher lexical richness score than Godwin is his daughter, Mary Shelley. Paine's texts group in the upper left quadrant. Among the known authors on the Regency Crisis, Lofft shares with Godwin a tendency for low use of the core vocabulary. De Lolme, who was a native of Geneva, shows a much higher usage of the core vocabulary than any text except Paine's Common Sense (1776). De Lolme's method of argument is also different from that of the other Regency pamphlet authors, being much more narrative than polemical. It may be that there is common ground in the upbringing and education of the other writers, which influenced their use of language, but that is beyond the scope of this paper.[11] The fact that Godwin
is not alone in his low usage of the core vocabulary does not diminish the force of the observation that this characteristic is a strong feature of his writing. The scores of Thoughts, for example, are similar to those of Defence, Instructions, and Strictures.

The only other work examined here which scores less than 22% for core vocabulary is Price's Additional Observations on . . . Civil Liberty (1777). Though the other Price text used, A Discourse on the Love of our Country (1789), appears in higher areas of the graph, the proximity of Add.Observ. to Law on both scores measured here suggests that Price might offer an alternative authorship for Law—a possibility explored further below. At present the important factor is that both Law and Reflexions sit firmly within the area occupied by all the known Godwin material in the core vocabulary scores. This finding provides additional support for the case that Godwin is the author of Reflexions and suggests that he remains a possible author of Law.

To confirm the difference between Godwin and other writers of the same period, more data was required. The only relevant contemporary author for whom a substantial amount of electronic data could be found for comparative purposes was Paine. Figure 3 shows the results of comparing all the Godwin material with a selection of Paine's writings, Law, and Reflexions.

This chart shows a division between Paine and Godwin from left to right at the 900 mark similar to the division noted in figure 1. Paine's works, with the exception of Rights of Man: Part the Second (1792), are grouped in the same way that Godwin's writings are grouped in figure 1. Chapters 1 and 2 of Rights are each much shorter than all the other samples, which may account for their separation from chapters 3 and 4, but they were included for the sake of completeness and to show that, while the hapax dislegomena percentage is being affected by the difference in length, the lexical richness scores are grouped within similar boundaries to other related sets of texts. Since the works selected cover a period of many years in the case of both authors, we can have some confidence that the measurements are indicating distinctive differences in the use of vocabulary between the two writers. Given that the more limited material from other authors can also be recognized as different from Godwin's, we can be reasonably sure that we have identified a further stylistic feature of his writing in a high lexical richness score. Price is included here because he has already been identified as a potential alternative author of Law. His two works straddle the 900 boundary on the lower limit of Godwin's identified levels. These works were written 12 years apart and on different subjects, so there is no reason why they should cluster any more closely than they do.

It can further be seen from figure 3 that Reflexions is firmly on the Godwin side and Law equally firmly on the Paine side, and that Price's Add. Observ. is even closer to Law than any of Paine's work. We also have an indication that Paine was writing differently from his norm in chapters 3 and 4 of Rights, since the other texts sit in the lower/middle part of the graph. If either Paine or Godwin were the author of Law, it is evident that this work shows divergence from what might be called their normal style, as identified by the statistics. If Price were the author, we would expect a similarity in style, since both works are found in the upper right region, closer to Godwin's material. So the task was to see which of these possibilities was most likely.

At this point, the statistical analytical tools were replaced with vocabulary and phrase analysis tools. The total length of the Paine and Godwin texts used (approximately 50,000 words for Paine, and approximately 100,000 words for Godwin) was sufficient to allow a limited investigation into the function words used by each, with a view to observing potential marked differences, so that such apparently distinctive words could be looked for in each of the pamphlets in question. This investigation, though insufficient to be conclusive on its own, held the possibility of providing supporting evidence for our other findings. It revealed that three function words showed marked differences in usage: both "or" and "on" were used relatively infrequently by Godwin, while he showed a marked preference for "upon" in comparison to both Paine and Price. These patterns were also found in both Law and Reflexions, as shown in table 1.



TABLE 1 Normalised occurrences per 10,000 words

        Godwin  Law  Reflexions  Price  Paine
Or          28   33          16     85     64
On          15   19          12     51     48
Upon        41   44          44     15

This finding was then checked against the other data examined and the low use of "on" and the high use of "upon" was found to be particular to Godwin. The incidence of "or" was generally similar as well. It would appear that in a wider test of functional vocabulary markers, these two items would be clear candidates as Godwin discriminators.
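
The normalisation behind table 1 is simple arithmetic: the count of a word is divided by the length of the text and multiplied by 10,000. A minimal sketch follows, assuming a crude letters-only tokenizer; the function name is hypothetical and the example does not reproduce the study's figures.

```python
import re


def rate_per_10k(text, word):
    """Occurrences of `word` per 10,000 running words of `text`.

    Assumption: tokens are runs of letters, compared case-insensitively,
    so "Upon" and "upon" are counted together and "upon" never counts
    as an occurrence of "on".
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    return 10_000 * tokens.count(word.lower()) / len(tokens)
```

Normalising in this way lets texts of very different lengths be compared directly, which matters here because the Godwin sample is roughly twice the size of the Paine sample.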

Next, file comparison at vocabulary and phrase level was undertaken between all three known Godwin pamphlets and the two pamphlets in question. The greatest vocabulary overlap was between Law and Reflexions, which was not unexpected because of similarity in material, but which produced a sufficiently high score to have caused an investigation into potential sharing of material if that level had been reached by two different students in the present-day university setting for which the program, CopyCatch, is intended.[12] CopyCatch is the diagnostic element of the tools used above, which is specifically designed to look for shared vocabulary in work on the same topic which is supposed to have been produced independently. The greater the proportion of shared vocabulary which appears in two apparently related texts, the greater the possibility that the texts were produced by plagiarism, collusion, or undesirable co-operation. But CopyCatch can also indicate a tendency of one or both authors to repeat themselves, since such repetition, particularly on the topic or sub-topics, will cause the proportion of the shared vocabulary to increase. It was this practice which was highlighted by the initial examination of Law and Reflexions using this program.
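
The idea of a shared-vocabulary proportion can be illustrated with a short sketch. This is not CopyCatch's own calculation, which is not described here; it assumes, purely for illustration, that overlap is measured as the number of lexical word types common to both texts divided by the number of types found in either (a Jaccard ratio). The function names and the closed-set argument are hypothetical.

```python
import re


def lexical_vocabulary(text, closed_set=frozenset()):
    """Set of distinct lower-cased words in `text`, minus closed-set words."""
    return {t for t in re.findall(r"[a-z]+", text.lower())
            if t not in closed_set}


def shared_vocabulary_share(text_a, text_b, closed_set=frozenset()):
    """Assumed overlap measure: shared word types / all word types (0.0 to 1.0)."""
    vocab_a = lexical_vocabulary(text_a, closed_set)
    vocab_b = lexical_vocabulary(text_b, closed_set)
    union = vocab_a | vocab_b
    if not union:
        return 0.0
    return len(vocab_a & vocab_b) / len(union)
```

Under any measure of this kind, two texts by one author that revisit the same topic in the same terms will score higher than independent treatments of that topic, which is the effect described above for Law and Reflexions.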

In addition, CopyCatch allows comparison of phrase use between any two files it has examined, enabling us to ascertain whether repetition is occurring at the phrase level. In the cases of Law, Defence, and Strictures, this analysis showed a great deal of direct or related phrasal repetition within each of those texts, with relatively little phrasal similarity observable within any of the other texts used for purposes of comparison. Table 2 presents only some of the extraordinary amount of direct phrasal repetition in Law. It is not the phrases themselves but the number of times they appear which makes this text so different from all the others examined.

In addition, there are in Law 20 instances of the phrase, "the duke of [Bedford/Gloucester/York]."

We find similar repetition over distance in Strictures, though not on such a wide scale, as shown in table 3. (The numbers following the examples indicate which sentences are being identified to show the distances between occurrences.)



TABLE 2 Phrases repeated in Law, showing number of occurrences

               
"a parliament was summoned" 
"the two houses of parliament" 
"was virtually in the hands of the king's uncles" 
"a council of regency" 
"the regency of a single person" 
"the king alone" 
"in the reign of king [X] the [Y]"  14 
"king [X] the [Y]"  43 

TABLE 3 Repeated phrases in Strictures, with sentence number identified

       
"guilty of High Treason"  (Strictures 97 and 116) 
"Justice implicitly confesses himself"  (Strictures 73) 
"Justice implicitly confesses, that"  (Strictures 67) 
"overawe the legislative body"  (Strictures 79 and 189) 

This phrasal repetition demonstrates a similarity of technique between the author of Strictures and the author of Law, though such repetition is much more markedly evident in Law than in Strictures or Defence, being especially concentrated around the historical sections. The author's method in Law was to name each king in full, discuss the nature of the incapacity, the circumstances in which parliaments were summoned and the outcome, so the vocabulary repetition was occasioned by the chosen structure of argument. One effect of this technique was to reduce the lexical vocabulary substantially in comparison to the other texts of similar length. The other effect was to reduce the normal occurrence of hapax legomena within this text, resulting in a lower percentage of once-only occurrence and hence a lower lexical richness score, which, as noted above, uses this percentage as one of its components.
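
Counts of repeated phrases such as those in tables 2 and 3 can be approximated by counting word n-grams. The sketch below is an assumption about one straightforward way to do this, not a description of the program actually used; it handles fixed-length phrases only, so templated patterns such as "king [X] the [Y]" would need extra pattern matching. The function name and parameters are hypothetical.

```python
import re
from collections import Counter


def repeated_phrases(text, n=4, min_count=2):
    """Return every n-word phrase occurring at least `min_count` times.

    Assumptions: tokens are runs of letters, case is ignored, and phrases
    may run across sentence boundaries.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    grams = Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {phrase: count for phrase, count in grams.items()
            if count >= min_count}
```

Running the function with several values of `n` and sorting the result by count gives a quick view of how heavily a text leans on formulaic wording.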

As mentioned in the discussion of vocabulary measurements, the formula used for calculating lexical richness expects a degree of regularity of occurrence of lexical items throughout a given text. Thus the presence of this feature needs to be confirmed to establish that this is in fact the case. An examination of all the pamphlets, together with examples of works by Paine and Price, demonstrates that there is a clear difference in pattern between the known pamphlets by Godwin and Reflexions on the one hand, and the examples by Paine and Price on the other, with Law revealing its idiosyncratic construction, as shown in table 4.

TABLE 4 Hapax legomena in full text shown for their occurrence in three 1000-word segments per text

                  Instructions  Defence  Strictures  Law  Reflexions  Rights 2.4  Discourse  Add. Observ.
First 1000 words           131      134         178  134         129          70        106            59
Mid 1000 words             124      146         128   96         130          93        124           114
Last 1000 words            166      132         114  140         123          71        133           132

Here it can be seen that the three known Godwin pamphlets start with, and generally maintain, a high quantity of hapax legomena, while the two works by Price and chapter 4 of Paine's Rights start with a much smaller quantity, which chapter 4 of Rights maintains but the works by Price increase. Both Law and Reflexions share the high opening and closing quantity found in Godwin's other works, but Law has a substantial drop in the middle segment, which is in fact much wider than simply this segment. This is where the lengthy laying-out of historical precedents is placed. So the opening and closing sections in both cases are not only close to Godwin's known pamphlets, but also distinct from Paine and distinct from Price. The drop can be explained only by heavier repetition of vocabulary, and the question then is whether this repetitious style is similar to that used by the other two authors, or of a distinctive nature.
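A count of the kind reported in table 4 can be sketched as follows. This is an illustration under stated assumptions rather than the program used in the study: the function name is hypothetical, the middle segment is assumed to be centred on the midpoint of the text, and the sketch counts every token it is given, whereas the study may have restricted the count to lexical words.

```python
from collections import Counter


def hapax_by_segment(words, segment_size=1000):
    """Count full-text hapax legomena falling in the first, middle and last segments.

    `words` is the tokenized text. A hapax legomenon is a word that occurs
    exactly once in the whole text, so each one can fall in at most one place.
    """
    counts = Counter(words)
    hapax = {word for word, count in counts.items() if count == 1}
    # Assumption: the middle segment is centred on the midpoint of the text.
    mid_start = max((len(words) - segment_size) // 2, 0)
    segments = {
        "first": words[:segment_size],
        "mid": words[mid_start:mid_start + segment_size],
        "last": words[-segment_size:],
    }
    return {name: sum(1 for word in segment if word in hapax)
            for name, segment in segments.items()}


# Toy example with 2-word segments instead of 1000-word segments.
toy = "a b a c d d e f".split()
print(hapax_by_segment(toy, segment_size=2))
```

On a real text the three values would be read in the same way as a column of table 4: a marked dip in the middle value relative to the first and last, as in Law, points to a stretch of heavier vocabulary repetition.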

The presence of a distinctive repetitious effect in Godwin's writing was confirmed by a detailed vocabulary analysis, summarized in table 5.

TABLE 5 Word frequency occurrences and percentages of lexical items they represent

         
Words  Law  Instructions  Defence  Strictures  Reflexions  Add. Observ.  Rights2.4
21+  17  11
6+  138  121  87  80  73  123  101
% of Lex Items
21+  14%  5%  5%  11%  1%  12%  8%
6+  41%  26%  28%  32%  20%  39%  38%

This investigation found that Law contained an unusually large number of words repeated 21 times or more, and an even more unusually large number of words occurring 6 times or more. In comparison, Instructions, which is 30% longer than Law, had only 3 words occurring 21 times or more and 121 words occurring 6 times or more. Chapter 4 of Rights shows a higher 6+ rating but also a much lower 21+ rating, indicating a different vocabulary spread through the text. This lowering of overall lexical vocabulary usage in Law, together with the tendency to repeat entire phrases, also affects the hapax dislegomena percentage, since, even though the proportionate number of hapax dislegomena in the full text remains much the same, the percentage they form of the lexical vocabulary increases. These two factors imply a two-way shift, leftwards and upwards, from what might be called a normal position, if the repetition were not so marked. It would appear that the heavy repetition is a major contributory factor in placing Law where it is in figure 2. This being the case, the expected position for a less repetitious text would be much closer to the Imogen books.
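The banding used in table 5 can be illustrated with a short sketch. It rests on two assumptions that the text does not confirm: that the percentage rows give the share of running words accounted for by the words in each band, and that any filtering down to lexical items has been done before the word list is passed in. The function name and the toy word list are hypothetical.

```python
from collections import Counter


def frequency_bands(words, thresholds=(6, 21)):
    """For each threshold t, return (number of distinct words occurring t or more
    times, percentage of all running words that those words account for)."""
    counts = Counter(words)
    total = len(words)
    bands = {}
    for t in thresholds:
        frequent = [count for count in counts.values() if count >= t]
        share = 100 * sum(frequent) / total if total else 0.0
        bands[f"{t}+"] = (len(frequent), round(share, 1))
    return bands


# Toy example with low thresholds so that the bands are visible in eleven words.
toy = ["king"] * 7 + ["law"] * 3 + ["act"]
print(frequency_bands(toy, thresholds=(3, 6)))
```

A text built around a repeated list, as Law is, would show both a large count and a large share in the upper band, which is the pattern the paragraph above describes.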

As already noted, chapters 3 and 4 of Rights mark a move away from the rest of Paine's material, but in this case the movement is largely the result of the dominance of a single vocabulary item, "government," which occurs over 100 times in one chapter and over 80 in the other. Such phrasal repetition as can be observed in Paine is very rarely exact, compared with the numerous examples of exact repetition in Law noted in table 2. The move in Paine is simply a gentle rise above the normal pattern. From this combination of factors, it would seem very unlikely that Paine offers an alternative authorship for Law.

Next the disproportionate number of words used frequently in Law in comparison to the other texts was further explored. It was found that in the other texts, once the main human subjects, Lord Rockingham and Lord Chief Justice Eyre, and the terms "Lord" or "Lordship" are set aside, the next significant items are low in all but Strictures, where the non-human subject is the law of treason, which accounts for the higher frequency of this term in the pamphlet. These items are all strongly topic-related and reflect the expected frequency of the main topic vocabulary. In contrast, though there are numerous human subjects in the list of kings and dukes referred to in Law, their use is supportive to the main topic of the role of Parliament in the appointment of a Regent. Only three of the four most frequent words can be strongly identified with this topic: "parliament," "regency," and "authority," together with "government." The most frequent, "king," though relevant to the topic, owes its high frequency to the number of times a specific king is cited in the discussion of historical precedents, as indicated by the examples above.

It can be seen that Price's Add.Observ. also has a substantial amount of vocabulary occurring more than 20 times in a similar number of words, but examination shows that all of this vocabulary is subject-matter related, so this marks a difference from rather than a similarity to the style of Law. The lower figure in chapter 4 of Rights is also strongly subject-matter related and dominated by the term "government," as noted above. There is no element of repetition of lists in either text, as found in Law. The 6+ percentages are also revealing. Godwin's known pamphlets and Reflexions are all noticeably lower than either the Paine or Price examples, and it is now apparent that the statistics for Law are heavily affected by the presence of non-topic related repetition, the absence of which would drop the 6+ level of 41% substantially. All this leads to the conclusion that the lexical richness positions of Paine and Price reflect their normal usage of vocabulary. The proximity of Law to their positions is a result of disruption caused by repetition. This implies that the author of Law has a normal lexical richness score substantially higher than in this particular work, and this in turn suggests that the author of Law was neither Paine nor Price, leaving Godwin as the only likely candidate from those examined.

This being the case, it might be expected that some positive vocabulary or syntactic evidence would be found linking both Law and Reflexions with the known Godwin material, in addition to the function words identified above. Comparison of vocabularies across the files showed that such evidence certainly exists. First, the words "measure" and "measures," used almost always in relation to parliamentary procedures, appear in all three attested Godwin pamphlets and in both pamphlets under consideration, and in very similar frequencies: 13 times in Law, 8 times in Strictures, and 11 times in the other three. These words are not found in anything like such quantities in any of the other material examined (with the exception of Mackintosh's pamphlet, which puts forward an opposing argument). For example, Paine's Rights includes only 4 examples in 16,204 words, and there are only 2 instances in the full 10,756 words of Price's Add.Observ. This distinctive usage indicates at least a common interest in the pamphlets in question and Godwin's known pamphlets, and is most noticeable in reading the material as well as from the vocabulary analysis, so may be taken as additional evidence in favour of a common authorship. Second, there is a strongly related phrase in Law and Reflexions, which suggests that the author of the latter at least had seen the former, and which, added to the accumulating similarities, may also point to a common author. Though the reference to the imbecility of Henry VI is much more compact in Reflexions than in Law, both pamphlets contain a strikingly similar phrase in relation to the monarchy when describing the same case:

"as to render him incapable of maintaining even the appearance of royal authority" (Law)

"when Henry was no longer capable of maintaining the appearance of royalty" (Reflexions).

Third, the introductory phrase "Let it be," as in "Let it be remembered"/ "supposed"/"considered," is found only in Godwin's writings and the two pamphlets in question, apart from Mackintosh's Arguments (which addresses the same concerns as Law from a different viewpoint). While the phrase "Let it be" might be assumed to be a common rhetorical device of the time, its presence across the Godwin material and its absence from most of the other works examined offers further evidence in support of Godwin's authorship of both Law and Reflexions. Finally, phrasal analysis revealed the following construction, which again is found only in the three pamphlets indicated:

"thought proper to summon" (Law)

"thought proper to prepare" (Reflexions)

"thought proper to bring" (Strictures).

Taken together, these examples offer considerable support for an identification of Godwin as the author of both Law and Reflexions.
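Because the texts differ in length, the raw counts of "measure" and "measures" quoted above are easier to compare when expressed as a rate per thousand words. The sketch below does only that arithmetic. The occurrence counts and the word totals for Paine and Price are those given in the discussion above, and the totals for Law and Strictures are taken from the Appendix; the function and variable names are illustrative.

```python
def per_thousand(occurrences, total_words):
    """Occurrences per 1,000 words of text, rounded to two decimal places."""
    return round(1000 * occurrences / total_words, 2)


# (occurrences of "measure"/"measures", total words in the text)
counts = {
    "Law": (13, 10317),
    "Strictures": (8, 7315),
    "Paine, Rights": (4, 16204),
    "Price, Add.Observ.": (2, 10756),
}

rates = {name: per_thousand(n, total) for name, (n, total) in counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate} per 1,000 words")
```

On these figures Law and Strictures each use the words a little more than once per thousand words, while the Paine and Price examples use them roughly once in every four to five thousand words.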



Conclusions

Godwin has been identified as having a distinctive style in the materials examined, which cover much of his writing life. The two main features of his style are that he generally has a much higher lexical richness score than any of the other writers analyzed, and that he makes much less use of the core vocabulary than many of his contemporaries. Of the two pamphlets in question, Reflexions sits within the boundaries of Godwin's other writings in all the charts and in all examinations provides positive indications that he is the likely author. Investigation of the pamphlets of other writers at the time certainly offers no other candidate. This is significant, since the external evidence for Godwin's authorship is weaker for Reflexions than for Law. The focus of this examination has been Law because it is the better attested of the two pamphlets, yet it appeared to be outside the boundary of Godwin material in one crucial respect, that of lexical richness. However, further analysis revealed that Law sits well within the norms for Godwin's use of the core vocabulary and shares with Reflexions very similar use or non-use of marker functional vocabulary, and that lexical vocabulary is present at the phrasal level linking Godwin's known work with both Law and Reflexions. The low lexical richness score has been identified as having its source in the unusually high level of repetition contained in the central portion of the pamphlet, which is a function of the way the author has chosen to lay out the argument. This authorial decision has greatly reduced the number of vocabulary items relative to the length of text, not only in respect of Godwin's work, but also in relation to what would be expected in any text of a similar length. This means that the statistical scores, without the large historical segments and the regular references to parliamentary procedures, could be expected to be more like those of Defence and Strictures.

In the light of all these factors, it is our opinion that there is sufficient internal evidence from this investigation to support the contemporary manuscript ascriptions and the related evidence in Godwin's diary, and to attribute authorship of both anonymous pamphlets to Godwin.

These two pamphlets on the Regency Crisis are important additions to the canon of Godwin's works.[13] They are significant for what they reveal about both his developing political views and his consistent resourcefulness as a pamphlet writer. They demonstrate that his practical engagement with contemporary British politics did not lessen towards the end of the 1780s, as is often thought, but that he continued to support the Foxite Whigs right up to the spring of 1789. At the same time, they indicate just how much Godwin was preoccupied with constitutional questions on the eve of the French Revolution, two years before he began writing Enquiry. Finally, Godwin's Regency pamphlets, in their mixture of abstract speculation and engagement with concrete political questions, adumbrate a central feature of his thought as it developed through the 1790s, and beyond. As experiments in combining speculative and practical politics, Law and Reflexions help to explain how Godwin became the author of not only Enquiry but also Strictures, his most successful intervention in contemporary politics, in which he demolished the government's case of high treason brought against twelve leading radicals in 1794.

APPENDIX List of codes and word counts of all texts used in the computer analysis

                                       
Works by William Godwin  Words
WG1=Defence               8576
WG2=Instructions         13074
WG3=Imogen1               7389
WG4=Imogen2               9553
WG5=Imogen3               5963
WG6=Imogen4               5700
WG7=Imogen5               7635
WG8=Imogen6               9325
WG9=Strictures            7315
WG10=Hist. Rom.           5174
WG11=Enquiry5.1           3306
WG12=Enquiry5.7           1651
WG13=Enquiry5.8           3079
WG14=Enquiry5.14          2217
WG15=Enquiry5.15          4538
WG16=Thoughts1            3550
WG17=Thoughts4            4721
WG18=Thoughts9            4078
WG19=Thoughts23           3810

Works by Thomas Paine    Words
TP1=CommonSense1          2161
TP2=CommonSense2          3484
TP3=CommonSense3          2646
TP4=CommonSense4          3865
TP5=Crisis7               7942
TP6=Crisis11              7964
TP7=Crisis15              2457
TP8=Rights2.1             1737
TP9=Rights2.2              776
TP10=Rights2.3            5257
TP11=Rights2.4            8435
TP12=Reason1.1            1140
TP13=Reason1.8            1728
TP14=Reason2.1            1832
TP15=Reason2.9            1940

Law, Reflexions, Regency pamphlets, and other texts  Words
Law                      10317
Reflexions                8377
[WC]=ShortView            2656
JLDL=Observations         2650
[JM]=Arguments            2228
CL1=Letters1              4598
CL2=Letters2              3550
CL3=Letters3              2047

RP1=Add. Observ.         10756
RP2=Discourse             8693
JT=Sermon                 2956
MW=Vindic.9               4539
JP=Phlogiston             5598
MWS=LastMan1.1            5941

REFERENCES

Manuscript Sources

Godwin, William. Diary. Abinger Manuscripts, Bodleian Library. Dep. e. 196-227.

Primary Sources

[Anon.] The Law of Parliament in the Present Situation of Great Britain Considered. London: J. Debrett, 1788. Full text examined. (Law)

—. Reflexions on the Consequences of His Majesty's Recovery from His Late Indisposition. In A Letter to the People of England. London: G. G. J. and J. Robinson, 1789. Full text examined. (Reflexions)

[Cuninghame, William, of Enterkine]. A Short View of the Present Great Question. London: J. Debrett, 1788. Full text examined. ([WC]=Short View)

De Lolme, John Louis. Observations upon the late National Embarrassment, and the Proceedings in Parliament Relative to the Same. London: J. Debrett, 1789. First 2,650 words examined. (JLDL=Observations)

Godwin, William. A Defence of the Rockingham Party, in their Late Coalition with the Right Honourable Frederic Lord North. Anonymous. London: J. Stockdale, 1783. Anarchy Archives. Ed. Dana Ward. Sept. 1999. Pitzer College. 3 April 2001. <http://dwardmac.pitzer.edu/Anarchist_Archives/godwin/defense/defenserock.html> Full text examined. (WG1=Defence)

—. Instructions to a Statesman. Humbly Inscribed to the Right Honourable George Earl Temple. Anonymous. London: J. Murray, 1784. Anarchy Archives. 3 April 2001. <http://dwardmac.pitzer.edu/Anarchist_Archives/godwin/instruct.html> Full text examined. (WG2=Instructions)

—. Imogen: A Pastoral Romance. From the Ancient British. Ed. Jack W. Marken. New York: The New York Public Library, 1963. Anarchy Archives. 3 April 2001. <http://dwardmac.pitzer.edu/Anarchist_Archives/godwin/imogen/imogentoc.html> Full text (6 books) examined. (WG3=Imogen1, WG4=Imogen2, WG5=Imogen3, WG6=Imogen4, WG7=Imogen5, WG8=Imogen6)

—. Cursory Strictures on the Charge Delivered by Lord Chief Justice Eyre to the Grand Jury, October 2, 1794. Anonymous. London: D. I. Eaton, 1794. Full text examined, with quotations from other works removed. (WG9=Strictures)

—. "Of History and Romance." [1797.] Romantic Links, Electronic Texts and Home Pages. Ed. Michael Gamer. n.d. U of Pennsylvania. 3 April 2001. <http://www.english.upenn.edu/~traister/godwin.html> Full text examined. (WG10=Hist.Rom.)

—. An Enquiry Concerning Political Justice, and Its Influence on Morals and Happiness. 3rd ed., corrected. 1798; London: J. Watson, 1842. Anarchy Archives. 3 April 2001. <http://dwardmac.pitzer.edu/Anarchist_Archives/godwin/PJfrontpiece.html> Chs. 1, 7, 8, 14, and 15 examined from Bk. 5, "Of Legislative and Executive Power." (WG11=Enquiry5.1, WG12=Enquiry5.7, WG13=Enquiry5.8, WG14=Enquiry5.14, WG15=Enquiry5.15)

—. Thoughts on Man, his Nature, Productions, and Discoveries. Interspersed with some Particulars respecting the Author. London: Effingham Wilson, 1831. Anarchy Archives. 3 April 2001. <http://dwardmac.pitzer.edu/Anarchist_Archives/godwin/thoughts/TMNPDfrontpiece.html> Essays 1, 4, 9, and 23 examined. (WG16=Thoughts1, WG17=Thoughts4, WG18=Thoughts9, WG19=Thoughts23)

Lofft, Capel. Three Letters on the Question of Regency. Addressed to the People of England. Bury: J. Rackham, 1788. Full text (3 letters) examined. (CL1=Letters1, CL2=Letters2, CL3=Letters3)

[Mackintosh, James.] Arguments concerning the Constitutional Right of Parliament to Appoint a Regency. London: J. Debrett, 1788. Full text examined. ([JM]=Arguments)

Paine, Thomas. Common Sense: Addressed to the Inhabitants of America. Philadelphia: R. Bell, 1776. ushistory.org. July 1995. Independence Hall Association. 5 April 2001. <http://www.ushistory.org/paine/commonsense/index.htm> Full text (4 sections) examined. (TP1=CommonSense1, TP2=CommonSense2, TP3=CommonSense3, TP4=CommonSense4)

—. The American Crisis. Number 7. Philadelphia: John Dunlap, 1778; Number 11. Philadelphia: John Dunlap, 1782; Number 15. Philadelphia: John Dunlap, 1783. ushistory.org. 5 April 2001. <http://www.ushistory.org/paine/crisis/index.htm> Full texts of all 3 numbers examined. (TP5=Crisis7, TP6=Crisis11, TP7=Crisis15)

—. Rights of Man: Part the Second. London: J. S. Jordan, 1792. ushistory.org. 5 April 2001. <http://www.ushistory.org/paine/rights/index.htm> 4 chapters examined. (TP8=Rights2.1, TP9=Rights2.2, TP10=Rights2.3, TP11=Rights2.4)

—. The Age of Reason: Being an Investigation of True and of Fabulous Theology. London: D. Eaton, 1794. ushistory.org. 5 April 2001. <http://www.ushistory.org/paine/reason/index.htm> Sections 1 and 8 examined. (TP12=Reason1.1, TP13=Reason1.8)

—. The Age of Reason: Part the Second. Being an Investigation of True and of Fabulous Theology. London: H. D. Symonds, 1795. ushistory.org. 5 April 2001. <http://www.ushistory.org/paine/reason/index.htm> Sections 1 and 9 examined. (TP14=Reason2.1, TP15=Reason2.9)

Price, Richard. Additional Observations on the Nature and Value of Civil Liberty and the War with America. London: T. Cadell, 1777. Liberty Library of Constitutional Classics. n.d. The Constitutional Society. 18 May 2001. <http://www.constitution.org/price/price_4.txt> Full text examined. (RP1=Add.Observ.)

—. A Discourse on the Love of our Country. London: T. Cadell, 1789. Liberty Library of Constitutional Classics. 6 April 2001. <http://www.constitution.org/price/price_8.txt> Full text examined. (RP2=Discourse)

Priestley, Joseph. Considerations on the Doctrine of Phlogiston and the Decomposition of Air. Philadelphia: Thomas Dobson, 1796. Classic Chemistry. Ed. Carmen Giunta. n.d. Le Moyne College. 4 April 2001. <http://web.lemoyne.edu/faculty/giunta/phlogiston.html> Full text examined. (JP=Phlogiston)

Shelley, Mary. The Last Man. London: H. Colburn, 1826. Romantic Circles. Gen. Ed. Neil Fraistat, Steven E. Jones, Carl Stahmer. n.d. U of Maryland. 10 April 2001. <http://www.rc.umd.edu/editions/mws/lastman/ascii.htm> Ch. 1 of vol. 1 examined. (MWS=LastMan1.1)

Towers, Joseph. The Professors of the Gospel under the Strongest Obligations to distinguish themselves by an eminent Degree of Piety and Virtue. A Sermon Preached at St. Thomas's, January 1, 1777, for the Benefit of the Charity-School in Gravel-Lane, Southwark. London: J. Johnson and J. Buckland, 1777. Full text examined. (JT=Sermon)

Wollstonecraft, Mary. A Vindication of the Rights of Woman: With Strictures on Political and Moral Subjects. London: J. Johnson, 1792. Oregon State University <http://www.osu.orst.edu/instruct/phl302/texts/wollstonecraft/woman-contents.html> Ch. 9 examined. (MW=Vindic.9)

Secondary Sources

Bentley, G. E., Jr. "Copyright Documents in the George Robinson Archive: William Godwin and Others, 1713-1820." Studies in Bibliography 34 (1982): 67-110.

Deconinck-Brossard, Françoise. "The Case for Computer-Aided Textual Analysis." Erfurt Electronic Studies in English 4 (1996). 4 July 2003. <http://webdoc.gwdg.de/edoc/ia/eese/artic96/decol/4_96.html>

Derry, John W. The Regency Crisis and the Whigs, 1788-9. Cambridge: Cambridge UP, 1963.

Farringdon, Jill M. Analysing for Authorship: A Guide to the Cusum Technique. With contributions by A. Q. Morton, M. G. Farringdon, and M. D. Baker. Cardiff: U of Wales P, 1996.

Holmes, David I. "Vocabulary Richness and the Prophetic Voice." Literary and Linguistic Computing 6:4 (1991): 259-268.

Macalpine, Ida, and Richard Hunter. George III and the Mad-Business. 1969. London: Pimlico P, 1991.

Mitchell, L. G. Charles James Fox and the Disintegration of the Whig Party, 1782-1794. Oxford: Oxford UP, 1971.

Ruecker, Stan. "Kurt Vonnegut wrote my E-mail: An Experimental Evaluation of the QSUM Technique." English 694. Ed. Stan Ruecker. 28 October 1999. U of Alberta. 22 March 2001. <http://www.humanities.ualberta.ca/sruecker/QSUM.htm>

Sanford, Anthony J., Joy P. Aked, Linda M. Moxey, and James Mullin. "A Critical Examination of Assumptions underlying the Cusum Technique of Forensic Linguistics." Forensic Linguistics 1:2 (1994): 151-167.

Woolls, David, and Malcolm Coulthard. "Tools for the Trade." Forensic Linguistics 5:1 (1998): 33-57.

Woolls, David. "Better Tools for the Trade and How to Use Them." Forensic Linguistics 10:1 (2003): 107-117.

Notes

 
[*]

We are grateful to Lord Abinger for permission, granted through the Bodleian Library, to quote from Godwin's diary in the Abinger Collection, held on deposit in the Bodleian Library; to Beth Rainey, former Sub-Librarian, Special Collections, Durham University Library, for her expert advice; and to Françoise Deconinck-Brossard for supplying a text by Joseph Towers in electronic form. Thanks are due to the following individuals for help of various kinds: Bruce Barker-Benfield, Malcolm Coulthard, Martin Fitzpatrick, Oliver Hudson, Gary Kelly, Michael Popham, Lisa Vargo, and, above all, Robin Dix. Pamela Clemit's part in this study was completed with the support, mainly for other purposes, of an Arts and Humanities Research Board Research Leave award.

[1]

University of Durham, Special Collections, Routh 67. F. 2/5. Both pamphlets under discussion are contained in a volume of ten tracts, entitled "Pamphlets concerning King's Illness 1788-89." The volume includes a manuscript contents list in an unidentified late eighteenth-century or early nineteenth-century hand, headed "S. S. S. 7." Before rebacking in 1998, the spine had a fragment of a label bearing the same number, which suggests that the volume was originally part of a large pamphlet collection or that this is the pressmark of a private library. The volume also has a nineteenth-century ownership inscription, "James Weale." The pressmark on the spine, "LVII | F | 2," indicates that it forms part of the library of Martin Joseph Routh (1755-1854), the great patristics scholar, whose collection of printed books passed on his death to the University of Durham. The hand in which the authorship ascriptions of the two pamphlets in question are written does not occur elsewhere in the volume and is not that of Routh himself. A review of copies of each pamphlet in other libraries found no other evidence of authorship attributions.

[2]

Godwin, diary, Abinger Manuscripts, Dep. e. 196, fol. 20r.

[3]

Monthly Review, 80 (March 1789), 275. For Godwin's known work for Robinson, see Bentley, 77-83, 89.

[4]

University of Durham, Special Collections, Routh 67. F. 2/6.

[5]

Godwin, diary, Abinger Manuscripts, Dep. e. 196, fol. 26r.

[6]

A fuller discussion of Godwin's early writings is in preparation for publication in Pamela Clemit, The Literary Lives of William Godwin (Oxford University Press).

[7]

For the cusum technique, see Farringdon; for detailed criticisms of its assumptions and results, see Sanford et al., and Ruecker.

[8]

For a description of the construction and discriminatory function of the core vocabulary, see Woolls (2003).

[9]

An exception was the sample of De Lolme's writing, which was typed because the printed copy of the pamphlet proved impossible to scan. The first 2,650 words were entered to create a sample of approximately the same length as the other Regency pamphlets, which proved to be so distinct from the others that it was considered unnecessary to enter the rest of the text. This represents the only instance where part of a text was used.

[10]

Godwin noted dining with Towers for the first time on 7 October 1788; Lofft's name appears on a list of acquaintances made in 1788 at the back of the seventh volume of his diary (Abinger Manuscripts, Dep. e. 196, fol. 16r; Dep. e. 202, fol. 47r).

[11]

For an examination of the shared vocabulary characteristics of eighteenth-century Dissenting sermons, including works by Towers, see Deconinck-Brossard.

[12]

CopyCatch is a collusion and plagiarism detection program developed by David Woolls and is commercially available from CFL Software Development, UK.

[13]

For an edition of the two pamphlets, together with a fuller discussion of the material summarized in this paragraph, see Pamela Clemit, "Two Pamphlets on the Regency Crisis by William Godwin," Enlightenment and Dissent 20 (2001) (forthcoming).