Studies in Bibliography :: University of Virginia Library

A FUNERALL ELEGYE . . . NOT . . . BY W.S. AFTER ALL

Reading Cusum Graphs and Charts

The principle underlying cusum analysis is thus extremely simple: it rests
predominantly on the unconscious, habitual use of function words.
The practice is within the grasp of any academic prepared to devote the necessary
time and effort to learning it; but, to sound a cautionary note, this would
be no rapid process with automatic results. Learning how to use the method
with confidence takes about three to six months of practice on a wide variety
of texts. By starting with the analysis of one's own utterance (as providing
samples of known integrity, a crucial proviso), any researcher is enabled to
gain confidence in the effectiveness of the method. The benefit of "tutorials"
from experienced analysts is also important. Only then may one approach
one's literary problem.

It should be noted here that there have been various attempts at critiques
of cusum analysis: these have all failed in remarkably similar ways, namely,
by misunderstandings of both principle and practice. One misunderstanding
is the notion that the method is based on a single invariable language habit
(counting words of 2 or 3 letters is the one usually selected), which is suitable
for all writers/speakers and occurs in rigid proportions or ratios in each
sentence—an obvious absurdity. There are, in fact, nine language-habit tests
used on any sample under analysis to discover which one will be consistent
for the writer; and these are counted in no simple proportional manner.
Analysing for Authorship devotes a whole chapter to "Answering the Critics",
and each critique is carefully examined there. Despite fairly widespread
awareness of the few critical attempts, there is complete ignorance of the endorsement
of the method's validity by the statistician responsible for writing
the British Standard on cusums—who, further, developed a refinement which
enhances the original method for the satisfaction of professional statisticians.[19]

One impression of the method's "unreliability", apparently widespread
and also described in the book, deserves special mention here since it involves
an interview with Morton by a television company who had asked for two
samples of writing to be analysed for homogeneity. Given Morton's opinion
that they were indistinguishable, the interviewer then dramatically revealed
that they were by a man convicted of corporate crime, and by the Lord Chief
Justice (Taylor). In actual fact, the "crook" was reading out a company report
compiled by his department so that the sample was not his own utterance
at all; moreover, the TV crew had got hold of a sample of writing by the wrong
Lord Taylor![20] As always, the integrity of the text is everything. Given corrupt
data, "wrong" answers will be inevitable: in computer jargon, "garbage
in, garbage out". If Morton can be faulted, it is surely in accepting too readily
samples whose origin he had not personally been able to verify. But to the
viewing public, it apparently seemed Q.E.D. and remains so among some academics
in the attribution field to this day, even if it calls for a certain naïvety
to believe that scientific validity can be decided by a ten-minute TV stunt.
The old "I saw it in the newspapers" seems to have been superseded by the
new "I saw it on TV".

Less well-remembered (or known at all) are the many blind tests where
the technique has been successful, some of them required by the presiding
judge in a court case. The most amazing example of these is perhaps the one
where the "challenger", or person setting the test (Sir Kenneth Dover),[21] did
the counting himself and passed over only the resulting numbers—not the
Greek text—to Morton. Morton promptly discovered the inserted passage.

Properly used, cusum analysis is a useful tool. A major advantage of the
method is that the results of analysing quite small samples can be visually
demonstrated in graphic charts which may be understood by the non-expert
(for example, a juror in a court case). As well as resistance to computers, there
can also be an in-built resistance among literary professionals to looking at
graphs. Yet a graph is only a way of presenting information. The more
familiar we become with pictorial ways of interpretation, the easier it becomes
to "read" the information in that form. How naturally we can now
read, for example, television weather maps: "See those isobars packed in
tightly together", says the TV Weatherperson, and we know we are in for a
spell of high winds.

The sample of utterance under investigation is first counted for sentence
length, and the deviation of each sentence in the sequence from the average
is accumulated as a running total (a cumulative sum: hence, cusum).
The second step is to count, again by
cumulative sum, some feature, usually called the "habit", of language-use
within the sentence. The nine tests available to the analyst are based on function
words, as already described. This is not the place to explore why such
features have been found to work, although speculation is intriguing.[22] We
need only point to the success of the method in analysing: natural utterance,
both written and spoken; edited text; translated work; children's writing;
dialect; and disputed utterance. Here we have an attributive measure of
great sensitivity which is objective and which works across both time of authorial
composition, and genre.
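
The two counting steps just described can be illustrated with a minimal sketch. It is not the analysts' own procedure: the sentence splitter is deliberately naive, the function names are hypothetical, and the "habit" counted (words of two or three letters, mentioned above as one candidate) is chosen only for illustration, where a real analysis would try several habits on each sample.

```python
import re
from itertools import accumulate


def split_sentences(text):
    """Naive sentence splitter (an assumption): break on runs of . ! or ?"""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def sentence_counts(text):
    """Return per-sentence word counts and per-sentence habit counts.

    The habit used here, words of two or three letters, is only an
    illustrative choice. Apostrophes count as characters of a word.
    """
    lengths, habits = [], []
    for sentence in split_sentences(text):
        words = re.findall(r"[A-Za-z']+", sentence)
        lengths.append(len(words))
        habits.append(sum(1 for w in words if len(w) in (2, 3)))
    return lengths, habits


def cusum(values):
    """Running total of each value's deviation from the mean of the series."""
    mean = sum(values) / len(values)
    return list(accumulate(v - mean for v in values))


sample = "The cat sat on a mat. Dogs bark loudly. It is late and we must go."
lengths, habits = sentence_counts(sample)
length_cusum = cusum(lengths)  # step one: sentence-length cusum
habit_cusum = cusum(habits)    # step two: habit cusum
for i, (a, b) in enumerate(zip(length_cusum, habit_cusum), start=1):
    print(f"sentence {i}: length cusum {a:+.2f}, habit cusum {b:+.2f}")
```

Because each series sums deviations from its own mean, it always returns to zero at the final sentence. The two series would then be drawn as charts for visual comparison; how they are scaled and read is not specified here and is left out of the sketch.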

The last claim is one which has occasioned a degree of scepticism, and it
is as well to outline the obstacles which must be overcome and habits of
mind which must be set aside before the technique can gain confident acceptance
by literary scholars. The cusum technique has nothing to do with
style or literary value. It is purely quantitative, not qualitative (remember
now those assumptions that statistics were analysing style?). Therefore, such
procedure as comes naturally to the literary critic, that is sensitivity to tone,
image, rhythm; comparison of like with like in terms of genre or date of
composition; the difference between poetry and prose—all these must be
set aside to turn instead to a study of language by measurable units, and that
unit normally the sentence. Whatever the soundness of the literary/stylistic
judgments brought to bear on the Elegye by Professor Vickers or others, such
judgments will remain interpretative rather than objective—indeed, a "game"
to some readers. That is why it is worth drawing attention to a method which
asserted five years ago that the Elegye had nothing to do with Shakespeare.

[19] Professor A. F. Bissell introduced a useful addition to any analysis, through the use
of "weighted cusums" and a t-test.

[20] A matter of no small annoyance to the real Lord Chief Justice.

[21] Sir Kenneth Dover is a past Master of Corpus Christi College, Oxford, past President
of the British Academy, and former Chancellor of the University of St. Andrews, Scotland.

[22] Farringdon et al., Analysing, "A Note on Linguistics", pp. 45-48.