§ 1 Njáls saga is the subject of a research project at the Árni Magnússon Institute in Reykjavík which aims to re-evaluate the textual transmission of the saga. The last critical edition (Gíslason and Jónsson 1875) appeared more than 135 years ago, and the work done in the project will contribute to the preparation of a modern critical edition. The short-term scientific aims of the project, however, are (a) a revision of the stemma of the manuscripts outlined by Einar Ólafur Sveinsson (1953) some sixty years ago and (b) an investigation of variation in the manuscripts from different scientific angles (material philology, linguistics, stylistics, literary studies).
§ 2 Part of this research is the investigation of synchronic linguistic variation in the oldest manuscripts of the saga from the fourteenth century. In this article I give a short outline of how the corpus for this research is compiled and prepared for linguistic analyses, give one example of an application of the methods on a limited part of the corpus, describe some of the typical problems of work with medieval manuscripts in the context of digital humanities, and try to sketch possible solutions for some of these problems.
§ 3 Njáls saga is usually seen as the pinnacle of the development of the Icelandic family saga, a genre which is not only Iceland's most important contribution to world literature, but also by far the largest corpus of original (i.e. non-translated) medieval prose texts we have in any Germanic language, which makes it especially interesting for historical-linguistic research. The text was presumably composed around 1280. The special status of Njáls saga in the literary consciousness of the Icelanders, comparable to the role of Shakespeare's Hamlet in English, Goethe's Faust in German, or Dante's Divina Commedia in Italian literature, is due to the stylistic and compositional quality of the text. With approximately 100,000 words, Njáls saga is the longest Icelandic family saga. Its popularity is emphasised by the unusually large number of medieval and post-medieval manuscripts (eighteen manuscripts between 1300 and 1500 and forty-five from the sixteenth to the nineteenth century are still extant), and citations from the saga have reached the status of proverbs used in everyday communication.
Examples of variation
§ 4 The following short sample from the beginning of chapter
sixty-three which deals with the so-called
Knafahólar is intended to give a picture of some typical examples
of the types of variation we find between the manuscripts of the saga. The text
follows what is apparently the oldest manuscript, the fragment Þormóðsbók (AM 162 B fol. δ) from
around 1300; variants come from manuscripts from about the same period, Gráskinna (GKS 2870 4to, ca.
1300) and Reykjabók (AM 468 4to, ca. 1300-1325).
§ 5 The transcription is diplomatic (see The diplomatic level: How to reconcile linguistic principles, practicability and philological traditions? below), i.e. the orthography of the original is kept (except for variation on the allographic level), abbreviations are expanded and italicised, upper-/lower-case letters and punctuation are kept, and line breaks, page breaks and column breaks are indicated with superscript numbers or letters. To facilitate a comparison of the different manuscripts, sentence numbers (see the Segmentation section below) are displayed.
§ 6 The English text follows Robert Cook's translation, which is based on the text from the mid-fourteenth century codex Möðruvallabók (AM 132 fol.). Möðruvallabók has not yet been included in the text corpus of the project. To facilitate a comparison of the English translation and the Old Icelandic original, deviating readings in Möðruvallabók (apart from differences in spelling and use of tenses) have been added in brackets to the transcription from Reykjabók. The additions are based on Andrea van Arkel-de Leeuw van Weenen's transcription (1987).
Þormóðsbók (AM 162 B fol. δ), p. 11v B:
§ 7 om bardaga
§ 8 1 NU eggiar Starkaðr ſina menn9 2 oc ſnero fram i neſit at þeim . 3 Sigurðr 10 ſuinhofþi for fyrſt oc hafði 11 torgv ſciolld rendan en ſverð i 12 annarri hendi 4 Gunnarr ſer hann oc ſcytr 13 til hanſ af boganom . 5 hann bra upp við 14 ſcilldinom er hann ſa at orin flo hatt 15 oc flo orin igegnom ſciolldinn oc au16gat ſua at vt flo ihnaccann oc uarð 17 þat uig fyrſt . 6 Annarri avr ſkavt Gunnarr at 18 vlfheþni heima manni ſtarkaðar oc kom ſv 19 a hann miðian . oc fell hann a fetr bon20da ſinom . en bundinn fell um hann þveran 21 7 kolſceggr kaſtar ſteini ihofoð bond22anom oc varð þat hanſ bani .
Gráskinna (GKS 2870 4to)
§ 9 30 1 <S>iþan egiaði ſtarkaðr menn ſina . 2 ſnva þeir nv framm ineſit at [40v] 1 þeim . 3 Sigurðr ſvinhavfði for fyrſtr oc hafði tavrgo ſkiolld ein2byrðan enn ſviðv i annarri hendi . 4 Gvnnarr ſer hann oc ſkytr af bog[a]ganom 3 5 hann bra vp ſkilldinom er hann ſa orina hatt flivga oc kom avrin i gegnom 4 ſkiolldinn . oc i avgat ſva at vt kom vm hnakkann oc varð þat vig 5 fyrſt . 6 annarri avr ſkavtt Gvnnarr at vlfheðni raða manni ſtadkaðar oc 6 kom a hann miðian . oc fell hann fyrir fœtr einom . oc fell bondinn vm hann . 7 kolskeggr 7 kaſtaði til ſteini oc kom ihavfvt bondanom . oc var þat hans bani .
Reykjabók (AM 468 4to) (Möðruvallabók, AM 132 fol.)
§ 10 [32v] 26 1 Siðan eg[g]iaði ſtarkaðr [ſin]a menn . 2 ſnua þeir 27 þa fram i neſit at þeim 3 [Sigurðr] ſvinhofði for fyrſtr ok hafði tor28gv ſkiolld einn rauðan (M: einbyrðan) en ſuiðv iannarri hendi . 4 Gunnarr ſer hann ok ſkytr til 29 hanſ afboganvm . 5 hann bra vpp hat[t] (M: -) ſkildinvm er hann ſa aurina hatt flygia ok kom 30 orin igegnvm ſkioldinn ok iaugat sva at vtt kom i (M: -) hnackann okvarð þat vig fyrſt 6 [33r] 1 annarri aur ſkaut Gunnar[r] (M: hann) at vlfheðni manni (M: ráðamanni) ſtar[c]kaðar okkom ſv a hann miðian okfell hann fyrir fetr bo2anda einvm ok (M: +fell) bondinn v[m] hann . 7 kolſkeggr kaſtar til ſteini okkom ihofut bondanvm okvarð þat hanſ 3 bani
English translation by Robert Cook
§ 11 1 Starkad then urged his men on; 2 they headed toward those who were on the point of land. 3 Sigurd Swine-head was out in front and was holding a small round shield, and a hunting-spear in the other hand. 4 Gunnar saw him and shot at him with his bow. 5 Sigurd raised his shield when he saw the arrow flying high, but it went through the shield and into his eye and out at the back of his neck. That was the first slaying. 6 Gunnar shot another arrow at Ulfhedin, Starkad’s overseer, and it struck him in the waist and he fell at the feet of a farmer, and the farmer tripped over him. 7 Kolskegg threw a stone and it hit the head of the farmer, and that was his death.
Variation on the lexical level
§ 12 Certain types of variation on the lexical level, for example the use of coordinating conjunctions (oc/ok [and], en [but], or no conjunction), the use or omission of anaphoric subject pronouns (þeir [they]) in coordinated main clauses, or the denomination of Úlfhéðinn as heimamaður (roughly [farmhand]), ráðamaður (roughly [bailiff]) or simply maður ([man], manni being the dative form), have to be regarded as deliberate changes of the text and thus as being of potential stylistic value (it should be pointed out that statements about the direction of the change or relationships between single manuscripts are not intended at this point).
§ 13 Other instances of lexical variation are of lesser interest for the part of the project that deals with synchronic linguistic variation and is mostly concerned with grammatical features, but are in some instances highly interesting in connection with stemmatological questions. Those variants are the result of unconscious changes due to misreadings by a scribe. A good example is the description of the weapons Sigurd Swine-head is using: a sword (sverð) in Þormóðsbók, but a certain type of spear used for hunting (sviða) in Gráskinna and Reykjabók. In the Icelandic family sagas (the search was based on the forty-three different texts in “Snerpa.is.” 2015), the word sviða is much less frequent than the word sverð (a total of seven instances in forty-three different texts compared to several hundred instances for sverð); an (unconscious) change from sviða (found in Gráskinna and Reykjabók) to sverð is therefore more likely than the opposite development (see also Sveinsson 1953, 77). For the description of Sigurður's shield the version of Gráskinna seems to fit the plot best. The information that the shield is einbyrður [single-layer] is useful because it explains why Gunnar's arrow so easily penetrates it. In comparison, the descriptions of the shield as red (in Reykjabók) or round (in Þormóðsbók) are superfluous or tautological. In the family sagas, the adjective einbyrður is also far less frequent than rauðr (rendr is rather uncommon too). It is tempting to hypothesise that einbyrðan was the original reading and was split up into two different words, einn and byrðan, by a scribe because it was separated by a line break in his exemplar (as is the case, for example, in Gráskinna, in the transcription above). However, an adjective byrðr [boarded] makes absolutely no sense in this context and would thus have been corrected, possibly to round or red, and einn (either a numeral, [one], or an indefinite pronoun, [certain]) would in this case also have been superfluous at best: of course Sigurd Swine-head uses only one shield, and the use of a postpositive indefinite pronoun ([a certain/some kind of target-shield]) would be rather marked in this context and not appropriate in connection with such a common weapon as a törguskjöld (a small round shield with a buckle). This would lead to the assumption that both Þormóðsbók and Reykjabók provide younger readings than Gráskinna (and also Möðruvallabók).
§ 14 The earliest manuscripts of Njáls saga exhibit several examples of synchronic grammatical variation that pose severe theoretical problems for linguistic approaches dealing with linguistic universals. A typical example that shows up in the text examples above is the order of the reflexive possessive pronoun sinn and its governing noun (sína menn/menn sína [his men]; examples are given in Modern Icelandic spelling). Language typology claims that the order of nouns and attributes, in this case the noun maðr [man] and the reflexive possessive pronoun sinn [his], is one of the features that determine the type of a language and should thus be stable. On the grounds of this typological axiom, variation in this core area of syntax in different copies of the same text produced at the same time is not to be expected. Nevertheless this variation can be observed, and fortunately linguistics is able to provide descriptive models to deal with the empirical fact of linguistic variation.
§ 15 A highly applicable model is Coseriu's (1988) model of a language as a system of different varieties that can be arranged on three different levels: location, social group, and circumstances. A language system at a certain time consists thus of different regional dialects, social dialects, and styles. If we act on the assumption that the variation in the use of certain grammatical constructions that can be observed in manuscripts of Njáls saga from about the same time can neither be attributed to historical developments nor to regional or social dialects, we have to deal with these types of variant as stylistic features.
§ 16 It is obvious that there is a connection between historical developments in a language, sometimes involving linguistic contact, and the style of certain types of texts. An example from Modern Icelandic that involves word order and may be comparable to variation in the order of noun and pronoun in the sample texts is the reverse order of negation and finite verb (negation-finite verb instead of finite verb-negation as in the standard language) in conjunctional clauses in certain types of religious and political texts that can be traced to developments in the history of Icelandic (Zeevaert 2009, 288). Nevertheless, in the modern Icelandic case the decision to use one of the two alternative structures is dependent not on historical-linguistic developments, but on (extralinguistic) circumstances (subject of the text, target audience, purpose). I hypothesise that for cases like the variation in the order of noun and attribute in the samples from the Njáls saga-manuscripts above, the same is true. To provide evidence for this hypothesis it is necessary to describe the usage of grammatical variables in different manuscripts on a larger scale in order to be able to attribute this variation to stylistic decisions dependent on certain non-linguistic circumstances.
§ 17 As was mentioned above, an examination of the relationships between the manuscripts and, if necessary, a revision of the stemma set up by Einar Ólafur Sveinsson (1953) is one of the main goals of the project. This examination is based on XML-transcriptions of the individual manuscripts that by and large follow the Menota-guidelines. Menota is an acronym for Medieval Nordic Text Archive, an internet resource publishing digital medieval Scandinavian texts. The guidelines, which are based on the TEI-standard of text-encoding but include modifications necessary for dealing with specific properties of Scandinavian manuscripts, are an excellent and exhaustive description of the procedure of encoding texts for this archive. In some rare cases a modification of the Menota-guidelines was necessary to meet specific demands of our project.
§ 18 In the last few years, the development of computer-based collation
and production of stemmata has advanced notably. The last step, that is to say
software to collate manuscripts and to generate stemmata that are customised for
the end-user, is still lacking. Juxta, an
open-source tool for comparing and collating multiple witnesses to a single
textual work (About Juxta 2015), originally developed at
the University of Virginia, comes quite close to this ideal. Unfortunately the
desktop version of the software shows some shortcomings that make it difficult
to use with Old Icelandic texts. Icelandic characters like á,
ð, æ, þ, ó or
é, not to mention peculiarities of medieval Icelandic
manuscripts like r-rotunda or insular f
(ꝼ), are not displayed (see Fig.
1); and even more serious is the fact that some rather important variants
were not found by the program. The difference between the text witnesses Þormóðsbók (the base text for the comparison) and Óssbók (AM 162 B fol. γ =
gamma.txt) on the one hand and Gráskinna and Reykjabók on the other hand is that the first two
manuscripts treat it as a fact that a certain Þiðrandi Síðu-Hallssonur was slain
by dísir (female guardian spirits), whereas the other two witnesses
treat this rather as a rumour.
§ 19 The use of the collation software Collate (description available in Kondrup 2011, 469) is a rather unrealistic option, as it runs only on older Macintosh computers and technical or other support is not available. Its successor, CollateX (developed within the framework of the COST action Interedition), is not yet ready for public use. However, a web application with restricted functionality is available (http://collatex.net/demo/) and gives quite promising results (see Fig. 2).
§ 20 The beta version of Juxta Commons, the online version of Juxta, does not exhibit the same problems with the handling of non-US-ASCII characters as the desktop version. Recently a function to compile a critical apparatus was added, and the results are much more reliable than the ones from Juxta, although still not completely error-free (see Fig. 3; in this example, the text from Þormóðsbók (W1) and Óssbók (W2) is represented correctly in the apparatus, whereas for the text from Gráskinna (W3) the words þann and að are erroneously omitted). From experience, however, accessibility and maintenance are generally problematic issues in connection with web-based solutions.
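The basic operation that all of the collation tools discussed above perform, aligning the words of a base text with those of another witness and reporting where they differ, can be illustrated with a minimal sketch. It uses only the `difflib` module from the Python standard library and is not a reconstruction of how Juxta, Collate, or CollateX work internally. The function name `collate` is hypothetical, and the two readings of sentence 4 of the sample have been simplified by hand to a common spelling for the purpose of illustration; they are not the project's normalised text.

```python
from difflib import SequenceMatcher


def collate(base, witness):
    """Return (base reading, witness reading) pairs where two token lists differ."""
    matcher = SequenceMatcher(a=base, b=witness, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side marks an omission or an addition.
            variants.append((" ".join(base[i1:i2]), " ".join(witness[j1:j2])))
    return variants


# Sentence 4 of the sample, hand-simplified spelling (an assumption of this sketch).
thormodsbok = "gunnarr ser hann ok skytr til hans af boganum".split()
graskinna = "gunnarr ser hann ok skytr af boganum".split()

print(collate(thormodsbok, graskinna))
```

For this pair the only variant reported is the omission of til hans in Gráskinna. A real collation has to cope with transpositions and with more than two witnesses, which this pairwise sketch does not attempt.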
Compiling and preparing the corpus
The purpose of transcribing
§ 21 The purpose of transcribing manuscripts is quite obvious. Medieval manuscripts are precious unique copies with limited accessibility. Transcriptions reproduce the relevant contents of the original and adapt its form to the demands of contemporary readers. It is less obvious, though, which features of the manuscript are relevant and to what extent its text has to be prepared for the reader. It is self-evident that an analysis of letter forms or the use of abbreviations in a manuscript cannot be performed on an edition with normalised and modernised orthography. In editions aimed at a general audience or intended for use in educational contexts, however, a normalised text is the appropriate choice because it avoids irritation caused by variation in spelling, scribal errors, or deviation from the linguistic standard familiar from dictionaries and grammar books.
§ 22 One of the main advantages of the TEI-XML-format is that it allows for an encoding of additional information in a transcription and gives the opportunity to choose which information is to be suppressed and how the chosen features are to be displayed on the computer screen or represented in files for further steps of analysis or preparation. This makes it possible to use the transcriptions not only as a basis for computer-aided collating and producing of stemmata, but also for a comparison of different manuscripts with regard to, for example, linguistic variation, and for editions of single manuscript witnesses.
Levels of transcription
§ 23 In the project The Variance of Njáls saga, the text is transcribed in three parallel versions or levels (<facs>, a type-facsimile transcription, <dipl>, a diplomatic transcription, and <norm>, a normalised transcription), an approach that is suggested in the guidelines for transcriptions in the Medieval Nordic Text Archive.
§ 24 The <facs>-level tries to reproduce the text of the manuscript in the style of a type facsimile: letter forms, abbreviation signs, punctuation, and layout are reproduced as accurately as possible using a special character set. The <facs>-level has proven to be a very useful tool for example in connection with analysing the use of abbreviations in certain manuscripts. Unfortunately, a transcription of the <facs>-level is very time consuming. It is not yet implemented in all manuscripts, and we decided to postpone it for the remaining manuscripts until the transcription of the project corpus is completed on the <dipl>- and <norm>-levels.
§ 25 The <norm>-level is a transformation of the diplomatic level to modern Icelandic spelling. A normalised orthography makes it, for example, much easier to search for words or morphological structures. The decision to use the modern Icelandic standard creates a certain distance between the transcription and the language of the original (although the distance between contemporary and medieval Icelandic is much smaller than in other European languages). The advantage in comparison to a reconstructed historical normalised orthography as, for example, used in the ONP (Ordbog over det norrøne prosasprog) is its easier applicability and better documentation which helps to avoid errors and inconsistencies during transcribing.
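How parallel levels of one and the same word can be stored together and read out selectively may be pictured with the following minimal sketch. The element names follow the level names used above; the exact markup (a `choice` element inside each word, and the two namespace URIs) is an assumption of this sketch and should be checked against the Menota guidelines. The function name `read_level` is hypothetical. The three words are taken from sentence 1 of the Þormóðsbók sample, with the normalised forms mentioned in this article.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions made for this sketch.
NS = {
    "tei": "http://www.tei-c.org/ns/1.0",
    "me": "http://www.menota.org/ns/1.0",
}

SAMPLE = """
<s xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:me="http://www.menota.org/ns/1.0" n="1">
  <w><choice><me:dipl>Starkaðr</me:dipl><me:norm>Starkaður</me:norm></choice></w>
  <w><choice><me:dipl>ſina</me:dipl><me:norm>sína</me:norm></choice></w>
  <w><choice><me:dipl>menn</me:dipl><me:norm>menn</me:norm></choice></w>
</s>
"""


def read_level(xml_text, level):
    """Collect the readings of one level ('facs', 'dipl' or 'norm') word by word."""
    root = ET.fromstring(xml_text)
    words = []
    for w in root.iter("{%s}w" % NS["tei"]):
        reading = w.find("tei:choice/me:%s" % level, NS)
        if reading is not None:  # a level may be missing for a given word
            words.append("".join(reading.itertext()))
    return words


print(read_level(SAMPLE, "dipl"))
print(read_level(SAMPLE, "norm"))
```

Because the sample contains no <facs>-level, `read_level(SAMPLE, "facs")` returns an empty list, which mirrors the situation of manuscripts for which that level has not yet been transcribed.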
§ 26 No morphosyntactical adaptation of the language of the
manuscripts is made on the <norm>-level: thus, for example, forms like
vér and þér (modern Icelandic við
[we] and þið [you], 1PL and 2PL of the personal pronoun) or
em (modern Icelandic er [am], 1SG.PRS.IND of
vera [to be]) are kept. This decision ensures an
unambiguous approach to the normalisation of the manuscript texts
(adaptations to modern language use only in the phonological but not in the
morphological domain). However, it entails some limitations with respect to
the straightforwardness of searches of words and forms and leads also to
anachronistic combinations of the morphological and phonological shape in
ég em maður skapharður (Modern
ég er maður skapharður,
normalised Old Icelandic
ek em maðr
skapharðr [I'm a man harsh of mood]). Historical phonological
changes from after the time of the writing of the manuscript, like the
epenthesis of u after a consonant and before a word-final
r (in the text example above the names
Starkaðr and Sigurðr become
Starkaður and Sigurður on the normalised
level) or the shortening of final rr in weakly stressed endings
(in the example from Þormóðsbók above the
name Gunnarr, which becomes Gunnar on the
normalised level), also affect morphological endings, which makes it, in some
cases, difficult to distinguish between morphological and purely
§ 27 The graphic distance from the original does not constitute a disadvantage for linguistic analyses of the text, which in any case have to be based on the <dipl>-level, but is a precondition for the successful use of collation software. An issue with machine-based approaches to manuscript collation is the difficulty of distinguishing between important and unimportant variants. In most cases, variation on the graphic level is not of interest for stemmatological questions; a normalised text is thus able to sieve a considerable amount of irrelevant variation. For the examples of comparisons of manuscripts with CollateX, Juxta, and Juxta Commons above the normalised versions were used. A positive side-effect of the modern Icelandic standard is the better accessibility of the text to an Icelandic audience.
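The filtering effect of a normalised text can be made concrete with a toy example. The folding rules below (long s to round s, v to u, c to k, lower-casing) are a crude stand-in invented for this sketch; they are not the project's <norm>-level, which is a full transformation to modern Icelandic spelling. The function names are hypothetical, and the word-by-word comparison assumes that both witnesses have the same number of words, which holds for the opening of sentence 4 in Þormóðsbók and Gráskinna used here.

```python
# Toy folding table (an assumption of this sketch, not the <norm>-level).
FOLD = str.maketrans({"ſ": "s", "v": "u", "c": "k"})


def fold(token):
    """Reduce purely graphic variation in a single word."""
    return token.lower().translate(FOLD)


def differing(a, b, key=lambda token: token):
    """Word pairs that differ after applying `key`; assumes equal length."""
    return [(x, y) for x, y in zip(a, b) if key(x) != key(y)]


thormodsbok = "Gunnarr ſer hann oc ſcytr".split()
graskinna = "Gvnnarr ſer hann oc ſkytr".split()

print(differing(thormodsbok, graskinna))            # graphic variants reported
print(differing(thormodsbok, graskinna, key=fold))  # nothing left to report
```

Compared letter by letter, the two witnesses differ in two words; after folding, no variant remains, so a collation tool fed with folded or normalised text would not flag this passage at all.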
The diplomatic level: How to reconcile linguistic principles, practicability and philological traditions?
§ 28 A diplomatic version of the manuscript text which keeps the spelling of the manuscript but expands abbreviations, corrects errors, and normalises allographs without phonological value seems to be most appropriate for linguistic (apart from palaeographical) purposes. Such a version is represented by the <dipl>-level of our transcriptions. Unfortunately clear descriptions of how to approach diplomatic transcriptions of Old Icelandic manuscripts are still lacking, a fact already mentioned by Knirk (1985, 612). The editorial practice of the Editiones Arnamagnæanæ, the series of critical editions of Old Icelandic texts published by the Arnamagnæan institutions in Copenhagen and Reykjavík, seems, to a large extent, to build on traditions developed in those institutions that partly reflect technical limitations now resolved (Jensen 1989, 211; Sigtryggsson 2005, 265-268). Aspects of this practice are documented in the introductions of some of the Arnamagnæan editions (for example in Chesnutt 2006, LXV-LXVIII); Knirk (1985) and Jóhannes B. Sigtryggsson (2005) give descriptions of the treatment of punctuation, word division, capitalisation, graphemes, expansion of abbreviations, and corrections of scribal errors.
§ 29 Unfortunately, however, some rather basic features like the non-distinction of letter forms/signs which render no phonological distinction are disputed in the literature: the distinction between <s> and <ſ>, for example, is described as solely palaeographical by Driscoll (2006, 255) but as potentially phonological (with <s> for /s:/ and <ſ> for /s/), at least in the oldest manuscripts, by Gunnlaugsson (2003, 202) and Haugen (2004, 92).
§ 30 Medieval manuscripts were written for contemporaries and not for twenty-first-century linguists, and to a modern reader medieval writing practice may often appear to be unsystematic. In Þormóðsbók (Knirk 1985, 609 giving examples from other manuscripts), the round form of s is used in different functions that partly overlap: most commonly (in approximately seventy percent of the cases) it is used in connection with abbreviations – obviously because, by contrast with <ſ>, it leaves space for superscript letters or abbreviation marks – and at the end of words. It is less common that round <s> is used instead of a geminate consonant (approximately twelve percent of the cases) or at the beginning of proper nouns (approximately eight percent). The two latter functions are also fulfilled by the small capital forms of g, n, and r (<ɢ>, <ɴ> and <ʀ>), which, in contrast to round <s>, exhibit clearly distinguishable forms for capitals and minuscules. Such a distinction on formal grounds is not possible for s. A distinction according to function, for example using <ſ> in the transcription for <s> with superscript abbreviation signs, superscript <s> and short final <s> in the manuscript and <s> in the transcription for <s> in the beginning of names and when it stands for /s:/ in the manuscript, seems not to be applicable, not only because of the functional overlap of some instances of <s>, but also because it would violate the principle of keeping the spelling of the manuscript and would not reduce phonologically allographic variation, but rather establish a phonological differentiation that is graphically not systematically present in the manuscript.
§ 31 A comparable issue is the distinction between u and v. The use of the round or the pointed variant is partly characteristic of certain Latin scripts (mainly due to the use of different writing tools), but often <u> and <v> are used as allographic variants, to some extent distributed according to their position in the word; <u> is, for example, typical for Carolingian minuscule (Derolez 2006, 52), whereas the pointed form <v> is reintroduced in the ninth century and used particularly at the beginning and to a lesser extent at the end of words (Bischoff 1986, 156). A differentiation between two graphemes <u> and <v> is post-medieval (Mazal 1986, 82). In Icelandic manuscripts the situation is complicated by the use of <ꝩ>, originally developed in England. It is traditionally transcribed as <v>. In Þormóðsbók it is mainly used word-initially, but in a substantial number of cases it is also used in medial position, and in rare cases in final position. It is clear from spellings like <hlꝩt>, <hlvt>, and <hlut> for hlut (ACC.SG of hlutr [thing]) that there is free variation between the three allographs (i.e. no complementary distribution), although clear preferences for certain positions are visible (<ꝩ> 80.5% initial, <v> 51% initial, <u> 52.5% medial).
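Positional preferences of this kind can be counted mechanically once a transcription that keeps the allographs apart is available. The following is a minimal sketch of such a count, not the procedure used to obtain the percentages above; the function names are hypothetical. The word list is a tiny illustrative set made up of the three spellings of hlut cited above and four words from the Þormóðsbók sample, so its figures say nothing about the manuscript as a whole.

```python
from collections import Counter, defaultdict


def position(index, length):
    """Classify a letter as word-initial, word-final or medial."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def allograph_positions(tokens, allographs):
    """Count for each allograph how often it occurs in which position."""
    counts = defaultdict(Counter)
    for token in tokens:
        for i, letter in enumerate(token):
            if letter in allographs:
                counts[letter][position(i, len(token))] += 1
    return counts


def shares(counter):
    """Turn absolute counts into percentages rounded to one decimal."""
    total = sum(counter.values())
    return {pos: round(100 * n / total, 1) for pos, n in counter.items()}


# Illustrative tokens only; far too few for any real conclusion.
tokens = ["hlꝩt", "hlvt", "hlut", "vt", "upp", "uarð", "varð"]
counts = allograph_positions(tokens, allographs="ꝩvu")

for letter in "ꝩvu":
    print(letter, shares(counts[letter]))
```

Overlapping percentages across the three letters, rather than a clean split by position, are what free variation with positional preferences looks like in such a table.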
§ 32 Thus both <ꝩ> and <v> are rendered as
v on the diplomatic level, whereas <u> and <v>
are kept apart because they represent different phonemes in early Icelandic
manuscripts (Hreinn Benediktsson 1965,
26), as in Modern Icelandic. The letters c, q, and
k are distinguished on the diplomatic level. Their
distribution in Icelandic manuscripts is for the most part conditioned by
the graphic context. In contrast to the distribution of
r-rotunda and normal r, however, where
r-rotunda, at least originally, was mainly used
after the round letters o and ð, this distribution
is primarily phonologically motivated. According to Hreinn Benediktsson
(1965, 30-31), c was
replaced by k before front vowels in medieval texts written in
Germanic languages in several countries. In Medieval Latin pronunciation,
the unvoiced velar stop, represented by c in the Latin
alphabet, was palatalised before front vowels, and the usage of the variant
k clarified that this pronunciation was not to be applied
in this context in Germanic languages. When the influence of Latin writing
was reduced by the increasing number of manuscripts in Icelandic, this
context became obscured and <c> and <k> were used as variants
without regard to phonological/graphic context. Disregarding cases like the
exclusive usage of c for the Roman number for
hundred and as the first letter in combinations that stand
for the long unvoiced velar stop (<cc> and <ck> are used in the
manuscript, but not <kk>), it has to be stated that the allographs
<c> and <k> are used in free variation in Þormóðsbók, but with a pronounced tendency to prefer
k word-initially and c word-finally (although
the latter tendency becomes much less prominent if the conjunction
oc [and] is not taken into account). The letter q is
exclusively used before u/v; c and
k are not used in this position. This practice does not
rest upon a phonological difference in Old Icelandic but is ultimately
adopted from Latin orthography (Hreinn
Benediktsson 1965, 33).
§ 33 Traditionally the decision on which allographs to keep in an edition and which not seems to rely at least partly on modern usage; distinctions involving allographs not present in the language at the time of the edition tend to be levelled out. Variants such as r-rotunda and normal r or insular f and normal f are more endangered than those still in use (for example <u> and <v> or <q> and <k> to the present day in English and German, <ſ> and <s> until the beginning of the twentieth century in Denmark). In this article, the r-rotunda used in Þormóðsbók and Gráskinna had to be replaced with the normal r in the example-texts from those manuscripts to avoid problems with the display of this character in browsers. Against this background it seems to be acceptable that the treatment of allographs is not based on phonological reasoning alone but considers also to a certain extent the traditions of Old Norse philology, and that not only phonological differences from the time of the writing of the manuscript are rendered, but also earlier and later developments. After all, the manuscripts dealt with in the project cover a period of more than five hundred years.
§ 34 Another typical issue in transcriptions of Medieval Icelandic texts is the expansion of abbreviations. Icelandic manuscripts are characterised by, in comparison to Continental manuscripts, an excessive use of abbreviations. Icelandic abbreviation practice, which is mainly based on insular Latin traditions, exhibits a fairly unequivocal system. A problem for the transcription is not so much the identification of what the abbreviation stands for, but which orthography to use for the expanded form. In most cases abbreviations consist of highly conventionalised symbols that stand in an iconic relationship to the characters they represent (in the sense that the abbreviation sign renders the form of letters it stands for). This iconic relationship, however, is partly obscured by the fact that letter forms have changed from the time of the creation of the abbreviation sign. The abbreviation for ra originally represents an open a (Cappelli 1967, XXVII), which is not used in Icelandic manuscripts from the fourteenth century. Furthermore, phonological change that is rendered in the orthography is not necessarily visible in abbreviation signs. The abbreviation for er is frequently used to abbreviate the word-ending that in the oldest manuscripts is written out -er, but from the thirteenth century to the end of the Middle Ages ‑ir (Þórólfsson 1925, XXIIf.).
§ 35 The usual practice is to expand abbreviations according to comparable unabbreviated forms used by the scribe and, in the (usual) case of orthographic variation, to use the most frequent forms. In other words, a diplomatic edition should seek to render the text as it was intended by the scribe.
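The rule of thumb described here, expanding an abbreviation in the spelling the scribe uses most often when writing the same ending out in full, can be sketched as follows. This is an illustration of the principle under simplifying assumptions, not an editorial procedure: the function names are hypothetical, the candidate endings ‑ir and ‑er are taken from the example above, and the word forms in the sample list are invented for the purpose of the demonstration.

```python
from collections import Counter


def preferred_ending(written_out_tokens, candidate_endings):
    """Return the candidate ending the scribe writes out most often, or None."""
    counts = Counter()
    for token in written_out_tokens:
        for ending in candidate_endings:
            if token.endswith(ending):
                counts[ending] += 1
                break  # count each token for one ending only
    return counts.most_common(1)[0][0] if counts else None


def expand(stem, written_out_tokens, candidate_endings=("ir", "er")):
    """Expand an abbreviated ending according to the scribe's own usage."""
    ending = preferred_ending(written_out_tokens, candidate_endings)
    if ending is None:
        return None  # no evidence in the manuscript: leave to the editor
    return stem + ending


# Invented unabbreviated forms standing in for one scribe's usage.
written_out = ["allir", "aller", "aðrir", "þeſſir"]

print(preferred_ending(written_out, ("ir", "er")))
print(expand("ſyn", written_out))
```

Where the written-out forms offer no evidence, the sketch returns `None` instead of guessing, since that decision remains an editorial one.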
§ 36 To facilitate a comparison between different manuscripts, a
segmentation was added to the text. This segmentation makes it easier for
collation programs to deal with differences caused by additions, omissions,
or rearrangements of larger chunks of text, but it is also of great use for
a comparison of linguistic features because it facilitates the finding of
text corresponding to a certain structure with regard to context but
deviating formally. In contrast to other major medieval literary, especially
poetic, works (for example the Old French Chanson de
Roland, Wolfram's Parzival, or Dante's
Commedia), a generally recognised or
self-evident reference system does not exist for Njáls
saga. We therefore decided to introduce a system based on the
smallest self-contained textual unit, the sentence. In this context the
sentence does not refer to a syntactical
unit, but is used with regard to (semantic) contents. A similar system
(chapter and verse) is used very successfully to identify corresponding
textual units in the Bible. This segmentation was based on the latest
edition of the text (Egilsson 2003) which
adds punctuation marks according to modern usage that could easily be used
to identify beginnings and endings of sentences. The TEI-conventions allow
for the implementation of a numbered segmentation based on sentences which
is compatible with a more precise syntactical segmentation.
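How a numbered sentence segmentation helps to line up corresponding text in different witnesses can be illustrated with a short Python sketch. It assumes, purely for illustration, that each sentence is wrapped in an `<s>` element whose `n` attribute has the form chapter.sentence; the sigla A and B and the sample sentences are invented.

```python
import xml.etree.ElementTree as ET


def sentences_by_number(xml_text):
    """Map each sentence number (the n attribute of <s>) to its plain text."""
    root = ET.fromstring(xml_text)
    return {
        s.get("n"): " ".join("".join(s.itertext()).split())
        for s in root.iter("s")
    }


def align(witnesses):
    """Return {sentence number: {siglum: text or None}} for all witnesses."""
    indexed = {sig: sentences_by_number(x) for sig, x in witnesses.items()}
    numbers = sorted(
        {n for idx in indexed.values() for n in idx},
        key=lambda n: tuple(int(part) for part in n.split(".")),
    )
    return {n: {sig: idx.get(n) for sig, idx in indexed.items()} for n in numbers}


# Invented sample transcriptions of two hypothetical witnesses.
witnesses = {
    "A": '<text><s n="1.1">Gunnarr reið heim</s><s n="1.2">ok var hann reiðr</s></text>',
    "B": '<text><s n="1.1">reið Gunnarr heim</s></text>',
}
table = align(witnesses)
for number, readings in table.items():
    print(number, readings)
```

A sentence that is missing in one witness simply appears as `None`, so omissions and additions do not shift the alignment of the following sentences.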
§ 37 Software for academic purposes, not least in the area of digital humanities, tends to be commercially less promising than software for commercial purposes, and is thus often dependent on public funding, which is subject to financial and political conditions. We therefore decided not to rely solely on external solutions but to develop simple approaches that can be mastered with project-internal expertise and without further costs.
§ 38 Finding variation, especially of the linguistic type, in different versions of the same text requires the identification of certain structures to be compared and the output of corresponding chunks of text potentially containing this structure in the different manuscripts. The structure of our transcriptions allows for both a tagging of the structures in question and an identification of corresponding units of contents in different transcriptions, and to us the use of XSLT-style sheets appeared to be a manageable way to prepare this information for further research. Most tasks can be accomplished with a limited number of basic style sheets that can be easily adapted to certain demands.
§ 39 Typical linguistic variables for the earliest manuscript-fragments of Njáls saga are the position of the finite verb (verb-first or verb-second order), the order of noun and attribute (attribute before noun or noun before attribute), but also other stylistic phenomena such as historical present tense vs. past tense or the use of either accusative with infinitive or conjunctional subordinate clauses in indirect speech (examples for all variables below Fig. 5, Fig. 6, Fig. 7).
§ 40 In order to find and compare these variables in different manuscripts, a mark-up of certain grammatical information was added to the XML-transcriptions. The Menota-guidelines provide a complete tag set for the morphosyntactic annotation of Old Icelandic texts. A complete tagging of the text, however, is excessive in relation to the task. For an investigation of the above-mentioned variables, a tagging of parts of speech (noun, verb, adjective, etc.) and a partial morphosyntactic annotation seem to be sufficient. To find instances of historical present tense, a tagging of the tense of verbs is self-evident, but a marking of direct speech to distinguish instances of historical present tense in the narrative parts of the text from normal present tense used in dialogues is also necessary. For the variables involving word order it is not sufficient to tag parts of speech. What has to be added is case when nouns in the genitive are used as modifiers and a marking of syntactical units, noun phrases for the order of noun and modifier, and the beginnings of main clauses to determine the position of the finite verb. Subordinate-clause constructions in indirect speech (conjunctional clauses, accusatives with infinitive) require a marking of clause-type and tagging of the parts of speech decisive for the construction (conjunction, finite verb in the middle voice plus infinitive).
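The interplay of tense tagging and a marking of direct speech can be sketched as follows. The sketch makes several assumptions that are not taken from the project's transcriptions: direct speech is wrapped in a `<q>` element, the morphosyntactic attribute is written as plain `msa` without the Menota namespace prefix, and the value `tPS` stands for present tense. The sample sentence is invented.

```python
import xml.etree.ElementTree as ET


def narrative_present_verbs(xml_text):
    """Collect present-tense verbs that are not inside direct speech."""
    root = ET.fromstring(xml_text)
    hits = []

    def walk(elem, in_speech):
        # Everything below a <q> element counts as direct speech.
        in_speech = in_speech or elem.tag == "q"
        msa = elem.get("msa", "").split()
        if elem.tag == "w" and "xVB" in msa and "tPS" in msa and not in_speech:
            hits.append(elem.text)
        for child in elem:
            walk(child, in_speech)

    walk(root, False)
    return hits


# Invented sample: one present-tense verb in the narrative, one in dialogue.
sample = (
    '<s><w msa="xNP">Gunnarr</w> <w msa="xVB fF tPS">ríðr</w> '
    '<w msa="xAV">heim</w> '
    '<q><w msa="xPE">ek</w> <w msa="xVB fF tPS">em</w> '
    '<w msa="xAJ">reiðr</w></q></s>'
)
print(narrative_present_verbs(sample))  # prints ['ríðr']
```

Without the marking of direct speech, the verb in the dialogue would be counted as an instance of historical present tense as well.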
§ 41 An extension of the tagging that might be useful for further research questions is unproblematic and can be based on the already realised tagging at a later stage.
§ 42 Of course it is possible to find strings of characters (including tags) with the search-function of text- or XML-editors. A considerably more efficient method, however, is to use XSLT-style sheets (Extensible Stylesheet Language Transformations) that either transform the XML-file to HTML or generate PDF- or text files. The use of style sheets allows the definition of both content and form of the output, for example to limit the output to the <dipl>-level or to define a certain font style for expanded abbreviations.
This allows for the output as a PDF-document with information about page numbers, columns, line numbers, emendations, and expanded abbreviations (Fig. 9).
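The idea of restricting the output to one level of transcription and of giving expanded abbreviations a font style of their own can also be illustrated outside XSLT. The following Python sketch is not the project's style sheet. It assumes a simplified structure in which every word contains `<facs>`, `<dipl>` and `<norm>` children inside `<choice>` and in which expansions are marked with `<ex>`; namespaces are left out and the sample words are invented.

```python
import xml.etree.ElementTree as ET
from html import escape


def dipl_to_html(xml_text):
    """Render only the <dipl> level as HTML, with <ex> content in italics."""
    root = ET.fromstring(xml_text)
    words = []
    for dipl in root.iter("dipl"):
        parts = [escape(dipl.text or "")]
        for child in dipl:
            inner = escape("".join(child.itertext()))
            # Expanded abbreviations are set in italics, everything else as is.
            parts.append(f"<i>{inner}</i>" if child.tag == "ex" else inner)
            parts.append(escape(child.tail or ""))
        words.append("".join(parts))
    return "<p>" + " ".join(words) + "</p>"


# Invented sample with one abbreviated and one unabbreviated word.
sample = (
    "<s>"
    "<w><choice><facs>þr</facs><dipl>þ<ex>ei</ex>r</dipl><norm>þeir</norm></choice></w>"
    "<w><choice><facs>ſnero</facs><dipl>ſnero</dipl><norm>sneru</norm></choice></w>"
    "</s>"
)
print(dipl_to_html(sample))  # prints <p>þ<i>ei</i>r ſnero</p>
```

Selecting a different level only requires iterating over `facs` or `norm` instead of `dipl`, which corresponds to changing one match pattern in a style sheet.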
Of much more interest for a linguistic analysis and comparison of the manuscripts, however, is the possibility to count and display certain tagged structures.
§ 43 One of the features that is usually emphasised in descriptions of the literary quality of the Icelandic family sagas is their characteristic style, which is usually described as objective, controlled, lacking decorative figures, syntactically uncomplicated, and containing elements of oral style and so forth (Bollason 2011, 16; Szokody 2002, 985; Hallberg 1969, 63 etc.). Amongst the typical features of this style Hauksson and Óskarsson mention transpositions of word order and especially the so-called narrative inversion, that is the order finite verb–subject instead of the unmarked order subject–finite verb (1994, 273ff.).
§ 44 This stylistic feature is also present in Njáls saga, but during our work with the manuscripts we observed differences in its use between different text witnesses. To quantify these observations and to prepare a more objective and less random analysis of these differences in the corpus, we tagged clauses and finite verbs in two chapters of Njáls saga in four different manuscripts: Betabrotið (AM 162 B fol. β), Kálfalækjarbók (AM 133 fol.), Skafinskinna (GKS 2868 4to) and Gráskinna (GKS 2870 4to) (the fragments cover different parts of the text, which limits the number of corresponding chapters in different manuscripts).
§ 45 The beginning of clauses was marked with the <cl>-tag,
finite verbs were tagged as
xVB fF (cf. Fig. 10), which makes it easy to count and output clauses
beginning with a finite verb using the XPath-expression
(//s[.//cl[*[contains(@me:msa,'fF')]]]) (Fig. 11).
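The same kind of query can be approximated with Python's standard library, which may be convenient where no XSLT processor is at hand. The sketch below is an illustration under assumptions, not the project's implementation: the attribute is written as plain `msa` without the Menota namespace prefix, the sample sentences are invented, and the function tests explicitly whether the first child of a `<cl>` element is tagged as finite, which is how the aim of finding clauses beginning with a finite verb is understood here.

```python
import xml.etree.ElementTree as ET


def verb_first_sentences(xml_text):
    """Return the numbers of sentences with a clause that opens with a finite verb."""
    root = ET.fromstring(xml_text)
    hits = []
    for s in root.iter("s"):
        for cl in s.iter("cl"):
            first = next(iter(cl), None)  # first child element of the clause
            if first is not None and "fF" in first.get("msa", "").split():
                hits.append(s.get("n"))
                break  # count each sentence only once
    return hits


# Invented sample: sentence 1.1 is verb-first, sentence 1.2 is subject-first.
sample = (
    '<text>'
    '<s n="1.1"><cl><w msa="xVB fF">reið</w><w msa="xNP">Gunnarr</w>'
    '<w msa="xAV">heim</w></cl></s>'
    '<s n="1.2"><cl><w msa="xNP">Gunnarr</w><w msa="xVB fF">reið</w>'
    '<w msa="xAV">heim</w></cl></s>'
    '</text>'
)
print(verb_first_sentences(sample))  # prints ['1.1']
```

Because the function returns sentence numbers, the hits from one witness can be used directly to look up the corresponding sentences in the other witnesses.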
§ 46 This query was performed on all four transcriptions, and the results were revised for incorrect examples. Since a tagging of clause-types was not implemented in the XML-transcriptions, non-declarative (interrogative and imperative) and other inapplicable sentences were removed during this revision. The sentences corresponding to all valid examples (identified by chapter- and sentence-number) were then output from all texts, with interesting results.
§ 47 Twelve sentences (out of fifty) featured narrative inversion in at least one of the texts, but only one of them in all texts. The distribution was rather unbalanced between the different manuscripts: Betabrotið contained only three (2.5%), and Gráskinna eleven (9.5%) examples.
§ 48 Of course these results have to be interpreted with some
reservation. The analysis was done only on two chapters out of 159, and only
one stylistic feature was examined. In addition to this, narrative inversion
in Icelandic is a bit more complex. For Table 1, only unintroduced main clauses were
considered. In some of the manuscripts, however, narrative inversion after
the conjunction ok [and] (searched for with the
*[contains(@me:msa,'fF')]]]) is clearly more frequent than in
unintroduced main clauses (cf. Fig. 12).
§ 49 Christoffersen's remarks on narrative inversion (she prefers
discourse cohesion) in Old Nordic reveal that
it is more frequent in main clauses introduced with ok than in
unintroduced main clauses, although she is rather cautious about clear
statements (2002, 185). A general
problem for a precise survey seems to be that different texts behave
differently, and it is interesting to see that in the small sample from our
project's corpus this tendency also holds true for different contemporaneous
manuscripts of the same text (see Table
|Nr.||AM 162 B fol. β||AM 133 fol.||GKS 2868||
§ 50 The main problem in the automatic comparison of different manuscripts of medieval texts is not so much to identify textual variants, but to distinguish between important and unimportant variants. Software for an automatic collation of texts, like Juxta or the different versions of Collate, relies mostly on a comparison of chunks of text, which does not always lead to satisfying results, as the example in Fig. 1 shows. In the case of Njáls saga, a further problem is the number of sixty-three text-witnesses, which by far exceeds the number of texts that can be handled by software like Juxta or Juxta Commons. Even if such a number could be handled, the result would be a critical apparatus too large to be useful for a specific analysis of, for example, word order.
§ 51 Research interests in connection with Old Icelandic texts are
manifold and require different kinds of transcriptions and annotations. Printed
editions are usually designed with a certain audience in mind (Kondrup 2011, 43-86) which very often excludes
their usage for, for example, linguistic research questions (cf. the discussions
in Marqués-Aguado 2013 and Beltrami 2013). One of the advantages of
digital representations of manuscript texts over printed editions is their
flexibility and adaptability to different demands. A complete transcription of
all manuscript-witnesses of a text is usually less time-consuming than a manual
collation if a suitable work-flow is applied (Andrews
2013, 67), and the possibilities to add additional levels of
transcription or information needed for different kinds of analyses to a digital
transcription are virtually unlimited. Thus, the actual challenge is not so much
to develop methods for an automatic assessment but to prepare suitable data for
work on different research questions with as little time and effort as possible.
From my experience, this development is still very much influenced by the
tradition of printed critical editions; a typical example from the project
The variance of Njáls saga
is the treatment of variants of letter forms (see the previous section: The diplomatic level: How to reconcile linguistic
principles, practicability and philological traditions?).
§ 52 In this article I have described the methods used in the project
The variance of Njáls saga
to detect variation on different levels and I have put special emphasis on
convenient tools and approaches used to find and analyse linguistic variation
between manuscripts. This approach can be used to deal with most cases of
linguistic variation, and the example of narrative inversion (see the previous
section: Narrative inversion) shows that it is able
to revise earlier research based on printed editions (Zeevaert 2014, 985-986 presents a stronger
focus on results). Software designed to automatically compare different versions
of a text can be a useful aid for this research. At the moment, however, the use
of convenient tools and approaches as they are described in this article seems
to be a more promising way to come to conclusions about linguistic differences
between manuscripts of a text.
. Cf. for example
sögu (1991, VII). A self-evident terminus
ante quem is the age of the oldest extant manuscripts that
can be dated to around 1300 (± 25 years). The terminus post
quem is usually determined with regard to the usage of certain
judicial proceedings and technical terms (for example the Low German
loan word prófa [to examine]) that do not show up in the
laws of the Icelandic free-state but are of Norwegian provenance. The
laws of the free-state were replaced by Járnsíða in 1271, and according to Einar Ólafur Sveinsson
(1954, LXXVIII) it is quite
likely that Járnsíða was used as a source by
the author of Njáls saga; for example, the
með lögum skal land várt byggja, en með ólögum eyða [with
law our land shall rise, but it will perish with lawlessness] in chapter
seventy of Njáls saga seems to be taken
directly from Járnsíða, but it is assumed that
it took some years for the new legal customs and law codices to have an
effect on the writing of a saga; Einar Ólafur Sveinsson (1933, 299 ff., esp. 310).
. It is assumed that the use of the sobriquets instead of call numbers makes it easier for the reader to distinguish between the different manuscripts. The call numbers (GKS stands for Gammel Kongelig Samling, the Old Royal Collection, AM for Den Arnamagnæanske håndskriftsamling, the Arnamagnean Collection) are given in brackets at first mention and in the list of Source texts. Þormóðsbók is named after the seventeenth-century Icelandic historian Þormóður Torfason, the name Gráskinna refers to the sealskin cover of the codex, and the names of Reykjabók and Möðruvallabók to the provenance of the manuscripts (Reykir and Möðruvellir are Icelandic place names).
Þormóðsbók, sentences one and two are
coordinated by oc [and], and the subject in sentence two
is omitted, which is to be expected if both sentences have a common
subject. In this case, however, the omission of the pronominal
subject þeir [they] in sentence two in Þormóðsbók is sylleptic. The verb form
snero refers to a subject in the plural,
Starkaðr and his men; the subject in sentence one,
however, is a singular, Starkaðr. Thus, the usage or
omission of a pronominal subject in the plural may be taken as a
conscious stylistic decision.
. Zeevaert (2012, 173 ff.) challenges the idea of a universal principle of consistency in the order of modified and modifying elements in phrases and clauses, which is at the basis of typological approaches to syntax in the Greenbergian tradition, on grounds of lacking – and in the case of the Scandinavian languages counterfactual – empirical evidence.
. Þormóðsbók: sá er dísir drápu/Óssbók: þann er að dísir vægju [the one who was slain by dísir;] Gráskinna: þann er sagt að dísir vægju [he is said to have been slain by dísir]; Reykjabók: þann er sagt er að dísir vægju [the one who is said to have been slain by the dísir].
. Haugen (2004, 94) and Driscoll (2006, 254) point to the fact that the term
diplomatic edition covers a continuum from
strictly diplomatic editions aiming at reproducing every feature of
a manuscript to semi-diplomatic editions giving no information about
expanded abbreviations or the layout and the punctuation of the
original. The method applied here corresponds by and large to what
is described by Guðvarður Már Gunnlaugsson (2003, 202) as a stafrétt
útgáfa [literal edition]. Especially for historical
language stages the distinction between individual deviations from a
linguistic norm and scribal errors is difficult and often
subjectively biased. The correction of scribal errors in
transcriptions should thus be applied with care, and has to
be traceable to avoid wrong conclusions about the language of the
. A complementary distribution of different s-allographs exists for example in the Greek script (word final: <ς>, non-final: <σ>) or blackletter scripts like the German Fraktur (used until 1941, word final: <s>, non-final: <ſ>). In Iceland this distribution is common in late medieval manuscripts, but not in the earlier texts; in Þormóðsbók word-final <s> is written <ſ> in approximately eighty-three percent and <s> in approximately seventeen percent of the cases (in about half of these cases <s> stands for a geminate s), in non-final position the distribution is 95% for <ſ> and 5% for <s>. According to Mazal (1986, 11) <s> was introduced to Latin script in the eleventh century as part of the ligature -vs at the end of words and spread from there to other positions.
. This seems to be the usage proposed by the First Grammarian (i.e. the anonymous author of the Icelandic so-called First Grammatical Treatise from the mid-twelfth century) who lists the shape of the letters to be used for the short and long variants of consonants, for example <g> for /g/, <ɢ> for /g:/, <c> for /k/, <k> for /k:/ (Nordal 1931, 88). Unfortunately, the scribe of Codex Wormianus, the only extant manuscript of the treatise, jumped over the line containing the character to be used for /s:/ when copying the text, which was added above the line as <s>, presumably by a younger hand.
. The letter usually referred to as
(for example in Hreinn Benediktsson
1965, 25), was adapted from <ƿ> (Wynn) which was
used in Old English writing for the voiced labio-velar approximant
/w/. In Post-Classical Latin, <v> represented a voiced
labiodental fricative /v/ (Norberg
1968, 21), and <v> was thus no longer perceived as
an adequate representation of English /w/. In a shape
that resembles a capital Latin P it was originally part of the Elder
fuþark and the Anglo-Saxon
. It should be added that <v> and <u> are not clearly distinguishable in all cases: <v> does not have a clearly pointed but rather a round shape, <u> consists of two slightly waved downward strokes whereas the right stroke of <v> is bowed to the left. Thus <u> usually has a characteristic short stroke to the right on the base line, but one cannot exclude that the distribution in the transcription is slightly biased by modern orthography that uses <v> for the consonant /v/ and <u> for the vowel /u/ in cases where this stroke is not clearly discernible.
. The tendency for a distribution of the two
letters c and k in relation to their
position in the word seems to reflect a rule that is formulated in
the Second Grammatical Treatise, preserved
together with three other writings on Old Icelandic grammatical
matters in Codex Wormianus (AM 242 fol.) and under
the name Háttalykill also in Codex Upsaliensis (DG 11). Together with <ð>,
<z>, and <x>, <c> is classified as an
undirstafr [sub-letter], which can only be used
En fjórði stafr er c, ok hafa sumir menn þann
ritshátt, at setja hann fyrir k eða q; en hitt eina er rétt hans
hljóð, at vera sem aðrir undirstafir í enda samstöfu,
Raschellà 1982, 68). Hreinn
Benediktsson (1965, 79)
explains this rule as an attempt to reinterpret the use of two
graphemes for one phoneme after the reason for their distribution
was no longer transparent.
. In comparison to the First Grammarian's orthographic principles, which are radically phonologically based (he proposes for example using <c> for /k/, to sort out <q> and to use <k> equivalent to the small capitals of other consonants, i.e. instead of the geminate consonant), this approach may appear inconsistent from a strictly linguistic point of view. It should be mentioned, though, that the First Grammarian's rules are not even applied in the part of Codex Wormianus that contains the First Grammatical Treatise.
. Suitable style sheets are provided by different organisations. In the Njála-project we use mainly style sheets provided by Menota and TEI, but also style sheets that were originally developed by Kai Wörner (HZSK, Hamburg) for use with an Old Swedish corpus. Ulrike Henny (CCeH, Cologne) and, at a course organised by the IDE (Institut für Dokumentologie und Editorik), Martina Semlak (University of Graz) were of valuable help in adapting the style sheets to the tasks of our project and in designing new style sheets.
. Hallberg (1968, 38ff.) gives an overview of the use of what he calls omvänd ordföljd [inverted word order] in different manuscripts of nine Icelandic sagas. The figures include four family sagas (not Njáls saga, however) which partly exhibit quite distinct differences in the use of the feature.
sögu/The Variance of Njáls saga (principal
investigator: Dr Svanhildur Óskarsdóttir) is funded by
Íslands/The Icelandic Centre for Research (http://www.rannis.is/) (styrknúmer 110610021).
This article is based on a presentation with the title Axes, halberds or foils given at the COST-workshop Easy Tools for Difficult Texts: Manuscripts & Textual Tradition at the Huygens ING, Den Haag, Netherlands, 18-19 April 2013.
I would like to thank Alaric Hall (School of English, University of Leeds) and two anonymous reviewers for valuable suggestions for an improvement of this article, and Emily Lethbridge (Miðaldastofa, Háskóli Íslands) for useful comments on an earlier version.
First Grammatical Treatise. Nordal, Sigurður, ed. 1931. Codex Wormianus (The Younger Edda). MS. No. 242 in The Arnemagnean Collection in the University Library of Copenhagen (Corpus Codicum Islandicorum Medii Aevi, 2). Facsimile. Copenhagen: Levin & Munksgaard.
Raschellà, Fabrizio D., ed. 1982. The so-called Second Grammatical Treatise. An orthographic pattern of late thirteenth-century Icelandic (Filologia Germanica. Testi e studi, 2). Florence: Felice le Monnier.
About Juxta. 2015. Accessed May 10. http://www.juxtasoftware.org/about.
Benediktsson, Hreinn. 1965. Early Icelandic script. As illustrated in vernacular texts from the twelfth and thirteenth centuries (Íslenzk handrit. Series in folio, 2). Reykjavik: The Manuscript Institute of Iceland.
Christoffersen, Marit. 2002. Nordic language history and research on word order. In The Nordic languages. An international handbook of the history of the North Germanic languages. Volume 1 (Handbooks of linguistics and communication science, 22.1), ed. Oskar Bandle: 182-191. Berlin, New York: Walter de Gruyter.
Comrie, Bernard, Martin Haspelmath, and Balthasar Bickel. 2008. The Leipzig glossing rules. Conventions for interlinear morpheme-by-morpheme glosses. Max Planck Institute for Evolutionary Anthropology. Department of Linguistics. Accessed May 9, 2015. http://www.eva.mpg.de/lingua/pdf/LGR08.02.05.pdf.
Derolez, Albert. 2006. The Palaeography of Gothic manuscript books. From the twelfth to the early sixteenth century (Cambridge Studies in Palaeography and Codicology, 9) (1st edition 2003). Cambridge: Cambridge University Press.
Driscoll, Matthew James. 2006. Levels of transcription. In Electronic textual editing, eds. Lou Burnard, Katherine O’Brien O’Keeffe, and John Unsworth, 254-261. New York: The Modern Language Association of America.
Knirk, James. 1985. The role of the editor of a diplomatic edition. In The Sixth International Saga Conference. 28.7. - 2.8.1985. Workshop papers II. Copenhagen: Det arnamagnæanske Institut, Københavns Universitet.
Marqués-Aguado, Teresa. 2013. Editions of Middle English texts and linguistic research. Desiderata regarding palaeography and editorial practices. Variants. The Journal of the European Society for Textual Scholarship 10: 17-40.
Szokody, Oliver. 2002. Old Nordic types of texts I: Old Icelandic and Old Norwegian. In The Nordic languages. An international handbook of the history of the North Germanic languages. Volume 1 (Handbooks of Linguistics and Communication Science, 22.1), ed. Oskar Bandle. Berlin, New York: Walter de Gruyter, 981-989.
Zeevaert, Ludger. 2009. Deutscher Einfluss und syntaktischer Wandel im Schwedischen. In Deutsch im Norden. Akten der nordisch-germanistischen Tagung zu Åbo/Turku, Finnland, 18.-19. Mai 2007 (Nordeuropäische Beiträge, 28), eds Lars Wollin, Dagmar Neuendorff, and Michael Szurawitzki. Frankfurt am Main: Peter Lang, 279-306.
Zeevaert, Ludger. 2012. Low German influence and typological change in Swedish: Some results from a research project. In Contact between Low German and Scandinavian in the Late Middle Ages. 25 Years of Research (Acta Academiae Regiae Gustavi Adolphi, 121), eds Lennart Elmevik, and Ernst-Håkon Jahr. Uppsala: Kungl. Gustav Adolfs Akademien för svensk folkkultur, 171-190.
Zeevaert, Ludger. 2014. Mörkum Njálu! An annotated corpus to analyse and explain grammatical divergences between 14th-century manuscripts of Njáls saga. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 14), eds Nicoletta Calzolari, Khalid Choukri, Thierry Declerck et al.. Paris: ELRA, 981-987.