A TEI Customization for Paper and Watermarks Descriptions

Watermarks are key to retracing the origins of paper manuscripts and early printed books and to understanding the context in which they were produced. TEI-P5, one of the most commonly used XML standards for digital descriptions of text-bearing objects, offers the possibility to describe watermarks, but not yet in a sufficiently detailed and consistent manner. The present article introduces a TEI-P5 customization for the description of paper and watermarks based on the International Standard for the Description of Paper, Watermarks and Paper Moulds in Relational Databases (IPHN 2.1.1). This customization provides TEI users with tools to make detailed, structured and standardized descriptions of paper, watermarks, and paper moulds. Such descriptions have the potential to improve the communication and collaboration between the different scholars working on paper manuscripts and early printed books. Moreover, once organized into a database, they can be mined in order to determine the origin and circulation of paper types. Therefore, the present customization can represent a strong asset in the study of the origins and material contexts of paper documents, handwritten or printed.

usually the trademarks of papermakers, and can therefore be used by paper scholars to determine where a paper sheet was produced. Moreover, although this too should be approached cautiously, watermarks inform the dating of paper sheets because paper moulds have a limited lifespan (Harris 2017, 76-78). Some watermarks even explicitly state the date of production (Ibid., 52-53).

§2 The use of specific watermarks can be situated in time and space using quantitative evidence and historical records. Therefore, watermarks inform several aspects of the history and material context of paper manuscripts and early printed books. First of all, they can be used to better determine when and where paper documents were made. A foundational example of this is Allan Stevenson's study The Problem of the Missale Speciale (1967), where he was able to precisely date the Missale Speciale incunabulum using watermark evidence. Furthermore, watermarks shed light on the commercial routes involved in the production of paper documents, as is illustrated by recent projects such as Paper Trails (Paper Trails 2020). Last but not least, since paper brands vary in price, watermarks also indicate the financial investment represented by the production of a paper document, and thus its socioeconomic context (see for instance Busonero et al. 2001).

§3 Watermarks themselves should not be confused with watermark motifs. Indeed, the same motif could be used by several paper mills, sometimes with the intent of imitating paper of a higher quality (Hills 1988, 32). The use of a motif can even mislead us about the country in which a paper sheet was produced (Churchill 1935, 6-9). Therefore, identifying watermarks requires the gathering and analysis of a large amount of information besides the motif itself, on a large sample of paper sheets.

I.2. Watermarks in digital databases and catalogues
For this reason, digital databases have proven to be a strong asset for watermark studies. A fairly large number of online watermark databases exist. The search portal of the Bernstein project (Bernstein 2020) covers the majority of them. However, their interoperability is limited. Indeed, the different databases do not always record the same information, and do not always follow the same guidelines while recording it. Moreover, the same watermark motif can have several different names depending on who describes it, and in which language. The International Standard   (BVH 2020) or the Manuscripta digital catalogue of manuscripts in Sweden (Manuscripta 2020). A long, but non-exhaustive list of projects using TEI can be found on the TEI website (TEI 2020a). In addition, TEI's wiki includes a list of the manuscript catalogues that use it, but it is unfortunately not up to date (TEI wiki 2016).

§5 The possibility to record paper and watermark information in TEI following the IPHN standard presents five major advantages:
• TEI is open for all to use. Therefore, it allows a large number of professionals (codicologists, bibliographers, archivists and librarians, to cite only a few) to record compatible, standardized paper and watermark data.
• It makes digital catalogues made in TEI more adapted to the needs of paper scholars, and facilitates communication and collaboration between the different professionals who research paper manuscripts and early printed books.
• It helps spread the IPHN standard in digital catalogues and databases that include paper documents. Therefore, it represents a significant step towards the standardization of paper and watermark registration in the digital age.
• TEI has proven extremely useful for data mining in the field of humanities. This can be a strong asset for quantitative studies on watermarks.
• Last but not least, because TEI is highly stable and interoperable, it ensures that paper and watermark information that is recorded in this manner is perennial and widely available.
II. How to describe watermarks in TEI: Best practice and current state of the art
II.1. Making usable descriptions of watermarks: The basics
§6 This is not the place to expound on papermaking techniques and the methodology of paper scholars. However, a quick summary of the main aspects of medieval and early modern paper, and of what they entail for the identification of watermarks, is a necessary preamble to the description of the customization.
II.1.a. What's in a paper sheet
§7 In the second half of the thirteenth century, western papermakers started using moulds with a sieve made out of metallic wires (Harris 2017, 17). The areas of the paper sheet that were in contact with these wires in the mould are thinner, resulting in typical marks. The marks left by the vertical wires are called chain lines. The ones left by the horizontal wires, more densely arranged, are called laid lines. In addition, wire figures were sewn onto the moulds to create the patterns of watermarks, whose oldest extant examples date back to the late thirteenth century (Clemens and Graham 2007, 8; Harris 2017, 19). Medieval and early modern paper sheets are bifolia, and it is not uncommon to find not only a watermark, usually a symbolic or figurative motif, on one half of the bifolium, but also another pattern called a countermark, usually the initials of the papermaker, on the other half. Moreover, in order to optimize their workflow, papermakers would use two moulds bearing the same watermark motif simultaneously (Harris 2017, 18). Hence "watermarks are twins," to use the expression coined by Allan Stevenson in his eponymous 1952 article. Quantitative studies have shown that, in 97% of cases, both twins are present in a batch of ten paper sheets (Bozzolo and Ornato 1983, 135-145). It is therefore very likely that the description of watermarks within a paper document will have to include the location of both twins within the quire structure.

Müller: A TEI Customization for Paper and Watermarks Descriptions
Art. 1, page 5 of 24
II.1.b. Describing a paper sheet
§8 In order to identify watermarks, one must record at least the following information:
• The density of laid lines: IPHN recommends recording the number of laid lines over a segment of 20 mm (IPH 2013, 6).
• The distance between chain lines.
• The dimensions of the watermark.
• The distance between the watermark (and, if present, the countermark) and the nearest chain lines. Recording the position of the watermark relative to the chain lines is more efficient than measuring the distance between the watermark and the edges of the paper sheet, because sheets were often trimmed.
• The appearance of the watermark, using the standard typology of watermark motifs (IPH 2013, 21-84), and, if present, the appearance of the countermark. Using a standard typology instead of recording only a visual description prevents the confusion that could arise from terminological and linguistic differences.

§9 Ideally, this information should be recorded for both twin watermarks. Such a description allows us to make a reconstruction of the original paper mould. We know paper moulds mainly through the analysis of paper sheets because they were short-lived, and the few that have been preserved are no older than the late eighteenth century (Loeber 1982, 2; Harris 2017, 76-77).

§10 Once we know the main features of a paper mould, the next step is to connect it to a specific paper mill. This task is simplified by the tremendous work of watermark cataloguers such as C. M. Briquet (1907), W. A. Churchill (1935) and others. However, it is always possible to encounter an uncatalogued watermark. Moreover, little is known about paper mills in the early days of western papermaking. Quantitative evidence therefore proves especially useful for finding at least the approximate time and place in which a specific watermark was used.

§11 There is nevertheless an important caveat here. Because the wire figures that produced the watermarks were not part of the moulds per se, but sewn onto them, they could come loose during the papermaking process. They would then be sewn back onto the mould, in which case their position would change. Moreover, they could get damaged and undergo repairs that deformed them slightly. There are thus variants in the position and shape of the same watermark (Stieglecker 2009, 38). Therefore, the identification of watermarks relies on the combination of parameters that should be kept distinct: those that are specific to the mould (chain lines and laid lines) on one hand, and those that pertain to the watermark itself on the other. Moreover, the information that one gathers does not represent a permanent state of the watermark and the mould, but a specific stage of their shared existence.

§12 Finally, several other features of the paper and the mould can be recorded. We will see some of them while going over the details of the customization (points III.3.a and b). These features can strengthen watermark identification and provide insights into the paper's production method and quality. However, observing them may require hardware that is not always available to non-specialists, or be made impossible by the state of conservation of the paper sheet. By contrast, watermarks, chain lines and laid lines can be observed more easily, although, as we will see, this is not without certain difficulties.
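The minimal measurements listed in §8 can be illustrated with a short Python sketch. It is not part of the customization: the class and field names are hypothetical and the sample values are invented. Only the 20 mm segment for counting laid lines is taken from the IPHN recommendation cited above.

```python
from dataclasses import dataclass
from statistics import mean

IPHN_SEGMENT_MM = 20.0  # segment length recommended by IPHN for counting laid lines


@dataclass
class SheetMeasurements:
    """Hypothetical container for the minimal sheet data (all lengths in mm)."""
    laid_line_count: int          # laid lines counted over `segment_mm`
    segment_mm: float             # length of the segment that was actually measured
    chain_line_positions: list    # positions of successive chain lines on the sheet

    def laid_lines_per_20mm(self) -> float:
        # Normalise the count to a 20 mm segment, in case another length was measured.
        return self.laid_line_count * IPHN_SEGMENT_MM / self.segment_mm

    def chain_line_distances(self) -> list:
        # Distance between each pair of neighbouring chain lines.
        positions = sorted(self.chain_line_positions)
        return [b - a for a, b in zip(positions, positions[1:])]

    def mean_chain_line_distance(self) -> float:
        return mean(self.chain_line_distances())


# Invented sample values, for illustration only.
sheet = SheetMeasurements(laid_line_count=36, segment_mm=40.0,
                          chain_line_positions=[12.0, 50.0, 89.0, 127.0])
print(sheet.laid_lines_per_20mm())    # 18.0
print(sheet.chain_line_distances())   # [38.0, 39.0, 38.0]
```

Keeping the chain-line and laid-line values separate from any watermark measurement mirrors the distinction made in §11 between parameters of the mould and parameters of the watermark.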
II.1.c. Gathering paper and watermark data: Logistics and technical aspects
§13 There are several methods for observing watermarks, chain lines and laid lines. Some, such as the different types of radiography, are very precise, but rely on costly hardware that is not always available to those who research paper documents. Others, such as UV photography or rubbing, are very efficient but can be invasive. Transparency photography remains the most accessible method. Indeed, it can be performed with a fibre optic light sheet, which is part of the staple equipment of conservation workshops. Transparency photography should be used with caution for two reasons. First of all, the presence of text on the paper sheet can compromise the readability of the image. Second, a scale has to be present on the photograph to ensure the accuracy of the measurement. However, if such a scale is always included and if enough samples are collected, this method can already yield impressive results.

§14 The different problems posed by these methods should not discourage scholars from studying paper and watermarks. However, as IPHN states, one should record only those features that can be assessed with certainty (IPH 2013, 2). Therefore, systems for describing paper and watermarks digitally should allow users to provide only the information that they can obtain with certainty. For this reason, the TEI customization presented here allows for maximum flexibility. This ensures that only accurate data is recorded. Moreover, it allows users to choose the level of precision that they can and wish to achieve in the descriptions, leaving them in control of the time and budget they want to invest in the recording of watermark and paper data.
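The role of the scale in transparency photography (§13) comes down to simple arithmetic, sketched below in Python. The function names and sample values are invented for illustration. The sketch also assumes that the scale lies in the same plane as the sheet and that the image is not distorted, which a real workflow would need to verify.

```python
def mm_per_pixel(scale_length_mm: float, scale_length_px: float) -> float:
    """Millimetres represented by one pixel, derived from the photographed scale."""
    if scale_length_px <= 0:
        raise ValueError("the scale must span a positive number of pixels")
    return scale_length_mm / scale_length_px


def to_mm(length_px: float, scale_length_mm: float, scale_length_px: float) -> float:
    """Convert a length measured on the image (in pixels) to millimetres."""
    return length_px * mm_per_pixel(scale_length_mm, scale_length_px)


# Invented example: a 50 mm scale spans 400 px and the watermark is 280 px high.
height_mm = to_mm(280, scale_length_mm=50, scale_length_px=400)
print(height_mm)  # 35.0
```

Without a scale in the frame, `scale_length_px` is unknown and no measurement in millimetres can be derived, which is why §14's advice to record only what can be assessed with certainty applies here.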

II.2. Describing watermarks in TEI: State of the art
§15 The official TEI-P5 module 10 Manuscript Description (msdescription) is used for recording "detailed descriptive information about handwritten primary sources and other text-bearing objects" (TEI 2020b, 10.1). At the present stage, it provides basic tools for the description of paper and watermarks within the <support> element. <support> can contain the elements <material>, in which paper can be described, <watermark>, in which watermarks can be described, and the standard TEI elements for dimensions, references and chronological information. While these official elements may be used for a brief description of a watermark or paper type, they are not sufficient for descriptions of watermarks and paper that allow their identification. Moreover, they lack elements for the description of specific features of paper and watermarks that would be useful to paper scholars. Finally, they cannot accurately render the location of twin watermarks, or of watermarks and countermarks, in the quire structure of a document.

§16 There exist ways to add some of these features to the official msdescription module (Github 2020c, see also Github 2020b). These include:
• Using values of the attribute @type to express which specific features of paper are described: chain lines, laid lines, watermark motif, position of the watermark on the sheet etc.
• Creating a <countermark> element.

§17 These suggestions are not implemented yet in the official TEI-P5, but they would significantly improve the quality of paper and watermark descriptions in msdescription. They can indeed cover the needs of TEI users who wish to include succinct, but accurate information concerning paper and watermarks in descriptions of text-bearing objects.

§18 The customization presented here also uses msdescription as a basis for the description of paper and watermarks within individual documents. It is meant for slightly different usages:
• Detailed descriptions of paper sheets.
• Descriptions that closely follow the IPHN standard.
• Expressing the relation between twin watermarks.
• Representing paper moulds in TEI.
• Being able to connect several paper sheets to a mould description in TEI databases. This opens the possibility of identifying watermarks directly in TEI using quantitative methods.
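To make the starting point described in §15 concrete, the following Python sketch reads a description that uses only the official <support>, <material> and <watermark> elements. The sample fragment is invented. It shows that the official encoding yields a prose label for the watermark but no structured measurements.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample; only official TEI P5 elements are used.
SAMPLE = """
<supportDesc xmlns="http://www.tei-c.org/ns/1.0">
  <support>
    <material>Paper</material> with a
    <watermark>hand and star</watermark> watermark.
  </support>
</supportDesc>
"""

root = ET.fromstring(SAMPLE)
material = root.find(".//tei:support/tei:material", TEI_NS).text
watermarks = [w.text for w in root.findall(".//tei:support/tei:watermark", TEI_NS)]
print(material, watermarks)  # Paper ['hand and star']
```

Everything a paper scholar would need beyond the label (chain-line distance, laid-line density, position, twin) has to be expressed in free text at this stage, which is the gap the usages listed above are meant to address.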

III. The customization
III.1. Overview
§19 The present customization allows TEI users to express the location of twin watermarks and countermarks within the quire structure of paper documents, make standardized descriptions of paper and watermarks, and link these with standardized descriptions of paper moulds. It is flexible enough to allow users to record only the information that is absolutely necessary for the identification of watermarks or to make more detailed descriptions if they wish to.

§20 It consists of two custom modules:
• WatermarkDesc, which allows TEI users to describe paper and watermarks following the IPHN guidelines within descriptions of the support of manuscripts and early printed books.
• PaperMoldDesc, which allows TEI users to make descriptions of paper moulds including historical information about the paper mills that used them and the papermakers who worked there. As explained in point II.1.b, it is uncommon to find the original paper moulds, and they are usually reconstructed using the features of the paper sheets. Therefore, PaperMoldDesc is intended primarily for such reconstructions.
Files made with WatermarkDesc can be linked with files made with PaperMoldDesc. This way, several descriptions of paper sheets in individual documents can be linked to the same mould file.

III.2. Overall structure and relation to official TEI-P5 and IPHN
III.2.a. Relation to official TEI-P5 modules
§21 The two custom modules relate to the official TEI-P5 in the following manner:
• WatermarkDesc is based on the official msdescription module, which is intended for the physical description of all text-bearing objects. Using msdescription not only enables the inclusion of the custom paper and watermark descriptions in standard TEI descriptions of paper documents, but also guarantees that the basic information on the paper item that is required by the IPHN standard, such as database entry number (3.0.0), repository (3.0.2), shelfmark (3.0.3), and date of entry or updating of the description (3.0.1), is de facto included using official TEI elements.
• PaperMoldDesc is based on the official TEI module 13 Names, Dates, People, and Places (namesdates), which is intended for persons, places and organizations (TEI 2020b, 13). It combines custom elements for describing the materiality of paper moulds based on the IPHN guidelines with standard TEI elements for encoding biographic and historical information. In this manner, users can provide information about the materiality of paper moulds alongside historical information about papermakers and paper mills.
• The official TEI module linking is used to link files made with WatermarkDesc with files made with PaperMoldDesc.
III.2.b. Relation to the IPHN standard
§22 At the present stage, the customization covers all of the parameters listed in IPHN that are relevant for western paper without decoration. It also allows users to structure a potential database in a manner that conforms as much as possible to the recommendations of IPHN in this matter (see Figure 1).

§23 However, this recommended structure does not perfectly translate to TEI. Indeed, to connect its different parts, there are two possibilities in TEI-conformant XML: linking and nesting. If one closely followed the IPHN recommendation and made separate files for each and every part of this structure, one would use only linking. This would deprive the user of the possibility to record and consult paper and watermark information directly in the description of a document's support. An extension of this kind would therefore have been of little interest for TEI, whose main purpose is the representation of texts and text-bearing objects themselves. Using only nesting, on the other hand, would prevent users from connecting paper sheets found in different documents to the same paper mould. This too would have made the customization purposeless. Therefore, the customization uses both nesting and linking (see Figure 2).

§24 This has the advantage of providing a simple structure that is still consistent with IPHN. If one wishes to have a separate file for each and every part of the structure, this is still possible if one completes the customization with the official msdescription and namesdates modules (see Figure 3).

§25 Finally, two parts of the structure recommended by IPHN are absent from the customization at the moment: the mould file and the marbled paper file. IPHN indeed distinguishes between mould parameter (i.e. the characteristics of a mould that can be deduced from a sheet) and mould (i.e. the mould as a physical object).  based on paper sheets, and thus for what is called mould parameter in IPHN. If users request it, the customization can expand to include a section devoted to extant moulds. As for marbled paper, it has been mentioned above that the customization cannot be used for decorated paper yet. In this case too, the customization could cover this aspect in the future if requested. These potential developments of the customization are discussed below in point IV.2.
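The combination of nesting and linking described in §23 can be sketched as follows. This is an illustration under stated assumptions, not the customization's actual markup: the element names <paperDesc>, <moldRef>, <moldDesc>, <mold> and <moldID> are taken from the module descriptions in this article, whereas their arrangement, the `target` attribute on <moldRef>, the file names and the omission of the TEI namespace are simplifications made for the example.

```python
import xml.etree.ElementTree as ET

# Three hypothetical "files", held in memory. Sheet descriptions are nested in
# the document files; the mould description lives in a file of its own.
# The @target attribute on <moldRef> is an assumption of this sketch.
FILES = {
    "ms-a.xml": '<paperDesc><moldRef target="mould-01.xml">M01</moldRef></paperDesc>',
    "ms-b.xml": '<paperDesc><moldRef target="mould-01.xml">M01</moldRef></paperDesc>',
    "mould-01.xml": "<moldDesc><mold><moldID>M01</moldID></mold></moldDesc>",
}


def sheets_by_mould(files):
    """Map each linked mould file to the documents whose sheets point to it."""
    index = {}
    for name, xml_text in files.items():
        for ref in ET.fromstring(xml_text).iter("moldRef"):
            index.setdefault(ref.get("target"), []).append(name)
    return index


print(sheets_by_mould(FILES))  # {'mould-01.xml': ['ms-a.xml', 'ms-b.xml']}
```

With nesting alone, the two document files could not share one mould description; with linking alone, the sheet data would not sit in the description of each document's support.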

III.3. The two custom modules
III.3.a. WatermarkDesc
§26 WatermarkDesc allows users to nest descriptions of paper sheets with or without watermarks within the official <physDesc> or <support> elements, and to nest detailed descriptions of watermarks within the official <watermark> element. It also provides the possibility to express the relation between watermarks and countermarks, and between two twin watermarks, in the quire structure of documents (for the structure of WatermarkDesc, see Figure 4).

§27 This way, files made with this custom module can contain as many paper sheet or watermark descriptions as desired. Moreover, descriptions of paper sheets can be used for watermark identification, but also for recording paper information for its own sake.

§28 WatermarkDesc includes a set of custom elements that correspond to the parameters listed in IPHN for paper and watermark descriptions. Since presenting these in fine detail would make this paper lengthy and tedious to read, readers who are interested in a more technical description of each of them are invited to consult the HTML documentation of WatermarkDesc (see Github 2020a). For the purpose of the present paper, we will simply go over the main features and structure of this module.

§29 WatermarkDesc includes two main custom elements, <paperDesc> for the description of paper sheets, and <WMDesc> for the description of watermarks. <paperDesc> can be used to describe paper sheets with or without watermarks. It contains:
• The official element <locusGrp>, which allows users to express the location of one or several sheets within the document. <locusGrp> is required so that this location is always provided.
• Two required custom elements: <paperDataMethod> (corresponding to IPHN 4), which is used to specify the method used to collect sheet data and can include an image of the sheet (<sheetReprod>); and <papState> (IPHN 3.0.9), which indicates the state of conservation of the sheet.
• Optional custom elements for all of the physical characteristics of the sheet listed in IPHN, including the distance between chain lines (<cld>) and the density of laid lines (<lld>). These two elements are optional in <paperDesc> because they pertain first and foremost to the description of the mould (see point II.1.b).
• The optional custom element <moldRef>, which contains the identifier of the mould and the link to the mould description file.

• <WMDesc>, which is optional so that sheets that do not have a watermark can also be described.

§30 In an XML document using WatermarkDesc, <paperDesc> would look like this
• <locusGrp>, which is required. This may seem redundant, but it ensures that, when <WMDesc> is nested in <watermark>, the location of the watermark in the document is also systematically present.
• The required custom element <moti> (IPHN 3.1.3), which contains the description of the watermark motif and its IPHN reference.
• Required custom elements for the textual description of the watermark, its dimensions, and its position relative to the chain lines.
• Optional custom elements for including additional information and measurements that are not required by IPHN, as well as images of the watermark (<WMreprod>).
• <moldRef>, which is also optional here.

§32 <WMDesc> bears the following attributes:
• @twin, whose value can be either 1 or 2. @twin is optional so that users are not forced to identify both twins and so that the watermarks in paper documents that are too small to contain both twins can also be described.
• The required attribute @kind (IPHN 3.1.0), whose value expresses whether it is a main watermark (m), a countermark (c), a decorative border (b) or corner (ncrn) watermark, whether it consists only of dividing lines (dv), or whether the watermark is absent (n), which is useful in the case of bifolia that have a watermark, but no countermark.
• Optional attributes for the watermark's position on the paper sheet (@posp, IPHN 3.1.2), its structure (@stru, IPHN 3.1.1) and the technique through which it was applied to the paper (@wmod, IPHN 3.0.32).

§33 Here is an example of what <WMDesc> would look like in a document, including only the required children elements and attributes:
• Information about the physical features of the moulds that can be deduced from the sheets themselves.
• Historical and geographical information about paper mills.
• Prosopographical information about papermakers.

§35 Figure 5 shows the structure of this module. Its custom elements are used in the following manner:
• <moldDesc> contains the whole description of the mould.
• <paperSheets> contains one or more links to the descriptions of the sheets that were analysed to reconstruct the mould.
• <mold> contains information about the material features of the mould. Its required children elements are <moldID> (identifier, IPHN 3.2.7), <cld> (distance between chain lines) and <lld> (laid lines density). Its optional children elements are <moldPair> (IPHN 3.2.8), which contains a link to the "twin" mould, and <wire> (IPHN 3.2.6), which contains a description of specific features of the mould's wire such as defects or special traces. <mold>'s required attributes are @fabr (fabrication, i.e. whether the mould was used for hand-made or machine-made paper), which corresponds to IPHN 3.2.1, and @paperType (the type of paper for which it was used), which corresponds to IPHN 3.2.2. In addition, it has the optional attribute @shad for shadow zones (IPHN 3.2.5), i.e. denser areas in the paper created by the accumulation of pulp below the "ribs" (pieces of wood that were placed perpendicular to the laid wire to keep it straight) of the mould (see James and Cohn 1997, 43-4).
• <paperMill> (IPHN 3.4) contains all the historical and geographic information concerning the paper mill in which the mould was used. <paperMill> is optional because, unlike the material features of paper moulds, which can be reconstructed from the paper sheets, historical information regarding paper mills depends on primary sources that may not always be available. <paperMill> can contain the official TEI elements provided in namesdates for geographic and chronological information, links, and the custom elements <paperMillID> (identifier, IPHN 3.4.4) and <paperMaker>.
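The required children and attributes named in this section can be summarised in a small Python check. This is only an illustration: a schema generated from the customization would presumably enforce such constraints itself. The table below lists only the names given explicitly in the text (required elements that the article mentions without naming are left out), and the sample fragment, its values and the absence of a namespace are invented for the example.

```python
import xml.etree.ElementTree as ET

# Required children and attributes as stated in the article. Elements that are
# described as required but not named in the text are deliberately omitted.
REQUIRED = {
    "paperDesc": ({"locusGrp", "paperDataMethod", "papState"}, set()),
    "WMDesc": ({"locusGrp", "moti"}, {"kind"}),
    "mold": ({"moldID", "cld", "lld"}, {"fabr", "paperType"}),
}


def missing_parts(element):
    """Return the sorted names of required children/attributes absent from `element`."""
    children, attributes = REQUIRED[element.tag]
    present = {child.tag for child in element}
    missing = (children - present) | {"@" + a for a in attributes - set(element.attrib)}
    return sorted(missing)


# Invented, deliberately incomplete fragment with placeholder values.
mold = ET.fromstring('<mold fabr="x"><moldID>M01</moldID><cld>38</cld></mold>')
print(missing_parts(mold))  # ['@paperType', 'lld']
```

Because <cld> and <lld> are required in <mold> but optional in <paperDesc>, a sheet description can stay minimal while a mould description always carries the parameters needed for identification.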

IV. Conclusion
IV.1. End uses of the customization
§38 The customization can be used to create databases of paper and paper moulds, or simply to make more precise descriptions of the paper that can be found in a specific manuscript or printed book. Therefore, it can fulfill three main purposes:
• The rendering in TEI of information about the origin of specific papers and watermarks that was established through previous research. In this case, either or both of the two custom modules are used to relay information that can be found in existing scholarship.
• The creation of a database of papers and watermarks that is then mined in order to reconstruct paper moulds. This use of the customization is especially interesting since it makes it possible to carry out quantitative studies on paper and watermarks directly in TEI. One does not need to record all of the IPHN parameters in order to do this. The essential information (distance between chain lines, laid lines density, motif and dimensions of the watermark and, if present, countermark) is sufficient as long as the number of paper sheets that are described is statistically significant. When the file made with WatermarkDesc includes the custom element <WMDesc> and is coupled with a file made with PaperMoldDesc, this information is necessarily included.
• The creation of detailed descriptions of paper sheets, without the intent of identifying watermarks or reconstructing moulds. In this case, only WatermarkDesc is used. Making such descriptions for their own sake is relevant for two reasons. First, the data they contain is worth recording as it may be used for future quantitative research. Second, having such information within a database or catalogue made in TEI is useful to paper scholars, who can thus navigate it better and see which items present features that are interesting for their work.

§39 Recording paper and watermark data is time-consuming and can become costly if one wishes to use specialized equipment. For this reason, the customization is made so that users can choose how precisely they want to describe paper sheets and whether they want to describe paper moulds or not. However, the customization requires them to enter the information that is strictly necessary for watermark identification if they use both modules and the element <WMDesc> in WatermarkDesc.
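The second purpose listed in §38, mining sheet descriptions to reconstruct moulds, can be illustrated with a deliberately naive Python sketch. It is not the author's method and not a statistical procedure: the `Sheet` class, the tolerance values and the sample data are all invented, and the grouping rule is a simple stand-in for the quantitative analysis that a real study would require.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sheet:
    ident: str
    motif: str   # IPHN motif reference (placeholder values below)
    cld: float   # distance between chain lines, in mm
    lld: float   # laid lines per 20 mm


def similar(a, b, cld_tol=1.0, lld_tol=1.0):
    """True if two sheets could come from the same mould (tolerances are arbitrary)."""
    return (a.motif == b.motif
            and abs(a.cld - b.cld) <= cld_tol
            and abs(a.lld - b.lld) <= lld_tol)


def group_candidates(sheets):
    """Greedy grouping: a sheet joins the first group whose first member it resembles."""
    groups = []
    for sheet in sheets:
        for group in groups:
            if similar(sheet, group[0]):
                group.append(sheet)
                break
        else:
            groups.append([sheet])
    return groups


# Invented sample data.
sheets = [
    Sheet("a", "motif-1", 38.0, 18.0),
    Sheet("b", "motif-1", 38.5, 18.0),
    Sheet("c", "motif-1", 42.0, 16.0),
    Sheet("d", "motif-2", 38.0, 18.0),
]
print([[s.ident for s in g] for g in group_candidates(sheets)])
# [['a', 'b'], ['c'], ['d']]
```

The sketch uses only the essential parameters named above, which is consistent with the point that not all IPHN parameters need to be recorded for this kind of study.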

IV.2. Further developments
§40 As mentioned in point III.2.b, the customization is currently suitable for the description of western paper without decoration. Several further developments could be implemented in order to make it fit for other uses.

§41 First of all, it would be extremely beneficial to extend the customization to oriental paper. IPHN provides parameters for this, and they could be included in the customization within WatermarkDesc. Secondly, although WatermarkDesc includes a <pattern> element in <paperDesc> for succinct descriptions of decorative patterns, it does not suffice for proper standardized descriptions of decorated paper. IPH plans to publish a standard for the description of decorated paper (IPH 2013, 3), which could be included in the customization with an additional module. Thirdly, as mentioned in point III.2.b, the customization is not yet fit for descriptions of extant moulds. This too could be an interesting addition. The most logical choice would be to nest descriptions of extant moulds in the official TEI element <objectDesc>, which should then be added to PaperMoldDesc. The possibilities offered by <objectDesc> for describing non-text-bearing objects in TEI are still being explored and are often discussed among TEI users (see for instance Nelson 2016/17). If users express interest in seeing the customization cover one or more of these three aspects (oriental paper, decorated paper and extant moulds), they will be implemented.

IV.3. Where to find it
§42 The latest versions of the two custom modules can be found on GitHub, alongside their documentation (Github 2020a). The author is currently in search of beta testers who could use such a customization in their own research or cataloguing work and would be willing to give feedback about it. Such feedback would allow the author to further tailor the customization to the needs of those who wish to better describe paper and watermarks in TEI. The customization is licensed under the GNU General Public License (version 3.0). It is therefore free to use, and will always remain so.