The Cantus Database: Mining for Medieval Chant Traditions


The Cantus database is a well-established project devoted to the creation and distribution of electronic indices of manuscript and early printed sources of Latin chant for the liturgical Office. As of January 2011, there were over 379,000 records in the database, each of which is an individual chant in one of the 134 manuscripts which have been indexed to date. For over a decade, this research tool has been growing and adapting to the needs of chant scholars, musicologists, hagiographers, art historians and researchers in other fields. In addition to the basic search functions and downloading options, there are now several analytical tools available on the website, including a textual concordance and an interactive dendrogram-creation tool. The latter, an example of data-mining, allows the user to select a series of chants which will form the basis of a comparison among the numerous manuscripts whose contents are recorded in Cantus. Similarities in chant series can be interpreted as affinities among manuscripts, and so, the dendrograms which are created (through the calculations of similarity matrices) can assist researchers in identifying related chant repertories, in studying the origins and dissemination of saints' feasts, in providing evidence for the provenance of manuscript sources and, undoubtedly, for numerous other research applications.


Cantus, Database, Antiphoner, Antiphonal, Office, Gregorian, Chant, Latin, Research Tools, Modes, Medieval Studies, Inventory, Index, Dendrogram, Volpiano Font

How to Cite

Lacoste, D., 2012. The Cantus Database: Mining for Medieval Chant Traditions. Digital Medievalist, 7. DOI:








§ 1 Cantus: A database for Latin ecclesiastical chant is a well-established online project devoted to the creation and distribution of digital indices of medieval chant manuscripts and early printed books for the liturgical Office. For almost two decades, this freely accessible research tool has been growing and adapting to the needs of scholars in a variety of fields, such as musicology (ecclesiastical chant and the sacred polyphony of the Middle Ages and Renaissance), liturgical drama, hagiography, palaeography, philology, art history, ecclesiastical history and monasticism. By providing a searchable database of detailed information for the over 379,000 chants entered to date,[1] Cantus is also a useful digital archive for librarians, archivists, amateur chant enthusiasts, auction houses (where medieval manuscripts and individual folia are sometimes sold), as well as performers of this early music, including church musicians, directors of liturgy and members of monastic communities.

§ 2 The most popular features on the website continue to be the search and download functions, and it is mainly for these aspects of Cantus that the website has received an average of approximately 15,000 visits per month over the last few years from users all around the world. In addition to these basic functions, Cantus has begun to offer several online, interactive analytical tools which utilize the data in a variety of ways. The current offerings include a textual concordance, programmes which compare series of chants in order to identify regional or widespread traditions, and a dendrogram-creation programme which provides a visual display of the degree of similarity or difference among medieval sources of chant. More analysis programmes are being proposed. These applications of the data housed in Cantus demonstrate the research potential of this relatively large mass of information and illustrate the flexibility and usefulness of indices of chant manuscripts in a digital medium.

A brief history of Cantus

§ 3 Cantus was developed in the late 1980s by Ruth Steiner at the Catholic University of America. The first files were created on a mainframe computer and distributed in the post on floppy diskettes. By the mid-1990s, the database had been posted to the Internet first with a Gopher protocol and then, eventually, to the World Wide Web where it has remained with open access for all interested users. From 1997 until 2010, the base of operations was at the University of Western Ontario (UWO) under the leadership of Terence Bailey; during these years, there was tremendous growth in the database and it became firmly established as an effective and reliable research tool. On 1 December 2010, following the retirement of Bailey, Debra Lacoste entered into a collaboration with MARGOT at the University of Waterloo, Ontario and Cantus became one of the partners in their cluster of medieval, online, digital humanities projects. [2]

The database

§ 4 In the Cantus database, each record is an individual chant in a manuscript. Each record contains such information as the folio number on which the chant is found, the liturgical occasion or feast day (that is, the day of the liturgical year, such as Christmas, St. Benedict’s Day, or the Wednesday following the eighteenth Sunday after Pentecost), the first few words of the chant in the incipit field, the melodic mode to which the chant belongs (that is, one of the eight medieval church modes), the liturgical Office for which the chant was intended to be sung (such as Matins, Lauds or Vespers), the genre of the chant (for example, hymn, responsory or antiphon), as well as supplementary fields which contain additional information.

§ 5 The database was created to assist scholars who work with medieval chant manuscripts. A formidable challenge in the study of the medieval Office is the very large number of surviving sources and the variability in the arrangement of their contents. Each hand-copied manuscript, which may contain thousands of chants, is unique and testifies to the tradition of a specific time and place. Although the liturgy in the various antiphoners and breviaries is often similar from one book to another, the ordering, selection and placement of specific chants can differ substantially. Scholars regularly use the data provided free of charge in Cantus to locate particular chants on which they are working and to navigate through microfilms or digital image libraries.

§ 6 Although the original purpose of the database was the creation of tables of contents for medieval Office books, many researchers have begun to employ the data in Cantus in creative ways. What follows is a description of the known ways that Cantus data has been manipulated, augmented, and programmed into applications in order to further research into the long tradition of medieval ecclesiastical chant. Chant-researching pioneers in the field of digital humanities will no doubt expand on this listing of methodologies in the coming years.

The creation of tonaries

§ 7 One of the first applications of Cantus data beyond its usefulness in locating individual chants on particular folios was the creation of tonaries. A tonary is a listing by mode of the antiphons which were sung in medieval worship. A tonary was often copied as part of a medieval service book, and the church cantors could refer to these lists when preparing their psalm tone recitations. However, not all medieval service books were copied with a tonary, and some tonaries have been separated from their service books. Furthermore, we do not know if existing tonaries are complete or 100% accurate without first comparing their lists of chants with the actual contents of related manuscripts. For purposes of comparison and study, a tonary can easily be created from the Cantus index of a manuscript with a simple database query: this involves merely sorting the antiphons by their modes and differentiae.[3]

Figure 1: Tonary query
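The kind of query described above can be sketched in a few lines. This is only an illustration under stated assumptions: the table name `chant`, its column names, the genre code "A" for antiphons and the sample rows are all hypothetical placeholders, not the actual Cantus schema or data.

```python
import sqlite3

# Hypothetical, simplified table; the real Cantus schema is not described here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chant "
    "(folio TEXT, incipit TEXT, genre TEXT, mode INTEGER, differentia TEXT)"
)
rows = [
    ("001r", "Incipit C", "A", 8, "G1"),  # placeholder records, not real data
    ("001v", "Incipit A", "A", 1, "D2"),
    ("002r", "Incipit B", "A", 1, "D1"),
    ("002v", "Incipit D", "R", 1, ""),    # a responsory, left out of the tonary
]
conn.executemany("INSERT INTO chant VALUES (?, ?, ?, ?, ?)", rows)


def tonary(connection):
    """Return the antiphons ordered by mode, then differentia, then folio."""
    query = (
        "SELECT mode, differentia, incipit, folio FROM chant "
        "WHERE genre = 'A' ORDER BY mode, differentia, folio"
    )
    return connection.execute(query).fetchall()


for entry in tonary(conn):
    print(entry)
```

Sorting first by mode and then by differentia groups together the antiphons that share a psalm-tone ending, which is the arrangement a medieval tonary presents.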

Modal and melodic analyses

§ 8 Many chant scholars have an interest in the relationships between chants in different melodic modes.[4] Some chant texts exist with multiple melodies and some chant melodies can be interpreted and reinterpreted in different modes based on various characteristics, such as their opening melodic gestures and their final cadences. The numerical assignment of mode numbers 1 to 8 in Cantus indices is a great benefit in this type of research; searching and sorting of many thousands of records can be accomplished in mere minutes.
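As a minimal sketch of such a search, the following groups records by chant text and reports the texts that have been assigned to more than one mode. The record layout (text, mode number) and the sample values are assumptions made for illustration; they are not drawn from Cantus.

```python
from collections import defaultdict

# Hypothetical (text, mode) pairs standing in for records from several indices.
records = [
    ("Text one", 1),
    ("Text one", 8),
    ("Text two", 4),
    ("Text two", 4),
]


def texts_with_multiple_modes(pairs):
    """Map each chant text to its modes, keeping only texts with more than one."""
    modes_by_text = defaultdict(set)
    for text, mode in pairs:
        modes_by_text[text].add(mode)
    return {
        text: sorted(modes)
        for text, modes in modes_by_text.items()
        if len(modes) > 1
    }


print(texts_with_multiple_modes(records))
```

A text that appears in this result is a candidate for closer study: it may carry different melodies in different sources, or a single melody that indexers have classified differently.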

Melodic incipits in Volpiano font

§ 9 Also aiding the study of chant melodies is one of the more recent developments in Cantus: the inclusion in some indices of the melodic incipits or the complete melodies of the chants in a form of letter notation which presents as a series of letters, numerals and dashes in a data-string, and as round note-heads on a five-line musical staff when the font Volpiano is applied. Volpiano font is named for the early-eleventh-century theorist William of Volpiano (Guillaume de Dijon, cf. Bent et al. 2009), who is credited with the letter notation used in the manuscript Montpellier H. 159.[5] Each letter in the font corresponds to a pitch in William's a-p series of alphabetic notation. For example, the application of Volpiano font to the data-string 1---c--d--e--f---4 results in the presentation of corresponding pitches on a modern staff with a treble clef (see Figure 2).

Figure 2: Volpiano font applied to a data-string.

§ 10 Chant melodies encoded in Volpiano font are searchable and sortable data-strings in Cantus records, as shown in Figure 3.

Figure 3: Cantus data table showing the “Volpiano” field with letter notation (in the ninth column).
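Because the melodies are stored as plain data-strings, a search by melodic incipit reduces to string matching. The sketch below assumes that the letters in a data-string are pitches and that numerals and dashes are non-pitch symbols (clef, bar lines and spacing); it ignores any other special characters the font may define. The function names and the second and third sample strings are invented for illustration.

```python
def pitch_letters(volpiano):
    """Keep only the alphabetic characters (assumed to be pitches)."""
    return "".join(ch for ch in volpiano if ch.isalpha())


def find_by_melodic_incipit(records, pattern):
    """Return the names of chants whose pitch sequence starts with the pattern."""
    wanted = pitch_letters(pattern)
    return [
        name
        for name, data_string in records
        if pitch_letters(data_string).startswith(wanted)
    ]


records = [
    ("chant A", "1---c--d--e--f---4"),  # the data-string quoted in the text
    ("chant B", "1---c--d--f--e---4"),  # invented
    ("chant C", "1---d--e--f---4"),     # invented
]

print(pitch_letters("1---c--d--e--f---4"))
print(find_by_melodic_incipit(records, "c-d"))
```

Stripping the spacing before comparison means that two indexers who grouped the same notes differently would still produce matching results.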

§ 11 The data-strings in a Cantus record display on the Details page of the UWO website as melodies in modern notation if the host computer has Volpiano font installed (see Figure 4).[6]

Figure 4: Website "Details" page showing all the recorded data for one chant, including the melodic incipit displayed in Volpiano font.

Textual concordance

§ 12 Another use of Cantus data is in the textual concordance.[7] The user can enter into the search box one or more words, such as Ecce nomen or Ave Maria and see a listing of occurrences within the database. The user can then view the context of the search words within any of the manuscripts in which those words occur, that is, the placement of those words among neighbouring chants on the folio side or page. More advanced searches are also available, as detailed in the HELP tab.
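The idea of a concordance with context can be sketched as follows. This is not the code behind the website tool; the function name, the `window` parameter, the folio numbers and the ordering of the sample chants are all assumptions made for the example.

```python
def concordance(chants, phrase, window=1):
    """Find chants containing the phrase and return each with its neighbours.

    `chants` is a list of (folio, text) pairs in manuscript order.
    Each hit is (folio, text, chants_before, chants_after).
    """
    needle = phrase.lower()
    hits = []
    for i, (folio, text) in enumerate(chants):
        if needle in text.lower():
            before = chants[max(0, i - window):i]
            after = chants[i + 1:i + 1 + window]
            hits.append((folio, text, before, after))
    return hits


# Invented sample: the folios and the order of the chants are placeholders.
chants = [
    ("001r", "Ave Maria gratia plena"),
    ("001r", "Ecce nomen domini"),
    ("001v", "Ave Maria gratia plena dominus tecum"),
]

for folio, text, before, after in concordance(chants, "Ave Maria"):
    print(folio, text, before, after)
```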

Responsory series comparative tool

§ 13 Previous studies have successfully shown that a similarity in the usage and ordering of particular items of the liturgy can be interpreted as an indication of affinity among sources (Hesbert 1963-79); the more the manuscripts resemble one another with respect to the chants they contain and the order in which those chants occur, the more likely there is to be a common tradition linking them together. One could presume that the data housed in Cantus is an ideal resource for such comparisons. A featured programme on the UWO Cantus website is the interactive database Responsory Series: Advent and Lent, an application that can assist researchers in identifying the degrees of similarity between over 900 sources of medieval western chant through comparison of the usage and ordering of responsory chants.[8] The user can select any one of the entered series of responsoria prolixa (the Great Responsories)[9] for the four Sundays of Advent or the six Sundays of Lent and use that series of chants as the basis for a comparison with the other records entered for that particular Sunday. There are three methods of comparison available, that is, three methods of mathematical calculation,[10] and the results are listed with the closest affiliations at the top (see Figure 5).

Figure 5: Responsory series sample results for a comparison involving series for the first Sunday in Lent (L1), with "Klo3" (i.e., Klosterneuburg, Augustiner-Chorherrenstift - Bibliothek, 1017) as the head of the series, the one against which all others are compared.

§ 14 One can see with only a few clicks of the mouse which chant traditions are similar to the source that is the head of the comparison; with a few more clicks, the user can change the head-series to either another Sunday for the same manuscript or to a different source altogether, or select a different method of comparison, and a new set of results will appear.
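Two of the three methods named in note 10 have well-known textbook forms, sketched below: edit distance (here the Levenshtein distance) and the longest common subsequence. These are standard definitions and may differ in detail from the calculations the website tool performs; the Matches/Pairs method is not sketched because its definition is not given here. The source labels and chant identifiers are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two series of chant identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def longest_common_subsequence(a, b):
    """Length of the longest run of chants found in both series in the same order."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_against_head(head, others):
    """List (distance, source) pairs, closest affiliation first."""
    return sorted((edit_distance(head, series), name) for name, series in others.items())


head = ["r1", "r2", "r3", "r4"]          # the head series (invented identifiers)
others = {
    "X": ["r1", "r2", "r3", "r4"],       # identical to the head
    "Y": ["r1", "r3", "r2", "r4"],       # same chants, two of them swapped
    "Z": ["r5", "r6", "r7", "r8"],       # no chants in common
}

print(rank_against_head(head, others))
print(longest_common_subsequence(head, others["Y"]))
```

Changing the head series, as described above, amounts to calling `rank_against_head` with a different first argument.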

Cantus series comparative tool

§ 15 Establishing relationships between manuscript sources can lead to new hypotheses regarding the transmission of chant, the development or retention of local customs and numerous other topics. Expanding on the Responsory Series tool, the Cantus Series programme extracts and compares chant series of all types from the Cantus database. The data can be manipulated according to the user's preferences; researchers can select for comparison any series of chants for any liturgical occasion. Results are displayed in long lists, similar to the format for the series comparisons of responsory chants.


§ 16 Since it is difficult to know how to interpret and utilize lengthy lists of numbers, the comparative calculations from both the Responsory Series and Cantus Series programmes can be represented in the visual format of the dendrogram, a model adopted from cluster analysis techniques used in the biological sciences.[11] The dendrogram website tool[12] is an interactive online programme that allows the user to demonstrate the relationships between manuscripts in a branching diagram. These dendrograms, which are created through the calculations of similarity matrices, can assist researchers in identifying related chant repertories, in studying the origins and dissemination of saints' feasts, in providing evidence for the provenance of manuscript sources and, undoubtedly, for numerous other research applications. For example, the relationship of the five series of responsories for the fourth Sunday of Advent shown in Figure 6 can be represented in a similarity matrix as shown in Figure 7.

Figure 6: Five different series of responsory chants for the fourth Sunday of Advent (A4).

Figure 7: The matrix representing the calculations of similarity and difference among the five series of responsory chants in Figure 6.

§ 17 Notice the diagonal of zeroes showing self-similarity in the matrix, much as a distance table on a road map shows the number of kilometres or miles between cities. The calculations for the similarity matrix can be transferred into a clearer visual representation through the use of the dendrogram shown in Figure 8.
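A matrix of this kind can be built by comparing every series with every other. The sketch below uses the textbook edit distance as the measure, which is an assumption: the website tool offers three methods and its exact calculations are described elsewhere. The source labels and chant identifiers are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance between two series of chant identifiers."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def distance_matrix(series):
    """Return the sorted source names and a square matrix of pairwise distances."""
    names = sorted(series)
    matrix = [[edit_distance(series[p], series[q]) for q in names] for p in names]
    return names, matrix


series = {
    "S1": [1, 2, 3, 4],   # invented chant identifiers
    "S2": [1, 2, 4, 3],
    "S3": [5, 6, 7, 8],
}

names, matrix = distance_matrix(series)
for name, row in zip(names, matrix):
    print(name, row)
```

Each series is at distance zero from itself, which produces the diagonal of zeroes, and the matrix is symmetric because the distance from one series to another is the same in both directions.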

Figure 8: Dendrogram showing the degrees of similarity between the five responsory series in Figures 6 and 7.

§ 18 The liturgical occasion (that is, the Sunday) is listed in the first of the columns on the right-hand side; for this example, the chant series involved in this comparison are taken from the fourth Sunday of Advent (A4). The manuscript sigla follow the cursus (monastic or secular[13]), and are followed in turn by an indication of the dates of the sources and a brief word concerning their provenance, the latter being abbreviated to twelve characters owing to space restrictions. Interpreting the dendrogram involves observing the positions of the vertical lines; the closer the vertical connecting lines are to the manuscript sigla (i.e., further to the right side of the page or screen), the more similar the series are. This dendrogram shows that, for this group of five responsory series, there is a fairly close-knit association in the sources from Boulogne, Fritzlar and Paris.[14] The manuscript from Padua is the outsider in this group, while the one from Bohemia takes an intermediary position.[15]

§ 19 Series of chants within a cluster are more similar to each other than they are to series outside the cluster. Therefore, the series from Boulogne and Fritzlar are more similar to each other than either is to the series from Paris, Padua or Bohemia. It is important to note that from the dendrogram we cannot determine exactly how close the series from Paris is to either Boulogne or Fritzlar; we can only see how close the Paris series is to the cluster formed by the other two. These conclusions are, obviously, only valid within the group of these five series. Particularly interesting, and often enlightening, are the dendrograms involving hundreds of sources from various regions and liturgical traditions of medieval Europe. The Cantus data awaits the eager users of this online comparative tool.
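The way a dendrogram is assembled from a matrix can be sketched with a simple agglomerative procedure: the two closest clusters are joined repeatedly until only one remains. The version below uses single linkage (the distance between clusters is the smallest distance between their members), which is an assumption; the linkage rule used by the website tool is not stated here. The labels and distances are invented.

```python
def single_linkage(names, matrix):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    index = {name: i for i, name in enumerate(names)}
    clusters = [(name,) for name in names]

    def dist(c1, c2):
        # Single linkage: the closest pair of members decides the distance.
        return min(matrix[index[a]][index[b]] for a in c1 for b in c2)

    merges = []
    while len(clusters) > 1:
        pairs = [
            (dist(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        ]
        d, i, j = min(pairs)
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


names = ["A", "B", "C", "D"]   # invented source labels
matrix = [
    [0, 1, 3, 6],
    [1, 0, 2, 6],
    [3, 2, 0, 5],
    [6, 6, 5, 0],
]

for cluster_a, cluster_b, distance in single_linkage(names, matrix):
    print(cluster_a, "+", cluster_b, "at distance", distance)
```

In this example C joins the cluster formed by A and B at a single height. The merge list, like the dendrogram, records only that one value, so the separate distances from C to A and from C to B can no longer be read from it.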

§ 20 Through its previous two decades, as Cantus has both grown and transformed to serve an increasing base of users, the integrity of and respect for the project have remained strong owing to collaboration within the academic community. A few of the manuscript indices in the Cantus database have been produced by junior research assistants on the Cantus staff but many others have been contributed by scholars worldwide;[16] the high number of donated indices to this database demonstrates the frequent collaborative efforts that we have been witnessing in this burgeoning new era of humanities computing.

§ 21 The usefulness of Cantus as a chant research tool is demonstrated both by the number of visits to the website and by the laudatory testimonials of chant scholars from around the world. Amid such affirmations of importance, the database continues to expand and seek new directions in an effort to serve scholars, assist in the development of new research and establish the study of chant firmly within the scope of the digital humanities.


[1]. As of January 2011, the Cantus database contained complete indices of 134 manuscript and early printed sources, a total of 379,206 individual chants.

[2]. After years of support from the Social Sciences and Humanities Research Council of Canada and The University of Western Ontario, the Cantus database was fortunate to receive funding from The Andrew W. Mellon Foundation for the period from March 2011 to February 2012 to redesign the website and database using MySQL in a Drupal framework. This has been carried out in collaboration with the MARGOT project at the University of Waterloo. Cantus can now be accessed at its new website, which provides a direct link to the data. The analytical tools available on the UWO website will remain online indefinitely; the first priority in the 2011 year of transition has been the transference of the browsing, searching and downloading functions.

[3]. The eight medieval church modes to which the freely-melodic antiphons were assigned are entered into the database as the numbers 1 to 8. Differentiae are the sometimes numerous cadences for the eight formulaic psalm tones that correspond to each mode. The grouping of antiphons whose accompanying psalm recitations employ the same differentia often demonstrates familial melodic relationships among those antiphons; this organization of melodies by mode and differentia simulates the listings in existing medieval tonaries.

[4]. For example, Ike de Loos was interested in chants with multiple melodies or melodies which could be interpreted and reinterpreted in different modes. She engaged in a comparison of modal (i.e., numerical) assignments in de Loos unpublished.

[5]. Montpellier, Bibliothèque Inter-Universitaire, Section Médecine H. 159.

[6]. For more on the benefits of the use of this font in chant scholarship, refer to Helsen and Lacoste 2011.

[7]. The textual concordance is available from the UWO Cantus website.

[8]. Although the information contained in the Responsory Series database shares a similar format with Cantus data with respect to chant ID numbers, normalization of spelling, genre identification codes, etc., the Responsory Series website tool accesses a separate set of database tables. These tables contain only the series of responsory chants for specified Sundays for over 900 sources, whereas the Cantus tables contain full indices of all the chants in, as of January 2011, 134 sources.

[9]. The Great Responsories are lengthy, elaborate chants which were sung primarily during the Office of Matins in series of either nine or twelve, depending on the cursus (monastic or secular). The selection of particular chants and the order in which they were sung varied from place to place across medieval Europe; it is the uniformity or lack thereof within these chant series that provides a useful starting point for research into local or regional chant traditions.

[10]. These are: 1. Matches/Pairs, 2. Edit Distance, and 3. Longest Common Sequence. For an explanation of these methods, see Lacoste and Stafleu 2009.

[11]. For more explanation, see Lacoste and Stafleu 2009.

[12]. The dendrogram tool is available from the UWO website.

[13]. Each manuscript record contains a cursus field that includes M for monastic or S for secular (i.e. cathedral or non-monastic) sources. This information is vital for the calculations of similarity since monastic manuscripts usually provide twelve (or more) responsories for each liturgical day and cathedral sources provide only nine.

[14]. The sigla represent:

  • Blg1 = Boulogne-sur-Mer, Bibliothèque municipale, 93 A
  • Kas05 = Kassel, Landesbibliothek und Murhardsche Bibliothek der Stadt Kassel, Theol. 2o 144
  • Bar1 = Bari, Basilica di San Nicola - Biblioteca, 2

[15]. The sigla represent:

  • Pad3 = Torino, Biblioteca Nazionale Universitaria, F IV. 3
  • Pra2 = Praha (Prague), University Library, XIV B. 13

[16]. Contributed files as well as those produced by Cantus staff are thoroughly proofread before being uploaded into the web-database. The proofreading process involves a complete manual pass by an experienced indexer followed by electronic proofreading which employs forty-nine customized queries within Microsoft’s Access.

Works cited

Bent, Ian et al. 'Notation, III, 1: History of western notation: Plainchant, (v) Pitch-specific notations, 11th-12th centuries, (a) Alphabetic notations and dasia signs'. In Grove music online; Oxford music online. Accessed July 3, 2009.

Beyssac, Gabriel M. 1993. Unpublished work on chant series in the Pro Defunctis Office, as referenced in Ottosen, Knud. The Responsories and versicles of the Latin office of the dead. Aarhus: Aarhus University Press, p. v.

de Loos, Ike. Unpublished. Modes and melodies: An investigation into the great responsories of the Gregorian and Old Roman chant repertoires. Presented at the 18th Congress of the International Musicological Society, Zurich, 2007.

Helsen, Kate and Lacoste, Debra. 2011. Report on the encoding of melodic incipits in the Cantus database with the music font 'Volpiano'. Plainsong & Medieval Music 20/1: pp. 51-65.

Hesbert, René-Jean, ed. 1963-1979. Corpus antiphonalium officii. 6 vols. Rome: Herder.

Lacoste, Debra and Stafleu, Gerard. 2009. Similarities in responsory series represented by hierarchical diagrams: New tools for determining manuscript affiliation. In Antiphonaria: Studien zu Quellen und Gesängen des mittelalterlichen Offiziums, Regensburger Studien zur Musikgeschichte, pp. 147-169. Vol. 7. Ed. by David Hiley. Tutzing: Hans Schneider Verlag.



Debra Lacoste (University of Waterloo)





Creative Commons Attribution 4.0


Peer Review

This article has been peer reviewed.
