§ 1 This article is the result of the work carried out as part of two research projects, Written Memory in the Catalan Private Domain: the Recovery of Archives and Documents (Project reference: Ministerio de Ciencia e Innovación, HAR2008-01748, La memoria escrita en el ámbito privado catalana: recuperación y estudio de archivos y documentos, PI: Daniel Piñol. Researchers: Ignasi J. Baiges; Elena Cantarell; Mireia Comas; Carme Muntaner), and The Recovery of Catalan Private Archives (Project reference: Universitat de Barcelona, PGIR/08-09 Recuperació d'arxius privats catalans, PI: Daniel Piñol), under the leadership of Dr. Daniel Piñol Alabart. These projects, launched in 2008, focus on the unusually large body of privately-owned historical documentation in Catalonia discovered by [contra]TAEDIUM, the University of Barcelona’s research team in Medieval History.
§ 2 The study and dissemination of such a large volume of
privately-owned documentation in Catalonia is obviously of interest to historians. In
fact, for the historical period from the Middle Ages to the modern era, the “hidden”
documentation, that is, documentation owned by private individuals and
largely beyond the reach of researchers, is as plentiful as that preserved in public
archives. In our view, drawing attention to these materials is essential to the
development of historical research in our country, since we cannot attempt to write
the history of medieval and modern Catalonia on the basis of only a part of the documentation.
§ 3 In fact, Catalonia possesses more medieval documents than
any other nation or institution in Europe, with the exception of the Vatican Archive
(Alturo 1998). Accessing this
“hidden” or inaccessible documentation represents a
considerable challenge. Our project aims to take up this challenge by providing
researchers with access to part of the documentation currently preserved in private
hands in Catalonia.
§ 4 The main goals of the Arquibanc project are to locate, recover, arrange and disseminate the archives and collections of documents belonging to private owners in Catalonia. Arquibanc is the Catalan word for a chest in which documents were kept in the Middle Ages. The abundance of these documents and the long historical period that they cover fully justify our project and its contribution to the study of the country’s history. However, a large number of these materials are privately owned and remain unpublished. Our aim is to find ways of providing researchers with access to these documents. A few of the heritage collections are already deposited in Catalan public archives and are attracting renewed interest due to their importance as historical resources.
§ 5 We first contacted the owners of two large private archive collections in order to introduce our project and gauge their possible reactions to it. The response was positive and led to other contacts. Our classes at the university provided us with another possible source of interesting documents; in the discussions of the project, some of our students told us of the existence of other small individual archives belonging to wealthy farmers, who kept these documents as a guarantee of their rights. We also heard from owners of historical documents who were keen to have these materials published. As a result, we can divide our materials into two kinds: well-established, well-organized archives, and single documents. At present the Arquibanc project is working with 33 archives at various stages of development.
§ 6 In some cases the archives are well preserved and well arranged, and the recovery process involves digitization and little else. Documents in poor condition, by contrast, require a great deal of recovery work; in these cases we also suggest better methods of preservation to the owners.
§ 7 Some archives were already very well organized. The owners of the Fontcuberta archive, for example, an important heritage archive with seventy linear meters of documentation from the tenth to the twenty-first centuries covering the counties of Osona, Alt i Baix Empordà and Vallès Occidental, had already added indices, master books, and so on. But in other cases we have organized the collections following the ISAD(G) international standard (Conseil International des Archives/International Council on Archives 2000) and the guidelines specially designed for heritage archives (Fernandez 1991; Gifre, Matas, and Soler 2002).
§ 8 In addition to the distribution of documents in printed format, online databases constitute the main tool at our disposal. Online databases can provide access to materials preserved in small or privately-owned archives which are hidden or difficult to track down. We currently have two databases: Scripta, which deals with the vast Fontcuberta archive and comprises three sets of documents, and Memoria, the Arquibanc database, which includes documents from a variety of sources.
§ 9 The databases contain public, semiprivate and private registers and fields for the different kinds of materials or collections. They are hosted on a University of Barcelona server, where they are properly maintained, and can be accessed through the Arquibanc research project website: http://www.ub.edu/arquibanc/home.html.
§ 10 To fulfill our first objective – the location and evaluation of private documentary sources – we contacted, first of all, the owners of those archives which were already arranged in order to draw up cooperation agreements to cover the study, classification, description and cataloguing of the materials. As a result of these first contacts other owners asked us to consider including their documents in our project. Some of our university students also alerted us to the existence of new archives covering a surprisingly long chronological period, from the Middle Ages to the twentieth century. Because of the huge volume of the material available, we were obliged to make a selection. Among the materials we chose was the Fontcuberta archive. We also decided to compile and study materials in which we considered the risk of deterioration or disappearance to be high.
§ 11 One of the most important tasks in this first stage was the appraisal of the dimensions of the materials and the difficulties they presented in order to plan the work teams and the time necessary for their study.
§ 12 To fulfill the second objective – the recovery of documents at risk – we digitized all the materials to prepare them for study and also to preserve the ones at clear risk of deterioration or loss. In many cases, we advised the owners of fragile and poorly preserved documents to entrust them to the National Archive of Catalonia, if they themselves were unable to supervise the restoration needed for adequate preservation.
§ 13 With regard to organization, there is a clear distinction to be made between two sets of materials. The first is the Fontcuberta archive, which is extremely well organized and described according to the owners’ criteria of use, namely the oversight and management of their patrimony. In this case, no intervention on our part was required. The second is the body of materials that had not previously been classified or described. Their inclusion in the database only partially remedies this situation, but is a great help in carrying out a classification of the documents in compliance with the ISAD(G) standard (General International Standard Archival Description). Once the process is concluded, the documents can be accessed using the database’s search engine or through the documentation classification chart.
§ 14 Our priorities are the needs and desires of each owner. Owners do not usually object to the processes of digitization, systematization and description, but on the question of providing free access to the documentation, their opinions vary widely: some stipulate that the publication should be exclusively in printed form, while others opt to restrict access in different ways. We must not forget that we are talking about private archives situated in private homes. To deal with this situation, the owners and the University of Barcelona sign agreements stating that access will be via the Internet, and will be supervised by the database’s administrators. Our main dissemination tool is the project’s website (http://www.ub.edu/arquibanc/home.html), where the databases containing the materials can be accessed. However, we also plan to publish much of the material in printed form, and in fact two projects of this kind are currently underway (Cantarell, Comas, and Muntaner 2011; Baiges, Cantarell, Comas, Piñol, and Soler).
§ 15 This is where the value of the database is particularly evident. As we noted in the introduction, a large amount of historical material remains unpublished. We feel that any instrument that enables the scientific community to consult these materials is valuable, since it may lead us to reformulate or qualify our previously held ideas. But our aim was not to create an instrument to replace the critical editions of documentary corpora, which we consider to be essential. Indeed, in many cases, so much of the documentation is unpublished that these editions cannot be carried out because of the technical difficulty and economic cost involved. The challenge, then, was to design a tool that was in a way comparable with traditional editions and could at the same time complement them and help to provide new insights.
§ 16 So, our starting point was the belief that this online database should not be limited to a repertoire of document images, but should contain the standard elements of diplomatic editions of documentary corpora. At the same time, it should be able to generate indices to aid consultation. Finally, another key feature of the database, thinking in particular of large-scale research projects, is its ability to promote teamwork. Large teams of researchers would be required for projects of this size.
The Scripta and Memoria databases: characteristics and use
§ 17 Let us now analyze the main features of the Scripta and Memoria databases to establish to what degree the initial expectations have been met and to identify the areas which have not yet been resolved. As a trial, we started with the Cubellis database which catalogues and publishes documentation from the municipal archive of Cubells, a small town in inland Catalonia. The relatively small volume of materials made this database ideal for proof of concept testing. Many of the shortcomings that we were able to detect in this trial stage were corrected, and the database was adapted to the needs of the Arquibanc Project. The result was a tool that facilitated teamwork and allowed several different levels of collaboration.
§ 18 Each document is given a register comprising all the fields necessary for identification, description and classification. These fields can be easily created and defined by the database editors without the need for specific training. Therefore, without the aid of computer technicians, researchers can adapt the structure of the database to meet the particular characteristics and objectives of each project.
§ 19 A special field in each register contains the image(s) of the digitized document in a readable, downloadable format. If there are multiple images, they can be consulted sequentially and as thumbnails. In fact, the size of the images was the first problem we encountered in the project. Especially in the case of the parchments, the size and the state of preservation of many of the documents generated a digital image that was too large to be included in the database. It has been very difficult, and in many cases impossible, to obtain high-quality, high-resolution images of an acceptable size (no more than 1 MB).
§ 20 Initially, this caused certain problems for browsing, but these have been largely overcome thanks to the improvements in web technology in recent years; however, we still have the problem of server space. The small number of large-format documents in the Memoria database does not present difficulties. Larger projects such as Scripta, which contains the documentation of the Fontcuberta archive, include hundreds of large-format documents; the size of their images increases the required storage space and prevents the tool from functioning properly. In these cases, we supply researchers with a high-resolution digital copy on request and, if they wish to publish the facsimile of the document, we can provide an image that meets the printer’s requirements.
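The size constraint described above can be illustrated with a minimal sketch. It assumes that the ceiling is one megabyte (1,000,000 bytes) and that images above it are held back for delivery on request; the function names, the folder scan and the sample file names are hypothetical and are not taken from the Scripta or Memoria software.

```python
from pathlib import Path

# Assumption: the "acceptable size" ceiling is read as one megabyte.
MAX_WEB_IMAGE_BYTES = 1_000_000


def split_by_size(image_sizes, limit=MAX_WEB_IMAGE_BYTES):
    """Split a {file name: size in bytes} mapping into two sorted lists:
    images small enough to publish online, and images supplied on request."""
    web_ready = sorted(n for n, s in image_sizes.items() if s <= limit)
    on_request = sorted(n for n, s in image_sizes.items() if s > limit)
    return web_ready, on_request


def scan_folder(folder):
    """Collect the size of every file in a folder (hypothetical layout)."""
    return {p.name: p.stat().st_size for p in Path(folder).iterdir() if p.is_file()}


# Invented sample data, for illustration only.
sizes = {"parchment_001.jpg": 640_000, "parchment_002.jpg": 4_200_000}
web_ready, on_request = split_by_size(sizes)
print("online:", web_ready)
print("on request:", on_request)
```

In practice `scan_folder` would feed `split_by_size`, so that oversized parchment images can be listed before anything is uploaded to the server.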
Indexed fields, data searches and the generation of indices
§ 21 The database editors can create all the indexed fields they need for the objectives of each project. In the case we are describing, the public fields are signature (a unique, required field), date and place, and document type; the private fields, that is, fields accessible only to the database editors and administrators, are collaborator, image control, and state of revision. Private fields are useful for internal control of the state of each register, because they provide answers to important questions: who is the author of the description; whether it has been revised and, if so, by whom; who is in charge of the digitization, compression and publication of the image; which registers contain images and which do not; and what the level of access of each document is. They also provide access data such as the date of creation or modification, number of visits, and so on.
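The distinction between public and private fields can be sketched as follows. This is an illustration under stated assumptions, not the actual schema of the databases: the field names follow the paragraph above, date and place are split into two fields for convenience, and `Register`, `visible_fields` and `add_register` are hypothetical names.

```python
from dataclasses import dataclass, asdict

PUBLIC_FIELDS = ("signature", "date", "place", "document_type")
PRIVATE_FIELDS = ("collaborator", "image_control", "state_of_revision")


@dataclass
class Register:
    signature: str               # public; required and unique
    date: str = ""               # public
    place: str = ""              # public
    document_type: str = ""      # public
    collaborator: str = ""       # private: author of the description
    image_control: str = ""      # private: status of the digitized image
    state_of_revision: str = ""  # private: revised or not, and by whom


def visible_fields(register, is_staff):
    """Return the fields a user may see; private fields only for staff."""
    data = asdict(register)
    allowed = PUBLIC_FIELDS + (PRIVATE_FIELDS if is_staff else ())
    return {name: data[name] for name in allowed}


def add_register(collection, register):
    """Store a register, enforcing a non-empty and unique signature."""
    if not register.signature:
        raise ValueError("signature is required")
    if register.signature in collection:
        raise ValueError("signature must be unique")
    collection[register.signature] = register


# Invented sample record, for illustration only.
collection = {}
add_register(collection, Register("X-001", date="1345", collaborator="editor A"))
print(visible_fields(collection["X-001"], is_staff=False))
```

Keeping the two field lists separate from the record itself mirrors the idea that editors can decide, field by field, what is exposed to external researchers.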
§ 22 All these fields can be explored using the “browse” option and can generate the corresponding indices. All the fields, with the obvious exception of the image, can be explored with the “search” option. As we noted above, the search engine of the databases allows us to identify all the elements located in any of the fields and produces a variety of lists according to the search results obtained. This means that each and every one of the words contained in the abstract of the document, whichever language was used (Catalan, Spanish, English, etc.), can be found by the search engine. This requires each document to have an abstract worded as broadly as possible, so that the user can refine the search as far as possible. Searches can be made for proper names (name and surname, or name and rank) using the “search for adjacent words” option, for political or administrative posts, and so on. The search engine also allows the use of wildcard characters (? or *).
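The two search behaviours described in this paragraph, wildcard matching and adjacent-word matching, can be sketched as follows. This is not the search engine used by Scripta or Memoria; it is a minimal illustration that borrows Python's `fnmatch` module for the ? and * characters, and the function names and sample abstracts are invented.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Lower-case a text and split it into words (accented letters included)."""
    return re.findall(r"\w+", text.lower())


def matches_wildcard(abstract, pattern):
    """True if any word of the abstract matches the pattern, where ? stands
    for one character and * for any run of characters. Note that fnmatch
    also gives [ and ] a special meaning, which this sketch does not escape."""
    pattern = pattern.lower()
    return any(fnmatchcase(word, pattern) for word in tokenize(abstract))


def matches_adjacent(abstract, phrase):
    """True if the words of the phrase occur side by side, in order."""
    words = tokenize(abstract)
    wanted = tokenize(phrase)
    n = len(wanted)
    return n > 0 and any(words[i:i + n] == wanted for i in range(len(words) - n + 1))


# Invented sample abstracts, for illustration only.
abstracts = {
    "A-1": "Sale of a vineyard by Pere Fontcuberta, bailiff of Vic",
    "A-2": "Will of Guillem de Fontcoberta",
}
print([k for k, a in abstracts.items() if matches_wildcard(a, "fontc?berta")])
print([k for k, a in abstracts.items() if matches_adjacent(a, "Pere Fontcuberta")])
```

The wildcard form is useful for spelling variants of the same surname, while the adjacent-word form narrows a search to a specific name and surname or name and rank.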
Data entry and teamwork
Designed for teamwork
§ 23 The tool allows participation at several levels:
Masters are able to modify the internal structure of the database and to grant permissions to lower levels;
Administrators are authorized to manage fields, deciding on their inclusion or exclusion, and to prepare the indications for data entry;
Editors are collaborating researchers, who can create new registers and edit the document as they see fit (this point will be discussed in more detail below); and
Collaborators can carry out brief collaborations in the management of images, etc.
This structure helps to build up teams with different levels of involvement in the project and makes the tool flexible enough to deal with the edition of different kinds of documentary resources. Among the collaborators we have also been able to introduce students in the initial identification tasks (mainly Master’s students, but in some cases undergraduates as well). As we said above, this tool is easy to use, with a user-friendly interface for the introduction of data. Instructions for data entry appear in red, and help to establish common criteria for all editors. We have also prepared a style sheet to ensure uniformity for data entry.
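The four levels of participation can be modelled as an ordered set of roles. The sketch below assumes a strictly hierarchical scheme, in which each level also holds the permissions of the levels beneath it; the text does not state this explicitly, and the action names are hypothetical labels for the tasks listed above.

```python
from enum import IntEnum


class Level(IntEnum):
    COLLABORATOR = 1
    EDITOR = 2
    ADMINISTRATOR = 3
    MASTER = 4


# Lowest level allowed to perform each action (hypothetical action names).
REQUIRED_LEVEL = {
    "manage_images": Level.COLLABORATOR,
    "create_register": Level.EDITOR,
    "edit_register": Level.EDITOR,
    "manage_fields": Level.ADMINISTRATOR,
    "modify_structure": Level.MASTER,
    "grant_permission": Level.MASTER,
}


def can(level, action):
    """True if a user at the given level may perform the action."""
    return level >= REQUIRED_LEVEL[action]


print(can(Level.EDITOR, "create_register"))
print(can(Level.COLLABORATOR, "manage_fields"))
```

A table of this kind makes it straightforward to add a student collaborator with narrow permissions without touching the rights of editors or administrators.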
Made-to-measure levels of edition
§ 24 The ease with which fields can be added means that users can produce anything ranging from a simple inventory of the contents of a document to a complete critical edition, in particular because the empty fields are not shown. In addition to entering the date, document type, abstract and image, users can also add an annotated transcription, ex-libris information, and bibliographical references.
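The principle that empty fields are not shown can be illustrated in a few lines. This is a sketch only: `render_register` and the sample field labels are hypothetical, and the real databases may present registers quite differently.

```python
def render_register(fields):
    """Return a plain-text view of a register, skipping empty fields."""
    lines = [
        f"{label}: {value.strip()}"
        for label, value in fields.items()
        if value and value.strip()
    ]
    return "\n".join(lines)


# Invented sample register, for illustration only: a simple inventory entry
# with no transcription or bibliography yet.
inventory_entry = {
    "Date": "1345",
    "Document type": "Sale",
    "Abstract": "Sale of a vineyard.",
    "Transcription": "",
    "Bibliography": "",
}
print(render_register(inventory_entry))
```

The same register can later grow into a fuller edition simply by filling in the remaining fields, which then appear in the rendered view.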
§ 25 As we noted above, this database is housed on a University of Barcelona server, which means that we can monitor users by means of a proxy. This is particularly important when we make the large step from Cubellis, a small public archive without any access restrictions, to Memoria and Scripta, which contain the privately-owned heritage archives on which we are currently working. Indeed, as we explained above, the owners have their own needs for and opinions on the dissemination of their archives. Researchers wishing to study documentary resources must complete the registration form and will receive an access code with the corresponding authorizations.
§ 26 The Arquibanc project was set up to promote the study of the abundant and extremely important documentation owned by private individuals in Catalonia. Our main objective at present is not just to publish the documents in the Fontcuberta archive and to preserve the documents classified as at risk; we are also determined to disseminate the project as far as possible. We hope to involve new researchers who are interested either in the history of our country or in the digital edition of historical documentation. The amount of work involved has turned out to be far greater than we had originally anticipated, and we would welcome support from the Digital Humanities community as we move forward with the project. The design of the database is ideally suited to fluid teamwork on a large scale. We are confident that the intrinsic interest of the materials will attract researchers from different areas of knowledge and thus help us to build up an interdisciplinary team that will establish itself as a leader in the field.
Conseil International des Archives/International Council on Archives. 2000. ISAD(G) (General International Standard Archival Description). Madrid: Consejo General de Archivos. http://www.mcu.es/archivos/docs/isad.pdf
Gifre, Pere, Josep Matas, and Santi Soler. 2002. Els arxius patrimonials. Girona: Associació d’Història Rural de les Comarques Gironines, Centre de Recerca d’Història Rural de la Universitat de Girona.