A Brief History of Language Archiving

This section is a brief overview of the history of language archiving. For a more in-depth study, see Henke & Berez-Kroeker (2016). While we do not consider the contents of this section to be part of the Simple Steps for Archiving Language Documentation Collections, the information here helps to situate the steps that follow. Feel free to skip this section if you are in a hurry to archive a collection of language documentation materials, but we hope that you will find this contextual information relevant to your work.

Figure 1:

A vocabulary as recorded by Thomas Jefferson in 1791, from the American Philosophical Society.

People have been collecting records of words and phrases in other languages for a long time, but language archiving as a field is fairly new. Two major trends can be seen over time: the increasing emphasis on discoverability and access to materials, and the continual adoption of new born-digital formats. 

Prior to the mid-20th century, language information was stored almost entirely on paper, making the circulation of these items costly and difficult. Some unique manuscripts made their way to physical archives and museums (such as the 1791 list of Unquachog words shown above in Figure 1), and some printed books and journals made it to libraries, but many more documents remained unpublished and may have circulated only among people who were interested in them. Finding these materials, or even knowing that they existed, was difficult and restricted to people who had the means to travel to archives or libraries, or who had access to the networks of people who knew about these documents.

Figure 2:

Letter from Jaime de Angulo to Franz Boas in 1924, archived in the American Philosophical Society.

Transcription of Figure 2: Well, as I said, one copy [of my papers] I send to [Manuel] Gamio, one I sent to [Edward] Sapir who is the only man in the world who is likely to read such stuff, and the third one I always send to your office to sort of have it on file somewhere, in case it should ever be of use to someone.

Oftentimes, these networks were very small, even when they spanned great distances. In a 1924 letter to Franz Boas (Figure 2 above), Jaime de Angulo wrote that he would send carbon copies of his research reports to his supervisor, Manuel Gamio, to Edward Sapir, who de Angulo thought would be interested in them, and to Boas, who de Angulo expected would hang on to the reports in case a colleague or student was curious about them. Thus, even though there were copies of his reports in Mexico City, Ottawa, and New York, they would be all but impossible to find for anyone not already acquainted with their recipients.

People who were documenting languages often found transcriptions of words and stories to be lacking when compared to the actual speech sounds and signs from signed languages. Thus, as soon as audio (and later video) recording equipment became available, people began to record examples of languages in use, either when speakers and signers were visiting a recording lab (Figure 3) or when the researcher traveled to record languages in their natural environments (Figure 4).

Figure 3:

Figure 4:

Anthropology Professor Melville Jacobs recording the voice of Annie Miller Peterson from the Coos Native American tribe with his newly built portable electric phonograph in Oregon in 1934. From the University of Washington Libraries, Special Collections.

Beginning in the late 20th century, a greater emphasis has been placed on archiving digital language data as an endeavor unto itself, and language archiving has become a core component of most language documentation projects. In many cases, creating a digital language collection is required by project funders and/or academic institutions. Lately, there has also been a trend toward archives expanding their audiences and breaking down the traditional boundaries between depositors, users, and archivists, with depositors themselves taking a central role in the arrangement and description of their own growing collections.

Collections in today’s digital language archives (i.e., digital repositories that specialize in language data) are multimedia, mainly born-digital, and frequently include very large files and/or large numbers of small files. The collections are made specifically for language documentation and preservation purposes; archiving happens soon after the materials are collected; and materials are accessible for many audiences via the internet. 

In this course, you will learn our “simple steps” for language archiving; these steps will guide you through what to do before, during, and after data collection so that you can easily archive your own (or your team’s) language data in a digital repository. We hope this is a helpful resource for you! Now, let’s get started ...
