Arapesh Grammar and Digital Language Archive

Lise Dobrin
Department of Anthropology

This project will analyze, preserve, and document the endangered Arapesh languages, traditionally spoken by people living along the Sepik coast of northern Papua New Guinea (PNG). Like other Arapesh varieties, the Cemaun dialect, which the project documents most extensively, is seriously endangered. In the Cemaun villages, language shift to the PNG lingua franca Tok Pisin is well advanced. There are fewer than 100 fluent speakers, none of whom are under 40, and none of whom are monolingual in Arapesh. Arapesh is of special significance to linguistic theory for its typologically unusual system of noun classification that elaborates phonological, as opposed to semantic, principles of class assignment. Moreover, in northern Arapesh varieties such as Cemaun, a noun's phonological form may directly influence the realization of associated agreeing elements, requiring us to rethink "lexicalist" models of grammar which assume that phonology, morphology, and syntax can interact only in highly restricted ways.
The current documentation of Arapesh is fragmentary, inaccessible, and incoherent. Existing sources suffer from serious deficiencies (e.g., problematic transcription, word-breaking, and glossing) that make them inadequate for linguistic research, language preservation, and use by Arapesh people. The confusing patchwork of language and locality names and the analytical inconsistencies that they present limit linguists' ability to make confident assertions about Arapesh and to develop a clear picture of the relationships that hold within the Arapesh family. To overcome these problems, the PI will produce a theoretically informed but ecumenical reference grammar that comprehensively describes the Cemaun dialect of Arapesh while synthesizing the data on variation across the family as a whole. This will provide a sound basis for comparative and typological work on Arapesh and the Torricelli phylum to which it belongs, in turn contributing to the interdisciplinary study of Sepik prehistory. The grammar will be based upon tape recordings, elicited linguistic data, and handwritten texts that were transcribed and annotated during the PI's past fieldwork in PNG. It will be richly supported with examples from naturally occurring discourse of different genres, including conversation.
The project will also produce a multimedia digital archive of the language to serve the needs of language preservation and research. This will consist of a grammatical database and a collection of digitized, marked-up texts associated with audio, linked by a software layer enabling searches across both parts of the archive. A modest public-facing website demonstrating selected Arapesh grammatical features will also be constructed, providing an accessible educational resource on the language. By presenting website text in Arapesh and Tok Pisin as well as in English, and by presenting Arapesh in audio as well as visual form, the website will benefit PNG people, harnessing the web's prestige to lend value to Arapesh and multilingualism. The archive and grammar will lay the foundation for pedagogical materials to be created later for use by Arapesh children in local schools, such as a reverse nominal dictionary in phonemic orthography that graphically illustrates the logic of the noun classification system. That system has now begun to disintegrate because it is opaque to young people who do not fully command the phonology.
The project is a collaborative one in which linguists will work together with technical experts in web-based language preservation, database construction, text encoding, and humanities computing. Thus, in addition to making knowledge available about a typologically and theoretically important language, and doing so in a way that is flexible, robust, and enduring, the project will contribute to the broader endeavor of endangered language documentation by sharing the technological tools that we develop and by providing feedback on and extending current standards and best-practice outlines for the digital archiving of non-Western languages. The archive and website will be hosted by the Institute for Advanced Technology in the Humanities (IATH), which is institutionally committed to the interoperability, longevity, and humanistic functionality of digitally encoded text. One of the project's broader impacts will be to serve as a pilot for further work on endangered languages within the IATH infrastructure.

More information at www.virginia.edu

Project Sponsored By: U.S. NFAH - Nat'l Endowment for the Humanities
Start Date: 9/1/2005 - End Date: 8/31/2008
Award Amount: $225,000.00