University of Bielefeld, June 10-12, 2013
The goal of this workshop is to discuss both research findings on information
structure based on spoken corpora, and methodological issues arising in such
investigations, in a cross-linguistic perspective.
Recent developments in technology have made it possible for linguists to create spoken language corpora for an unprecedented range of languages, including several lesser-studied languages. For languages without a written tradition, spoken corpora assume an even greater value since they document the only mode of communication. Data obtained from corpora are increasingly used in linguistic research, reflecting a more usage-based orientation on the part of linguists on the one hand, and making analyses verifiable on the other.
Spoken language corpora promise to be particularly useful for the study of information structure (IS). IS often involves complex correspondences between communicative goals and marking strategies, encompassing prosody, morphology, and syntactic structure, the full range of which can best be observed in naturally occurring data (Brunetti et al. 2011). However, the investigation of IS in spoken corpora still has many methodological obstacles to overcome, ranging from those related to the prosodic analysis of spontaneous speech to those related to the very identification of IS categories in such data. These challenges explain why much research on IS continues to rely on introspection or on experiments. These techniques are rarely available to linguists working with lesser-known languages: such linguists are usually not native speakers, which makes introspection impossible; further, many types of experiments are not applicable in non-literate and/or non-Western cultural contexts. Thus, analysing spoken corpora is the only means to gain insight into the encoding of IS in these languages, and indeed it is only through the study of spontaneous data that it is possible to gather inventories of the full range of IS categories and understand how they are employed in discourse.
This workshop is part of the project "Discourse and prosody across language family boundaries: two corpus-based case studies on contact-induced syntactic and prosodic convergence in the encoding of information structure", funded by DoBeS (VolkswagenStiftung Funding Initiative Documentation of Endangered Languages).
June 10-12, 2013
University of Bielefeld
Main building, A3-126 (Monday and Tuesday, room for Wednesday tba)
Registration is free and is required only for attendees who are not presenting. Please register before May 27 by using this registration form or by emailing your name, affiliation, and the sessions you plan to attend to firstname.lastname@example.org
Brunetti, Lisa, Stefan Bott, Joan Costa and Enric Vallduví. 2011. A multilingual annotated corpus for the study of information structure. In Grammatik und Korpora 2009: Dritte internationale Konferenz, Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache, Band 1, ed. by Marek Konopka, Jacqueline Kubczak, Christian Mair, František Štícha and Ulrich H. Waßner, 305-327. Tübingen: Gunter Narr Verlag.