The Third Workshop on Treebanks
and Linguistic Theories
(TLT 2004)

Workshop motivation and aims

Treebanks are a language resource that provides annotations of natural languages at various levels of structure: at the word level, the phrase level, the sentence level, and sometimes also at the level of function-argument structure. Treebanks have become crucially important for the development of data-driven approaches to natural language processing, human language technologies, grammar extraction and linguistic research in general. There are a number of ongoing projects on the compilation of representative treebanks for languages that still lack them (Bulgarian, Danish, Portuguese, Spanish, Turkish) and a number of ongoing projects on the compilation of treebanks for specific purposes for languages that already have them (English). In addition, there are projects that go beyond syntactic analysis to include different kinds of semantic and pragmatic annotation.

The practice of building syntactically annotated corpora has shown that the more detailed the description of the data becomes, the more theory-dependent it is (Prague Dependency Treebank and other dependency-based treebanks such as the Danish dependency treebank, the Italian treebank (TUT), and the Turkish treebank (METU); Verbmobil HPSG Treebanks, Polish HPSG Treebank, Bulgarian HPSG-based Treebank, etc.). Therefore the development of treebanks and the development of formal linguistic theories need to be more tightly connected in order to ensure the necessary information flow between them.

This series of workshops aims to be a forum for researchers and advanced students working in these areas. The third workshop will be held in Tübingen, Germany, on 10-11 December 2004. The first was held in Sozopol, Bulgaria, in September 2002 (see http://www.bultreebank.org/Proceedings.html), and the second in Växjö, Sweden, in November 2003 (http://w3.msi.vxu.se/~rics/TLT2003/).

Topics of interest

We invite submission of papers on topics relevant to treebanks and linguistic theories, including but not limited to:

design principles and annotation schemes for treebanks;
applications of treebanks in acquiring linguistic knowledge and NLP;
the role of linguistic theories in treebank development;
treebanks as a basis for linguistic research;
semantically annotated treebanks;
evaluation of treebanks;
tools for creation and management of treebanks;
standards for treebanks.

Important dates

Deadline for workshop abstract submission
22 August 2004

Notification of acceptance
1 October 2004

Final version of paper for workshop proceedings
1 November 2004

Workshop
10-11 December 2004

Submissions

We invite extended abstracts (maximum 1500 words) describing existing research connected to the topics of the workshop. Please note that as reviewing will be blind, the abstract should not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991)...". Electronic submission (ps or pdf) is strongly encouraged.

Each submission should additionally include in the accompanying email: title; author(s); affiliation(s); and contact author's e-mail address, postal address, telephone and fax numbers.

Each presentation at the workshop will be 25 minutes long (20 minutes for the presentation and 5 minutes for questions and discussion). The final version of an accepted paper may not exceed 12 A4 pages.

Invited speakers

Fred Karlsson (University of Helsinki)

Collin Baker (International Computer Science Institute)

Program committee

Emily Bender, USA

Thorsten Brants, USA

Koenraad de Smedt, Norway

Eva Ejerhed, Sweden

Tomaz Erjavec, Slovenia

Annette Frank, Germany

Jan Hajic, Czech Republic

Erhard Hinrichs, Germany

Kimmo Koskenniemi, Finland

Tony Kroch, USA

Matthias Trautner Kromann, Denmark

Sandra Kübler, Germany (co-chair)

Yuji Matsumoto, Japan

Detmar Meurers, USA

Joakim Nivre, Sweden (co-chair)

Karel Oliva, Austria, Czech Republic

Petya Osenova, Bulgaria

Beatrice Santorini, USA

Kiril Simov, Bulgaria

Martin Volk, Sweden

Sponsoring organisations

Special Research Program "Linguistic Data Structures" (SFB 441) at the University of Tübingen

Nordic Treebank Network (Nordic Language Technology Program 020528)
