Linguistics 884: Seminar in Computational Linguistics (Spring 06)
Processing Unexpected Input

The course addresses an interface of linguistic analysis and Natural Language Processing: How can an NLP system that integrates some linguistic knowledge deal with input that is ill-formed, beyond the scope of the encoded linguistic knowledge, or otherwise unexpected?

We will organize our discussion around a specific source of input: learners of a foreign language using tools for Computer-Aided Language Learning. Generally speaking, there are two approaches to processing learner language. The first uses so-called "mal-rules", which try to expect the unexpected and then license ill-formed language in the same way as well-formed language. The second class of approaches alters the general method for combining pieces of linguistic knowledge so that analyses can succeed that would otherwise fail.
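To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It is not taken from any system discussed in the seminar: the toy lexicon, the single rule S -> Det N V, and the function names parse_strict, parse_with_mal_rule, and parse_relaxed are all assumptions made for illustration. The mal-rule version adds an explicit rule for one anticipated error, while the relaxed version keeps the original rule but treats agreement as a constraint that may be violated a limited number of times.

```python
# Toy lexicon: word -> (category, number). Hypothetical data for illustration.
LEXICON = {
    "the": ("Det", None),
    "dog": ("N", "sg"),
    "dogs": ("N", "pl"),
    "barks": ("V", "sg"),
    "bark": ("V", "pl"),
}


def tag(sentence):
    """Look up each word; unknown words get the placeholder category UNK."""
    return [LEXICON.get(word, ("UNK", None)) for word in sentence.split()]


def has_shape(tags):
    """True if the category sequence matches the only rule, S -> Det N V."""
    return [cat for cat, _ in tags] == ["Det", "N", "V"]


def parse_strict(sentence):
    """Well-formed grammar only: noun and verb must agree in number."""
    tags = tag(sentence)
    if not has_shape(tags):
        return None
    return "S" if tags[1][1] == tags[2][1] else None


def parse_with_mal_rule(sentence):
    """Approach 1 (sketch): an extra rule explicitly licenses an anticipated
    error (number disagreement) and records a diagnosis for it."""
    if parse_strict(sentence) is not None:
        return "S", []
    tags = tag(sentence)
    # Mal-rule: S -> Det N V where noun and verb disagree in number.
    if has_shape(tags) and tags[1][1] != tags[2][1]:
        return "S", ["subject-verb agreement"]
    return None, []


def parse_relaxed(sentence, max_violations=1):
    """Approach 2 (sketch): keep the original rule, but treat agreement as a
    soft constraint whose violations are collected instead of causing failure."""
    tags = tag(sentence)
    if not has_shape(tags):
        return None, []
    violations = []
    if tags[1][1] != tags[2][1]:
        violations.append("subject-verb agreement")
    if len(violations) > max_violations:
        return None, violations
    return "S", violations
```

On this toy input both functions accept "the dogs barks" with the same diagnosis. The difference lies in where the knowledge about errors is stored: the mal-rule version needs one additional rule per anticipated error type, whereas the relaxed version changes how the existing constraints are applied.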

In approaching the topic, we will primarily focus on syntactic and morphological aspects and integrate research that has been discussed under the name of robust parsing. (As illustrated by the workshop on "Robust Methods in the Analysis of Natural Language Data" to be held at EACL’06, the topic also connects to an active general discussion of the issue of robustness in NLP.)

One issue that will recur throughout the seminar is the question of what the different approaches are trying to be robust to, that is, how to determine and model the space between the expected norm and the actual input. Complementing our focus on syntax and parsing, we will finish by taking a look at a related semantic issue that arises when trying to match the contents of an expected answer with the actual answer provided by the user.

Instructor: Detmar Meurers

Course meets: Wednesdays 3:30–5:18 in 245 Central Classrooms and Fridays 10:00–11:48 in 262 Denney Hall

Course website: http://ling.osu.edu/~dm/06/spring/884/

Course email (at ling.osu.edu): 884dm (This reaches all people enrolled in the seminar)

Anonymous feedback: If you have comments, complaints, or ideas you’d like to send me anonymously, you can use the web form at http://ling.osu.edu/~dm/feedback/ to do so. Please send me ordinary email for anything that you’d like to receive a reply to—there really is no way for me to find out who sent me something via the anonymous feedback form!

Students with Disabilities: Students who need an accommodation based on the impact of a disability should contact me to arrange an appointment as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations. I rely on the Office for Disability Services for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted the Office for Disability Services are encouraged to do so (292-3307; http://www.ods.ohio-state.edu).

Academic Misconduct: To state the obvious, academic dishonesty is not allowed. Cheating on assignments will be reported to the University Committee on Academic Misconduct. The most common form of misconduct is plagiarism. Remember that any time you use the ideas or the materials of another person, you must acknowledge that you have done so in a citation. This includes material that you have found on the Web. The University provides guidelines for research at http://gateway.lib.ohio-state.edu/tutor/.

Course prerequisites: LING 684.01 or an equivalent introduction to parsing and experience with Prolog.

Nature of course: This is a research-oriented seminar, i.e., each participant is expected to take an active role as a researcher.
More concretely, each participant is expected to

  1. actively participate in the class discussion
  2. explore and present one to four approaches: select and announce the approaches, research them, present them in class using overheads (made available as handouts).

    Focus on the following issues when analyzing the approaches:

  3. work out a research/project idea related to the topic of this seminar.

Topics:

