Linguistics 384: Language and Computers

Course goals: In the past decade, the widening use of computers has had a profound influence on the way ordinary people communicate, search for, and store information. For the overwhelming majority of people and situations, the natural vehicle for such information is natural language. Text and, to a lesser extent, speech are crucial encoding formats for the information revolution.

In this course, you will be given insight into the fundamentals of how computers are used to represent, process and organize textual and spoken information, as well as tips on how to effectively integrate this knowledge into working practice. We will cover the theory and practice of human language technology. Topics include text encoding, search technology, tools for writing support, machine translation, dialog systems, computer-aided language learning and the social context of language technology.

GEC: The course satisfies the GEC category 2B (Mathematical and Logical Analysis). It does so by using natural language systems to motivate students to exercise and develop a range of basic skills in formal and computational analysis. The course philosophy is to ground abstract concepts in real world examples. We introduce strings, regular expressions, finite-state and context-free grammars, as well as algorithms defined over these structures and techniques for probing and evaluating systems that rely on these algorithms. The course goes beyond merely subjective evaluation of systems, emphasizing analysis and reasoning to draw and argue for valid conclusions about the design, capabilities and behavior of natural language systems.

Instructor: Detmar Meurers

Course meets: Mondays 11:30am-1:18pm in 48 Derby and Wednesdays 11:30am-1:18pm in 345 Central Classrooms

Carmen: We'll be using the Carmen course management tool for the course, which is accessible at http://carmen.osu.edu. Note that email from Carmen is sent to the official email addresses (Name.Number@osu.edu) of the students enrolled in the class and the instructor. You should read email sent to your official OSU account daily; it'll also help you avoid high library fines!

Carmen and privacy: Be aware that the Carmen system as it is set up at OSU keeps detailed logs of your interaction with the system, e.g., when you log in, how long you take to complete which question of the quiz, etc.

Anonymous feedback: If you have comments, complaints, or ideas you'd like to send me anonymously, you can use the web form at http://purl.org/net/dm/feedback/ to do so.

Please send me ordinary email for anything that you'd like to receive a reply to; there really is no way for me to find out who sent me something via the anonymous feedback form!

Readings: There is no textbook for this course (the topic is quite new, at OSU and elsewhere). There will be some readings assigned periodically throughout the course.

I will distribute slides in class for each unit. These will also be available on the web after the class in which they are first distributed. These slides are only a skeleton of the material covered; they cannot replace actually being in class. In my experience, students who actively participate in class enjoy the course more and get much better grades than those who don't. Very surprising, isn't it? ;-)

Course requirements: The basic requirement is regular attendance in class and active participation. There will be roughly one online quiz per topic, to ensure that the material covered in class is mastered. There will also be one homework (exercise sheet) per topic, intended to give you the opportunity to explore new aspects of the topics discussed in class. The midterm will cover the material from the first half of the class, and the final will cover the material from the second half.

Grading: Grades will be based on participation in classroom discussion and group work, quizzes, homeworks, a midterm exam, and the final examination, using the following scheme:
Participation 10%  
Quizzes 20%  
Homeworks 30%  
Midterm 20%  
Final 20%  
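
As a rough illustration of how the weights above combine, here is a minimal Python sketch that computes a weighted course percentage. The function name `course_percentage` and the sample scores are hypothetical, and the sketch assumes every component is scored on a 0-100 scale; how percentages map to letter grades is not specified in this syllabus.

```python
# Weights taken from the grading scheme above (fractions of the course grade).
WEIGHTS = {
    "participation": 0.10,
    "quizzes": 0.20,
    "homeworks": 0.30,
    "midterm": 0.20,
    "final": 0.20,
}


def course_percentage(scores):
    """Return the weighted average of the component scores (each 0-100).

    `scores` must map every component name in WEIGHTS to a percentage.
    Hypothetical helper for illustration, not an official grade calculator.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up example scores, for illustration only.
example = {
    "participation": 100,
    "quizzes": 80,
    "homeworks": 90,
    "midterm": 70,
    "final": 85,
}
print(round(course_percentage(example), 1))  # prints 84.0
```

Because homeworks carry the largest weight (30%), a change in the homework average moves the result more than an equal change in any other single component.
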
Make-up Policy: If you know you won't be able to make a deadline or exam, please see me before you miss the deadline or exam. If you miss the midterm or final, you will have to provide extensive written documentation for your excuse.

Because you will generally have a week to take each quiz, there are no make-ups for the quizzes.

Academic Misconduct: To state the obvious, academic dishonesty is not allowed. Cheating on tests or on other assignments will be reported to the University Committee on Academic Misconduct. The most common form of misconduct is plagiarism. Remember that any time you use the ideas or the materials of another person, you must acknowledge that you have done so in a citation. This includes material that you have found on the Web or that was given to you by another student by email, telephone, in person, etc. The University provides guidelines for research on the Web at http://gateway.lib.ohio-state.edu/tutor/ and you can find the Student Code of Conduct at http://studentaffairs.osu.edu/resource_csc.asp

Class etiquette: I expect you to respect one another, to respect me, and to respect yourself. To that end, I expect you to obey the following rules:

Topics:
  1. Storing language on the computer: Text and speech encoding.
    Writing systems used for language. Representing text on the computer. Digital representations of speech.
  2. Searching: web, library catalogs, and other language-based databases
    What facilities exist for searching for language-based information? Different query languages and what they allow you to do. Differences between specific and general queries. How to evaluate the results of a search.
  3. Classifying documents: Language identification and spam filtering
    Techniques for classifying documents. What language(s) are they written in? Are they junk mail? Are statistical techniques better than rule-based ones, or not? When will the techniques fail?
  4. Writer's aids: Spelling and grammar correction
    What do so-called "grammar checkers" and "spelling correctors" do? What do such programs base their advice on? When does it make sense to use such tools and what kind of errors are to be expected?
  5. Machine translation
    What do the free internet-based translation services manage to do, and where do they fail? For what purposes can automatic machine translation work reliably? What translation support functions can a computer provide? A closer look at what makes machine translation such a hard task. Is it the grammar, the meaning, the culture, all three, or something else?
  6. Dialog systems
    Eliza and its surprising success in engaging people in conversation. When are dialog systems used, and for what purpose? A closer look at the components of a dialog system. What kind of knowledge is needed, and where, to make it work?
  7. Computer-Aided Language Learning
    What is involved in learning a foreign language? What role can computers play in language learning, from vocabulary training, via presentation of learning material, to providing feedback on learner errors and progress?
  8. Social context of language technology use
    How do we react to computers that make use of language? What does it mean for the way we see ourselves? What assumptions do we make about every user of language, be it a human or a machine?
Schedule: The latest version of the schedule is always available from our web page. After the lectures, the titles in the schedule below are linked to the handouts we used (in pdf format); the same is true for the homework sheets.

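| Week | Month | Date | Day | Topic | Assignments (due at 11:30am) |
|------|-------|------|-----|-------|------------------------------|
| 1 | Sep | 20 | W | Introduction | |
| 2 | | 25 | M | 1. Text and speech encoding (9up) | |
| | | 27 | W | | |
| 3 | Oct | 2 | M | | |
| | | 4 | W | 2. Searching (9up) | Quiz1, HW1 |
| 4 | | 9 | M | | |
| | | 11 | W | | |
| 5 | | 16 | M | 3. Text Classification (Spam filtering) (9up) | Quiz2, HW2, HW2-solutions |
| | | 18 | W | | |
| 6 | | 23 | M | 4. Writer's aids (9up) | Quiz3 |
| | | 25 | W | | |
| 7 | | 30 | M | | Quiz4, HW3 (Ex. 1 only) |
| | Nov | 1 | W | Midterm (review sheet) | |
| 8 | | 6 | M | John Nerbonne talk, 122 Oxley Hall | HW3 (Ex. 2 only) |
| | | 8 | W | | |
| 9 | | 13 | M | 5. Machine Translation (9up) | |
| | | 15 | W | | HW3 (Ex. 3-5) |
| 10 | | 20 | M | | |
| | | 22 | W | | Quiz5 |
| 11 | | 27 | M | 6. Social context of technology use (9up) | |
| | | 29 | W | | HW4 |
| 12 | Dec | 5 | T | Final (review sheet) | |
| | | | | Note that it's on a TUESDAY! | |
| | | | | Location: CC 345, Time: 11:30-1:18. | |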

Disclaimer: This syllabus is subject to change.



Students with Disabilities: Students who need an accommodation based on the impact of a disability should contact me as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations. I rely on the Office for Disability Services for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted the Office for Disability Services are encouraged to do so (292-3307; http://www.ods.ohio-state.edu).

