ISCL Proseminar (Summer Semester 2015)

Statistical Natural Language Processing


First, the course introduces some basic statistics including descriptive statistics, elementary probability theory, distributions, hypothesis testing, as well as an introduction to regression and classification. Second, based on that theoretical background the course covers basic techniques in statistical natural language processing, such as Markov chains and hidden Markov models, as well as applications such as collocation discovery, language models, part-of-speech tagging, word sense disambiguation and text categorization.

Instructor: Serhiy Bykh

Tutor: Björn Rudzewitz

Course meets:

Credits for ISCL BA: 6 SWS = 9 Credit Points
For other degree programs, contact us for requirements and credits.

Moodle: We will be using the university Moodle site for the course, primarily for the discussion forum and to access course materials. Our course is accessible under Moodle at

To log into this specific Moodle site, use your general ZDV university account ID and password. The first time you access the course Moodle site, you need a course subscription password, which you get in class. Moodle and privacy: Note that Moodle generally keeps detailed logs of your interaction with the system, e.g., when you log in.

Webpage: General information regarding the course is provided at

Email: In the Moodle system everyone in the course can send messages to other participants in the class, and we will use this to contact you for class-related matters. Such email gets sent to your regular ZDV account. So please register in Moodle during the first week of the semester, and read your university email regularly.

Grading: The course will be graded based on participation, two homework assignments (2 × 25% = 50% of the final grade), and the final exam (50% of the final grade). Note: You have to obtain at least 60% of the points in the final exam to pass the course.

Academic conduct and misconduct: Learning and research are driven by discussion and free exchange of ideas, motivations, and perspectives. So you are encouraged to work in groups, discuss, and exchange ideas. At the same time, the foundation of the free exchange of ideas is that everyone is open about where they obtained which information. Concretely, this means you are expected to always make explicit when you’ve worked on something as a team – and keep in mind that being part of a team means sharing the work! For text you write, you always have to provide explicit references for any ideas or passages you reuse from somewhere else. This includes text “found” on the web, where you should cite the URL of the website if no more official publication is available. Failure to follow these important guidelines is academic misconduct, which will be sanctioned by failing you on the assignment, the exam, or the entire class, depending on the severity of the violation.

Class etiquette: Please come to class on time, do not pack up early, and do not read or work on materials for other classes during our class. If you have to leave early or miss class for an important reason, please let me know before class. Note: Following the standard rules, missing more than two meetings unexcused automatically results in failing the class.

Course Readings: