Deep Learning for NLP
Summer Semester 2017

Table of Contents

1 Course information

1.1 Description

In the past two decades, statistical approaches have become dominant in the field of natural language processing, where most work has relied on linear classifiers such as perceptrons, log-linear models, and support vector machines with a linear kernel. However, due to recent theoretical and technical advances, the field has rekindled its interest in deep learning.

Deep learning consists of a set of algorithms and techniques that attempt to infer complex features of data. This typically reduces the amount of feature engineering that is necessary and uncovers interactions that would be difficult for humans to find. Consequently, deep learning techniques have considerably improved the state of the art in many natural language processing tasks.

This hauptseminar consists of two parts. The first part provides an introduction to deep learning-related techniques that are relevant to natural language processing, such as feed-forward neural networks, recurrent neural networks, recursive neural networks, word embeddings, and auto-encoders. In the second part, we will read and discuss papers that use deep learning for typical natural language processing tasks, such as morphological analysis, part-of-speech tagging, parsing, and sentiment analysis.

Throughout the course, we will implement many of the deep learning techniques using Google's Tensorflow library.
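Although the course implementations will use Tensorflow, the core computation behind the techniques above can already be sketched with NumPy, which is also on the software list below. As an illustrative preview (the function name and numbers here are mine, not course code), a single feed-forward layer is just an affine transformation followed by a nonlinearity:

```python
import numpy as np

def feed_forward_layer(x, w, b):
    """One hidden layer: affine transformation x @ w + b,
    followed by a sigmoid nonlinearity."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# A batch of 2 inputs with 3 features, mapped to 4 hidden units.
rng = np.random.RandomState(42)
x = rng.randn(2, 3)
w = rng.randn(3, 4)
b = np.zeros(4)
h = feed_forward_layer(x, w, b)
print(h.shape)  # (2, 4); all activations lie in (0, 1)
```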

1.2 Time/location

  • Lecturer: Daniël de Kok <>
  • When:
    • Tuesday 14:15-16:00, Lecture Room (HS) 06, Neue Aula
    • Thursday 14:15-16:00, Room 0.02, Verfügungsgebäude, Wilhelmstraße 19
  • First lecture: Thursday, April 20
  • Questions: Appointment on request

1.3 Registration

To register for the course, send an email to before April 24 containing:

  • Your name
  • Your student number (Matrikelnummer)
  • Your program of study (including BA, MA, or minor)

1.5 Hardware/software

In order to work on the homework assignments and try out the examples, you will need a machine with the following software:

  • Python 3
  • NumPy
  • Tensorflow 1.0.x

1.5.1 Standard installation

You can install Tensorflow (and NumPy) following the instructions on the Tensorflow webpage:

1.5.2 pyenv installation

I personally use pyenv, which allows you to manage multiple Python versions on Linux and macOS. You can install Python 3 with the necessary packages in the following steps:

$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv

# Add the necessary paths to your bash configuration:
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
$ echo 'eval "$(pyenv init -)"' >> ~/.bash_profile

# Or if you are using zsh:
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zprofile
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zprofile
$ echo 'eval "$(pyenv init -)"' >> ~/.zprofile

# On macOS with Homebrew:
$ brew install readline xz
$ env PYTHON_CONFIGURE_OPTS="--enable-framework" pyenv install 3.5.3

# On Linux
$ sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev \
  libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
  xz-utils tk-dev
$ env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.5.3

# Version 3.5.3 should now be available through pyenv and you can set
# it to be the default version.
$ pyenv versions
* system (set by /Users/daniel/.pyenv/version)
  3.5.3
$ pyenv global 3.5.3
$ python --version
Python 3.5.3

# Install Tensorflow; this will install NumPy as a dependency:
$ pip install --upgrade tensorflow
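Whichever installation route you take, you can check that the packages are visible to your Python interpreter with a short snippet. The helper name below is mine, not part of any course material:

```python
import importlib.util

def installed(package):
    """Return True if `package` can be found on the current Python path."""
    return importlib.util.find_spec(package) is not None

for package in ("numpy", "tensorflow"):
    print(package, "OK" if installed(package) else "MISSING")
```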

2 Tentative schedule

Date Topic Extra
April 20 Introduction  
April 25 Linear algebra recap  
April 27 Linear algebra on machines Decimal to floating point converter
May 2 Linear algebra on machines Ungraded homework exercises
May 4 NumPy Solutions to in-class assignments
May 9 Basic linear classifiers Binary perceptron notebook
May 11 Logistic regression  
May 16 Softmax regression Ungraded exercise (with source code), in-class TensorFlow exercises
May 18 Feed-forward neural networks Logistic regression notebook
May 23 Regularization  
May 25 Holiday (Himmelfahrt)  
May 30 Recurrent neural networks  
June 1 Word embeddings  
June 6 Semester break  
June 8 Semester break  
June 13 Overflow  
June 15 Holiday (Fronleichnam)  
June 20 Literature  
June 22 Literature  
June 27 Literature  
June 29 Literature  
July 4 Guest lecture: Jianqiang Ma  
July 6 Literature  
July 11 Literature  
July 13 Literature  
July 18 Literature  
July 20 Literature  
July 25 Literature  
July 27 Literature  
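Several of the sessions above (binary perceptron, logistic regression, softmax regression) center on linear classifiers. As a taste of the perceptron sessions, here is a minimal NumPy sketch of the perceptron update rule on a toy, linearly separable problem; it is an illustration under my own naming, not the course notebook:

```python
import numpy as np

def train_perceptron(xs, ys, epochs=10):
    """Binary perceptron: for each misclassified example, nudge the
    weights towards the correct side of the decision boundary."""
    w = np.zeros(xs.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            y_hat = 1 if x @ w + b > 0 else 0
            # (y - y_hat) is -1, 0, or 1, so updates happen only on mistakes.
            w += (y - y_hat) * x
            b += (y - y_hat)
    return w, b

# Toy data: logical OR, which is linearly separable.
xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
ys = np.array([0, 1, 1, 1])
w, b = train_perceptron(xs, ys)
preds = [1 if x @ w + b > 0 else 0 for x in xs]
print(preds)  # [0, 1, 1, 1]
```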

2.1 Literature

3 Policy

3.1 Grading

During the course, there will be three homework assignments. These assignments will be clearly marked as such in the schedule. Besides that, you will have to prepare and give a presentation on a paper from the literature list. Your final grade will consist of:

Component Weight
Homework assignment 1 20%
Homework assignment 2 20%
Homework assignment 3 20%
Project 40%

Completion of this course will give you 6 CP. An additional 3 CP can be obtained by writing a paper or doing an extra programming project, after discussion with the lecturer.
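Put differently, the final grade is a weighted average of the four components. Assuming grades on a numeric scale, the computation looks like this (weights from the table above; the component grades are invented for illustration):

```python
# Weights taken from the grading table; example grades are invented.
weights = {"hw1": 0.2, "hw2": 0.2, "hw3": 0.2, "project": 0.4}
grades = {"hw1": 1.7, "hw2": 2.0, "hw3": 1.3, "project": 2.0}

final = sum(weights[c] * grades[c] for c in weights)
print(round(final, 2))  # 1.8
```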

3.2 Honor code

You are encouraged to help one another in this course. We all need a little help sometimes, and you can also learn quite a bit from helping others. However, there is a point where getting help turns into plagiarism. The solutions to the graded assignments must be your own work.

Things you are allowed to do:

  • Discuss a general approach to the solution of an assignment with your classmates.
  • Get help debugging your code, but only AFTER you have really tried it yourself.

Things you are not allowed to do:

  • Copy someone else's solution (in whole or in part).
  • Give your solution to an assignment (in whole or in part) to a classmate.
  • Get so much help on an assignment that you can no longer honestly call it your own.
  • Get outside help.

3.3 Homework submission

Please submit your homework as indicated in the homework assignment. Do not e-mail it directly to my university address.

Source files should start with the following lines as comments:

Author: Name, Matrikelnummer

Honor Code: I pledge that this program represents my own work.

Author: Daniël de Kok

Created: 2017-05-22 Mon 10:23