Prof. Dr. Walt Detmar Meurers
Publications
Detecting Errors in Part-of-Speech Annotation

  

Markus Dickinson and Walt Detmar Meurers

  

Proceedings of EACL'03.

  

We propose a new method for detecting errors in "gold-standard" part-of-speech annotation. The approach locates errors with high precision based on n-grams occurring in the corpus with multiple taggings. Two further techniques, closed-class analysis and finite-state tagging guide patterns, are discussed. The success of the three approaches is illustrated for the Wall Street Journal corpus as part of the Penn Treebank.
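The core idea of the abstract, n-grams that occur in the corpus with multiple taggings, can be illustrated with a minimal sketch. This is not the variation n-gram code from the paper: the function name `variation_ngrams`, the input format (a flat list of `(word, tag)` pairs), and the toy data are assumptions made for illustration only, and the sketch omits any heuristics the paper applies to separate genuine ambiguity from annotation errors.

```python
from collections import defaultdict


def variation_ngrams(tagged_tokens, n):
    """Find word n-grams that recur with different tags.

    tagged_tokens: list of (word, tag) pairs (assumed input format).
    Returns {word n-gram: {position: set of tags}} containing only the
    positions at which the same word sequence received more than one tag.
    """
    words = [word for word, _ in tagged_tokens]
    tags = [tag for _, tag in tagged_tokens]

    # For every distinct word n-gram, collect the tags seen at each position.
    seen = defaultdict(lambda: [set() for _ in range(n)])
    for start in range(len(words) - n + 1):
        key = tuple(words[start:start + n])
        for pos in range(n):
            seen[key][pos].add(tags[start + pos])

    # Keep only n-grams where at least one position varies in its tag.
    result = {}
    for key, tagsets in seen.items():
        varying = {pos: ts for pos, ts in enumerate(tagsets) if len(ts) > 1}
        if varying:
            result[key] = varying
    return result


# Toy data (illustrative only): "off" is tagged RP once and IN once
# in otherwise identical contexts.
toy_corpus = [
    ("to", "TO"), ("ward", "VB"), ("off", "RP"), ("a", "DT"),
    ("to", "TO"), ("ward", "VB"), ("off", "IN"), ("a", "DT"),
]

for ngram, positions in sorted(variation_ngrams(toy_corpus, 3).items()):
    print(" ".join(ngram), positions)
```

Each reported n-gram is only a candidate: identical surrounding words make an inconsistent tag more likely to be an error, but whether it actually is one still has to be checked against the annotation guidelines.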

  

The variation n-gram code used in the paper is freely available (written in Python). Just send me an e-mail at the address below.

  


  

Electronically available file formats:

  • .pdf (55,116 bytes)

  


  

Bibtex entry:

  
    @InProceedings{dickinson:meurers:03,
    author =       {Markus Dickinson and W. Detmar Meurers},
    title =        {Detecting Errors in Part-of-Speech Annotation},
    booktitle =    {Proceedings of the 10th Conference of the European 
    Chapter of the Association for Computational Linguistics 
    (EACL-03)},
    pages =        {107--114},
    address =      {Budapest, Hungary},
    url =          {http://purl.org/dm/papers/dickinson-meurers-03.html},
    year =         {2003}
    }