MEGRASP

(A syntactic parser for CHILDES transcripts)

 

Kenji Sagae

Institute for Creative Technologies, University of Southern California

 

MEGRASP is a dependency parser that identifies grammatical relations in child language transcripts in the CHILDES Database.

 

MEGRASP will soon be distributed within CLAN, a collection of utilities for editing and processing child language transcripts (including the MOR morphological analyzer and the POST part-of-speech tagger). This page contains only MEGRASP itself, which requires input files in CHAT format (PDF link) with part-of-speech tags produced by POST (or assigned manually). If this does not sound familiar, consult the main CHILDES web site.

 

Download

Warning: the part-of-speech tags used in the CHILDES database have changed since the parser was released! This means that MEGRASP will NOT WORK properly until I update the parser models. If you have the latest version of CLAN and the English lexicon for MOR, the parser will produce garbage.

If you are in a hurry, you can try replacing the file megrasp.mod in the distribution with the following file: megrasp.mod.zip (copy megrasp.mod.zip into the megrasp directory that holds the current megrasp.mod, then unzip it there so that the old file is overwritten). If this still doesn't work, or if you are unsure how to replace the file, feel free to contact me. Or just wait for the next update in the coming weeks.
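For anyone unsure about the file replacement, it can also be scripted. The following is only a minimal sketch in Python, not part of the MEGRASP distribution. The function name replace_model is made up for illustration, and it assumes that megrasp.mod.zip contains megrasp.mod at the top level of the archive. It keeps a backup copy of the old model so the change can be undone.

```python
import shutil
import zipfile
from pathlib import Path


def replace_model(megrasp_dir, zip_path, model_name="megrasp.mod"):
    """Back up the current model file, then extract the replacement over it.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    megrasp_dir = Path(megrasp_dir)
    model = megrasp_dir / model_name

    # Keep a copy of the old model (megrasp.mod.bak) before overwriting it.
    if model.exists():
        shutil.copy2(model, model.with_name(model_name + ".bak"))

    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds the model file at its top level.
        if model_name not in archive.namelist():
            raise FileNotFoundError(f"{model_name} not found in {zip_path}")
        # extract() overwrites an existing file with the same name.
        archive.extract(model_name, path=megrasp_dir)

    return model
```

It would be called with the location of your own megrasp directory and of the downloaded archive, for example replace_model("path/to/megrasp", "path/to/megrasp.mod.zip"), where both paths are placeholders.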

MEGRASP v0.7 (released June 15, 2007)

· MS-Windows

· Cygwin

· Linux

· Mac OSX

·  Updated source code coming soon (if you want source code now, you can get the code for my CoNLL-style dependency parser).

 

After downloading and unzipping the archive for your platform, please look at the README file (README.txt in the Windows distribution) for instructions on running the parser.

 

Please feel free to contact me at sagae+megrasp@cs.cmu.edu with questions, comments, requests and bug reports.

 

For more information about MEGRASP, see the following paper (please cite it in work based on MEGRASP output).

 

Sagae, K., Davis, E., Lavie, A., MacWhinney, B. and Wintner, S. 2007. High-accuracy annotation and parsing of CHILDES transcripts. Proceedings of the ACL-2007 Workshop on Cognitive Aspects of Computational Language Acquisition. Prague, Czech Republic.

 

A (slightly out-of-date) description of the grammatical relations used by MEGRASP is available here.