DRI Discourse Structure In Dialogue

Homework 2 Verbmobil materials

General Information

The second homework consists of doing CGU/IU analyses for one VERBMOBIL and one MAPTASK dialogue. However, the MAPTASK dialogue will have to be distributed at a later date. We apologize for being 4 days late with the distribution. Since we only have one dialogue to code for now, we'd like to have the VERBMOBIL codings due April 27, 1998 as scheduled. (We will allow extra time for MAPTASK.) Email chn@research.att.com about any scheduling difficulties.

A few important notes:

- Please email all HW2-related codings, questions, criticisms, etc. to both traum@cs.umd.edu and chn@research.att.com.

- FYI, the VERBMOBIL and MAPTASK dialogues are being coded by all three subgroups (coreference, speech act/utterance labels ("backward/forward-looking functions"), discourse structure). The purpose is to facilitate inter-group discussion and to try to relate the various subgroup efforts.

- For HW2, the coding procedure will be slightly modified: please do the CGU analysis first. Then, in contrast to HW1, you will use a different file of pre-ordained CGUs (verbmobil.fixed.cgus) as the basis for IU analysis. To repeat: do *NOT* use your CGU coding for IU analysis. The reason is that starting from fixed CGU units makes the IU codings on the HW2 dialogues more directly comparable, so that IU coding reliability can be analyzed. ***Of course, you should not look at the verbmobil.fixed.cgus file until you are done with your own CGU analysis and are ready to proceed with IU analysis***. [We would appreciate hearing any methodological reflections on this. Thanks to the many who volunteered ideas in the first round.]

- ABOUT VERBMOBIL: [written by Mark Core, B/F group] "The Verbmobil project is a long term effort to develop a mobile translation system for spontaneous speech in face-to-face situations. The current domain of focus is scheduling business meetings. To support this goal, some English human-human dialogs were collected in this domain. Dialog r148c is one of these dialogs.
In r148c, the two speakers try to establish a time and place for a meeting. The speakers are affiliated with a college of some kind, as one speaker asks if they want to meet on campus or not, and they mention a building called "Cyert Hall". The sound files provided correspond to turns (a continuous piece of text with the same speaker), so when playing an utterance's speech you will hear the whole turn."

=================================
EXTRA NOTES ON VERBMOBIL DIALOGUE
[from Norbert Reithinger]
=================================

"These dialogues were recorded in the USA, so I do not know the exact conditions. The standard data collection setup in Verbmobil phase 1, where these data were recorded, is as follows: two persons in one room, looking at each other, having calendars marked with dates, instructions about possible dates, and the topics they should mention. They have no [push-to-talk] button and were asked to avoid crosstalk. They did not rehearse, but sometimes speakers recorded more than one dialogue."

Three VERBMOBIL files should be sent to chn@research.att.com *and* traum@cs.umd.edu (tarred and uuencoded):

Files to turn in:
1. verbmobil.cgu.<coder's_login>
2. verbmobil.iu.<coder's_login>
3. misc.<coder's_login>

[How to email uuencoded tar files in UNIX:
> tar -cvf <login>.hw2.tar <file1> <file2> <file3>
> uuencode <login>.hw2.tar <login>.hw2.tar >tmp
> Mail chn@research.att.com,traum@cs.umd.edu <tmp
]

Send any and all questions to both chn@research.att.com and traum@cs.umd.edu. Please do *NOT* mail questions on coding choices for the homework dialogues to the whole group, so as not to bias other coders.

We are reviewing the HW1 data, comments and criticisms and will report back as soon as we can. We would like to post everyone's codings and misc comments in some form before May 17. If you would like to edit yours in any way, or otherwise object, please email chn@research.att.com.

Happy tagging!

Christine and David
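
A note on the packaging recipe above: it can be wrapped in two small shell functions, one that builds the uuencoded bundle and one that decodes it again in a scratch directory to confirm it survives the round trip before mailing. This is only a sketch. The function names pack_hw2 and check_hw2, the output name <login>.hw2.uu, and the login "jdoe" used below are illustrative assumptions, not part of the official instructions; it also assumes uuencode, uudecode and mktemp are installed.

```shell
#!/bin/sh
# Hypothetical helper: bundle the three HW2 files and uuencode the bundle.
# $1 is the coder's login; the file names follow the "Files to turn in" list.
pack_hw2() {
    login=$1
    tar -cf "$login.hw2.tar" \
        "verbmobil.cgu.$login" "verbmobil.iu.$login" "misc.$login" || return 1
    # First argument: file to read. Second: name recorded in the "begin" line.
    uuencode "$login.hw2.tar" "$login.hw2.tar" > "$login.hw2.uu"
}

# Hypothetical check: decode the bundle in a scratch directory, compare the
# decoded tar byte-for-byte with the original, and list its contents.
check_hw2() {
    login=$1
    here=$(pwd)
    scratch=$(mktemp -d) || return 1
    # uudecode writes to the file name stored in the "begin" line,
    # relative to the current directory (here: the scratch directory).
    ( cd "$scratch" && uudecode "$here/$login.hw2.uu" ) || return 1
    cmp "$login.hw2.tar" "$scratch/$login.hw2.tar" &&
        tar -tf "$scratch/$login.hw2.tar"
    status=$?
    rm -rf "$scratch"
    return $status
}
```

With the three files in the current directory, a coder with the (made-up) login jdoe could run "pack_hw2 jdoe && check_hw2 jdoe" and then mail jdoe.hw2.uu with the Mail command shown earlier. The check lists the three file names if the bundle decodes cleanly.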