ICT Workshop on Dialogue Research

Research Talk Abstracts


Day One: May 15, 2008




DICO: Managing cognitive load in in-vehicle dialogue

Staffan Larsson, Gothenburg University


The overall purpose of the DICO project is to demonstrate how state-of-the-art spoken language dialogue systems can enable access to communication, entertainment and information services as well as to environment control in vehicles. The use of dialogue systems in vehicles raises the problem of making sure that the dialogue does not distract the driver from the primary task of driving. Of course, flexible dialogue management is one way of decreasing cognitive load; in DICO, we use the GoDiS dialogue manager (Larsson 2002). Concerning in-vehicle dialogue more specifically, earlier studies have indicated that humans are very adept at adapting the dialogue to the traffic situation and the cognitive load of the driver. A goal in DICO is therefore to investigate strategies for managing cognitive load in in-vehicle dialogue. Results of these investigations will be used as a basis for the development of dialogue strategies in future versions of the system.




Using Degrees of Grounding for Dialogue Management

Antonio Roque, Graduate Research Assistant, USC



The Degrees of Grounding model defines the extent to which material being discussed in a dialogue has been grounded.  This model has been developed and evaluated by a corpus analysis, and includes a set of types of evidence of understanding, a set of degrees of groundedness, a set of grounding criteria, and methods for identifying each of these.  I describe how this model can be used for dialogue management.



A Computational Model of the Collaborative Use of Natural Language despite Private Uncertainty in Dialogue

David Devault, PhD Student, Rutgers University; Visiting Research Assistant, USC



This dissertation is part of a project to develop a detailed and robust computational model of natural language dialogue that clarifies and reconciles the linguistic and collaborative reasoning that interlocutors need to perform in conversation.  This dissertation makes three primary contributions to this project.


The first contribution is a new lightweight theoretical model of natural language dialogue as a collaboration between interlocutors. The new model weakens strict assumptions about the alignment of interlocutor mental states in conversation that have traditionally been seen as essential to capturing the collaborative aspects of a speaker's linguistic choices (Clark and Marshall, 1981). By eliminating strict assumptions about how interlocutors reason about each other's mental states, our model significantly clarifies the methodological target of collaborative language use in implemented dialogue systems that face real-world uncertainties.


The second contribution is a new theoretical approach to modeling the current state of a conversation as an objective product of prior interlocutor action rather than a by-product of the interlocutors' current mental states. We illustrate this objective view of context using COREF, an implemented dialogue system that collaboratively identifies visual objects with human users. We use a set of user interactions with COREF to argue that this objective view of context offers several compelling advantages for builders of collaborative dialogue systems. It enables a coherent, intuitive, and methodologically workable notion of an agent's uncertainty about what the current context is. It allows system builders to explore more flexible reasoning and strategies to manage their agent's uncertainty in interpreting utterances. And it supports more transparent and data-oriented system-building techniques.


The third contribution more closely characterizes the uncertainty that interlocutors face in dialogue as including numerous tacit events which affect both the current state of their collaborative activity and the details of how subsequent utterances should be formulated and interpreted. In particular, domain tasks, such as COREF's object identification task, exhibit their own idiosyncratic patterns of tacit events which interlocutors recognize and exploit for efficient communication.  We develop a framework by which dialogue systems can incorporate tacit events into interpretation and thereby support implicature and accommodation while communicating collaboratively under uncertainty.


A Common Ground for Virtual Humans: Using an Ontology in a Natural Language Oriented Virtual Human Architecture

Arno Hartholt, Research Programmer, USC



When dealing with large, distributed systems that use state-of-the-art components, individual components are usually developed in parallel. As development continues, the decoupling invariably leads to a mismatch between how these components internally represent concepts and how they communicate these representations to other components: representations can get out of synch, contain localized errors, or become manageable only by a small group of experts for each module. In this paper, we describe the use of an ontology as part of a complex distributed virtual human architecture in order to enable better communication between modules while improving the overall flexibility needed to change or extend the system. We focus on the natural language understanding capabilities of this architecture and the relationship between language and concepts within the entire system in general and the ontology in particular.




Computational Models of Non-cooperative dialogue

David Traum, Research Assistant Professor, USC


This talk will outline some cases of noncooperative communication behavior and computational dialogue mechanisms that can support these kinds of behavior, including generating, understanding, and deciding on strategies of when to engage in uncooperative behaviors. Behaviors of interest include

      unilateral topic shifts or topic maintenance

      unhelpful criticism

      withholding of information or services

      lying & deception

      rejection of empathy

The decision of whether to be cooperative or not and how to behave in each case depends on a number of factors, including the standard notions of belief, desire, intention, obligation, and initiative, but also factors such as trust, solidarity, power, status, and respect.


We will present preliminary computational models of these factors and illustrate their use with examples of interactions with the characters from the SASO and TACQ domains.



Day Two: May 16, 2008


Field Testing of an Interactive Question-Answering Character

Ron Artstein, Manager of Corpus Development, USC



We tested a life-size embodied question-answering character at a convention where he responded to questions from the audience. The character's responses were then rated for coherence. The ratings, combined with speech transcripts, speech recognition results and the character's responses, allowed us to identify where the character needs to improve, namely in speech recognition and in providing off-topic responses.


Modelling and Detecting Decisions in Multi-party Human-Human Meetings

Raquel Fernandez, Stanford University



In an era where almost anything we do and say is recorded, the demand for automatic methods that process, understand and summarize information encoded in audio and video recordings of meetings is rapidly growing. Decision-making discussions constitute one of the key aspects of meeting interaction. In this talk I will present our ongoing research on modelling decisions in multi-party meetings and describe an automatic process to detect decision-making subdialogues. I will also briefly present recent results on using multimodal information for addressee detection in small-group meetings.


Tracking Dragon-Hunters with Language Models

Anton Leuski, Research Scientist, USC



We are interested in the problem of understanding the connections between human activities and the content of textual information generated in regard to those activities. Firstly, we define and motivate this problem as an important part of making sense of various life events. Secondly, we introduce the domain of massive online collaborative environments, specifically online virtual worlds, where people meet, exchange messages, and perform actions, as a rich data source for such an analysis. Finally, we outline three experimental tasks and show how statistical language modeling and text clustering techniques may allow us to explore those connections successfully.



Towards a formal treatment of corrective feedback

Robin Cooper and Staffan Larsson

University of Gothenburg



In this paper we will present some preliminary work that we have been conducting on corrective feedback as in the example:


Child: Nice bear.


Adult: Yes, it's a nice panda


The idea is to bring together our previous work on plasticity and evolution (Larsson 2007), the GF (Ranta 2004, 2007) and TrindiKit/GoDiS architectures (Larsson 2002, Traum and Larsson 2003) and work on TTR (Cooper 2003, 2006).  We will suggest constructing GF/GoDiS agents in which resources can be updated. We also sketch a formal account of the semantic plasticity involved in corrective feedback.



* GF:  http://www.cs.chalmers.se/~aarne/GF/

* TrindiKit: http://www.ling.gu.se/projekt/trindi//trindikit/

* Cooper, Robin (2003): Records and record types in semantic theory, invited paper, Workshop on Lambda-Calculus, Type Theory, and Natural Language, King's College London, 8 and 9 December 2003, /Journal of Logic and Computation/ <http://www3.oup.co.uk/logcom/>, Vol. 15 No. 2, pp. 99--112.

* Cooper, Robin (2006): Austinian truth, attitudes and type theory (previous title: Austinian truth in Martin-Löf type theory), paper presented at the workshop Barwise and Situation Semantics, Stanford, Cal., 26 June 2003, in /Research on Language and Computation/ <http://www.dcs.kcl.ac.uk/journals/rolc/>, Vol. 3 (2005), pp. 333-362, published 2006.

* Larsson, Staffan (2002): Issue-based Dialogue Management <http://www.ling.gu.se/%7Esl/Thesis>. PhD Thesis, Goteborg University.

* Larsson, Staffan (2007): Coordinating on ad-hoc semantic systems in dialogue <http://www.ling.gu.se/%7Esl/Papers/larsson-decalog.pdf>. In Artstein and Vieu (eds.): Proceedings of DECALOG - The 2007 Workshop on the Semantics and Pragmatics of Dialogue.

* Ranta, A. (2004): Grammatical Framework: A Type-Theoretical Grammar Formalism. /Journal of Functional Programming/, 14(2), pp. 145-189. Draft available as ps.gz <http://www.cs.chalmers.se/%7Eaarne/articles/gf-jfp.ps.gz>.

* Ranta, A. (2007): Modular Grammar Engineering in GF. /Research on Language and Computation/, to appear. Draft available as pdf <http://www.cs.chalmers.se/%7Eaarne/articles/multieng3.pdf>.


* Traum, David and Larsson, Staffan (2003): The Information State Approach to Dialogue Management. In Smith and Kuppevelt (eds.): Current and New Directions in Discourse & Dialogue, Kluwer Academic Publishers.



Culture-Specific Conversational Behavior

David Hererra, University of Texas at El Paso





Virtual Extras: Multiparty Dialog Simulation for Background Virtual Humans

Dusan Jan, Graduate Research Assistant, USC



In this talk I will present our framework for behavior simulation of background virtual humans involved in multiparty conversation. I will present an overview of the theories that the framework is built on and describe the three main components of the framework: conversation algorithm, movement and repositioning algorithm, and cultural model. The talk will also cover our current work on introducing task-based variability to the simulations and our plan for future improvements.





Unsupervised Methods for Creating and Evaluating Dialogue Models

Sudeep Gandhe, Graduate Research Assistant, USC


Virtual humans are being used in a number of applications, including simulation-based training, multi-player games, and museum kiosks. Natural language dialogue capabilities are an essential part of their human-like persona. These dialogue systems have a goal of being believable and generally have to operate within the bounds of their restricted domains. Most dialogue systems operate on a dialogue-act level and require extensive annotation efforts. Semantic annotation and rule authoring have long been known as bottlenecks for developing dialogue systems for new domains. In this talk, we investigate several dialogue models for virtual humans that are trained on an unannotated human-human corpus. These are inspired by information retrieval and work on the surface text level. We evaluate these in text-based and spoken interactions and also against the upper baseline of human-human dialogues.
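As a rough illustration of what a retrieval-style dialogue model working on surface text might look like, the following minimal sketch selects a response by matching a new utterance against utterances seen in an unannotated corpus. It is not the model presented in the talk: the bag-of-words cosine similarity, the function names, and the toy corpus in PAIRS are all assumptions made for this example.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercased, whitespace-tokenized term counts; no annotation required."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def select_response(utterance, pairs):
    """Return the response paired with the most similar stored utterance."""
    query = bag_of_words(utterance)
    best = max(pairs, key=lambda p: cosine(query, bag_of_words(p[0])))
    return best[1]

# Hypothetical (utterance, response) pairs standing in for a
# human-human corpus; invented for illustration only.
PAIRS = [
    ("what is your name", "my name is sam"),
    ("where are you from", "i am from a small village"),
    ("do you need medical supplies", "yes we are short on supplies"),
]

if __name__ == "__main__":
    print(select_response("what is your name please", PAIRS))
```

Because the matching uses only surface tokens, such a sketch needs no dialogue-act annotation or authored rules, which is the property the abstract emphasizes.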




Evaluating such dialogue systems is seen as a major challenge within the dialogue research community. Due to the very nature of the task, most evaluation methods need a substantial amount of human involvement. Following the tradition in machine translation, summarization and discourse coherence modeling, we introduce the idea of an evaluation understudy for dialogue coherence models. Following (Lapata 2006), we use the information ordering task as a testbed for evaluating dialogue coherence models. This talk reports findings about the reliability of the information ordering task as applied to dialogues. We find that simple n-gram co-occurrence statistics similar in spirit to BLEU (Papineni et al. 2001) correlate very well with human judgments for dialogue coherence.
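To make the idea of an n-gram co-occurrence statistic over orderings concrete, here is a minimal sketch that scores a candidate ordering of dialogue turns against the original ordering using clipped n-gram precision, the basic ingredient of BLEU. This is an assumption about how such a statistic could be computed, not the exact measure reported in the talk; the function names and the representation of turns as integer indices are hypothetical.

```python
from collections import Counter

def ngrams(seq, n):
    """All contiguous n-grams of a sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def ngram_precision(candidate, reference, n=2):
    """Fraction of candidate n-grams that also occur in the reference.

    Counts are clipped so that a repeated n-gram is credited at most as
    many times as it appears in the reference.
    """
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

if __name__ == "__main__":
    original = [0, 1, 2, 3, 4]   # turn indices in their original order
    shuffled = [0, 1, 3, 2, 4]   # a candidate reordering of the same turns
    print(ngram_precision(shuffled, original))
```

In this sketch an ordering that preserves more adjacent turn pairs from the original dialogue receives a higher score, which is the kind of automatic signal that could then be compared against human coherence judgments.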