SIGIR 2004 Workshop
New Directions For IR Evaluation: Online Conversations
Sheffield, UK, July 29 2004

Background and Theme

An online conversation is not very different from the conversations people have been having for thousands of years. Topics are introduced, ideas are shared, and sometimes enlightenment is forthcoming. The difference is that these conversations are sometimes recorded and archived for future use. The knowledge that is created and shared in conversations can therefore be preserved and later accessed by people who did not participate in the original discussion.

Another difference is that many non-verbal cues that we are used to interpreting in face-to-face conversations are missing. Contextual information about where conversation partners are located and what they are doing is also reduced. Consequently, the knowledge that we exchange via online textual conversations is primarily explicit; tacit knowledge tends to be thin if present at all.

Examples of "on-line conversations" include personal electronic mail, mailing lists, instant messaging (IM), Short Message Service (SMS) notes, chat rooms, Usenet newsgroups, threaded Web-based discussion lists, and massively multiplayer on-line role-playing games (MMORPGs). Conversational content poses a number of interesting challenges to systems designed to support access, including exploitation of discourse and dialog structure (e.g., to support thread-based access), the prevalence of informal language and emergent sub-languages, and the importance of establishing adequate context to interpret retrieved materials.

Online conversations have three distinct properties that set them apart from the traditional document-based dissemination of information and provide us with a fascinating opportunity to study the immediate connections between people, their actions, behavior, and language:

  • Authorship: Every part of a conversation has a unique, and often identifiable, author. Conversations link different people together, and people link different conversations together. If a single conversation can be represented as a graph of message exchanges, then a collection of conversations creates a meta-network on top of the multiple text fragments that could be explored and exploited.

  • Interactivity: Imagine a Web-based search engine that always returns the most relevant Web page at the top of the ranked list. How should we study what happens next? Online conversations may provide more insight into the evolutionary nature of information seeking. The initiation and existence of an on-line conversation are sometimes tightly linked to real-life information needs. Responses that are immediate and on-topic, provided by other participants in the conversation, may help us to better understand how the information need evolves with each response and how to determine when that need is satisfied.

  • Outcome: A conversation may result in a transfer of information or in actions taken by the participants after the conversation. In some cases (e.g., email) the outcome of a conversation might be indirectly traced from follow-up discussions; in other situations (e.g., on-line games) the result of the discussion will be readily available from system logs.
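
The Authorship property above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the message tuples, the sample data, and the function names (reply_graph, authors_by_conversation, conversation_links) are hypothetical and are not taken from any existing collection or toolkit. It represents each conversation as a graph of reply edges and then links conversations that share an author, which is one simple reading of the "meta-network" idea.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: (conversation_id, message_id, author, reply_to).
# reply_to is None for the message that starts a conversation.
MESSAGES = [
    ("c1", "m1", "alice", None),
    ("c1", "m2", "bob", "m1"),
    ("c1", "m3", "alice", "m2"),
    ("c2", "m4", "bob", None),
    ("c2", "m5", "carol", "m4"),
    ("c3", "m6", "dave", None),
]


def reply_graph(messages):
    """Map each conversation to its list of (reply, parent) message edges."""
    edges = defaultdict(list)
    for conv, msg, _author, parent in messages:
        if parent is not None:
            edges[conv].append((msg, parent))
    return dict(edges)


def authors_by_conversation(messages):
    """Map each conversation to the set of people who wrote in it."""
    authors = defaultdict(set)
    for conv, _msg, author, _parent in messages:
        authors[conv].add(author)
    return dict(authors)


def conversation_links(messages):
    """Link two conversations when at least one author took part in both."""
    authors = authors_by_conversation(messages)
    links = {}
    for a, b in combinations(sorted(authors), 2):
        shared = authors[a] & authors[b]
        if shared:
            links[(a, b)] = shared
    return links


if __name__ == "__main__":
    print(reply_graph(MESSAGES))
    print(conversation_links(MESSAGES))
```

In this toy data, conversations c1 and c2 are linked because the same person wrote in both, while c3 stays isolated. A real collection would also have to deal with aliases and unidentifiable authors, which this sketch ignores.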

Goal

Our goal for this workshop is to focus on the domain of on-line conversations and to bring together researchers from information retrieval and related research communities (e.g., recommender systems, text data mining, computer-supported cooperative work, and online communities) to see whether there is sufficient interest in the IR community to study the genre. We plan to organize this workshop around two key questions:

  1. What are the unique information seeking tasks that exist in the domain of online conversations?
  2. What opportunities exist to foster important new research through the creation of test collections for genres that have not previously been available?

One possible set of dimensions of online conversations that can be explored is the following:

  • Direction: One-way vs. two-way vs. group discussion
  • Timing: Synchronous vs. asynchronous
  • Channel: Peer-to-peer (P2P) vs. client-server
  • Context: Social vs. organizational vs. individual
  • Content: Structured vs. free-form

We hope that this workshop will result in the development of at least one specific proposal for the creation of a new track at TREC or some similar venue.

Our goal for this workshop is to focus on the domain of on-line conversations and to bring together researchers from information retrieval and related research communities (e.g., recommender systems, text data mining, computer-supported cooperative work, and online communities) to see whether there is sufficient interest in the IR community to study the genre, and to propose ways in which such a study could be facilitated through the creation of standard test collections.