The fourth type of connection that can be explored using eArchivarius is the relationship between individual messages. This approach resembles traditional information retrieval: we show the system one message and ask it for “more information like that.” This figure shows the 50 most similar messages. The presentation is similar to the social network visualization. Here the spheres correspond to individual email messages, and the columns contain the subject lines of those messages. The spheres are positioned so that the distances between them reflect the inter-message similarities. That similarity can be defined in several ways. For example, it can be based on the words, or content, of the messages, as is done in the Lighthouse system. In our example the similarity is based on how closely the audiences of two messages match, that is, how much the sets of people that appear on each message overlap. Thus, if two messages were circulated among the same group of people, the corresponding spheres will be located near each other.
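The audience-based similarity described above can be illustrated with a short sketch. The exact formula used by eArchivarius is not given here, so the code assumes a Jaccard coefficient over the sets of people on each message. The function names and the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def audience_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two audiences (an assumed measure, not necessarily the system's own)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(
    query_id: str, audiences: Dict[str, Set[str]], k: int = 50
) -> List[Tuple[str, float]]:
    """Rank all other messages by audience overlap with the query message."""
    query = audiences[query_id]
    scored = [
        (msg_id, audience_similarity(query, people))
        for msg_id, people in audiences.items()
        if msg_id != query_id
    ]
    # Highest similarity first; ties are broken by message id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hypothetical toy data: message id -> set of people who appear on the message.
audiences = {
    "m1": {"alice", "bob", "carol"},
    "m2": {"alice", "bob", "carol"},
    "m3": {"alice", "dave"},
    "m4": {"erin", "frank"},
}
ranking = most_similar("m1", audiences, k=3)
```

Under this assumed measure, two messages circulated among exactly the same people score 1.0 and would be drawn close together, while messages with disjoint audiences score 0.0.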
The original message (the “query”) is shown as a yellow sphere at the center of the picture. eArchivarius can assign category labels to individual objects based on the user’s input. In our example we selected one message with the subject header “CONTRA PROJECT” and highlighted it in blue. eArchivarius automatically discovered similar messages and assigned them the same category. Those messages form a tight cluster to the right of the query sphere. A brief examination of their subject lines reveals that they discuss Nicaragua and Iran. The spheres corresponding to the messages dealing with Libyan terrorism occupy the left side of the visualization and are colored green. We can easily tell that those messages have a very different audience from the Nicaragua and Iran emails.
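One simple way to picture the category assignment is as label propagation from a user-selected seed message. How eArchivarius actually discovers the similar messages is not described here. The sketch below assumes that a message inherits the seed's category when its audience overlap with the seed (again an assumed Jaccard coefficient) reaches a threshold. The threshold value, function names, and data are all hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed audience-overlap measure: shared people divided by all people."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_label(
    seed_id: str,
    label: str,
    audiences: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the system
) -> Dict[str, str]:
    """Give the seed's label to every message whose audience is close enough to the seed's."""
    seed = audiences[seed_id]
    labels = {seed_id: label}
    for msg_id, people in audiences.items():
        if msg_id != seed_id and jaccard(seed, people) >= threshold:
            labels[msg_id] = label
    return labels


# Hypothetical toy data: message id -> set of people on the message.
audiences = {
    "seed": {"p1", "p2", "p3"},
    "a": {"p1", "p2", "p3", "p4"},
    "b": {"p1", "p9"},
    "c": {"p7", "p8"},
}
labels = propagate_label("seed", "blue", audiences)
```

In this toy run only message `a` shares enough of the seed's audience to receive the label. Messages `b` and `c` stay unlabeled, much as the Libyan terrorism messages fall outside the blue cluster.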