Task 15: TempEval Temporal Relation Identification
Organized by
Short Task Description
We specify three separate tasks that involve identifying event-time and event-event temporal relations. Only a restricted set of temporal relations will be used: BEFORE, AFTER, and OVERLAP (defined to encompass all cases where event intervals have non-empty overlap).
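To make the three-way distinction concrete, the following is a minimal sketch of how two event intervals could be mapped onto the restricted relation set. It assumes closed intervals with comparable numeric endpoints, and it assumes that intervals sharing only an endpoint count as OVERLAP; neither assumption comes from the task definition, and the names `Relation` and `relate` are hypothetical.

```python
from enum import Enum


class Relation(Enum):
    BEFORE = "BEFORE"
    AFTER = "AFTER"
    OVERLAP = "OVERLAP"


def relate(a_start, a_end, b_start, b_end):
    """Classify interval A relative to interval B.

    Assumption (not from the task description): intervals are closed,
    so intervals that share only an endpoint are treated as overlapping.
    """
    if a_start > a_end or b_start > b_end:
        raise ValueError("interval start must not exceed interval end")
    if a_end < b_start:
        return Relation.BEFORE   # A ends strictly before B starts
    if b_end < a_start:
        return Relation.AFTER    # A starts strictly after B ends
    return Relation.OVERLAP      # the intervals share at least one point


# Hypothetical usage with day offsets as endpoints
print(relate(1, 3, 5, 8).value)
print(relate(4, 6, 5, 8).value)
```

In real text the interval endpoints are rarely known explicitly, so a system would have to infer the relation from linguistic cues; the sketch only illustrates what the three labels mean.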
Long Task Description
Newspaper texts, narratives and other such texts describe events that occur in time and specify the temporal location and order of these events. Text comprehension, even at the most general level, involves the capability to identify the events described in a text and locate them in time. This capability is crucial to a wide range of NLP applications, from document summarization and question answering to machine translation. Furthermore, recent work on the annotation of events and temporal relations has resulted in both a de facto standard for expressing these relations (TimeML) and a hand-built gold standard of annotated texts (TimeBank). These have already been used as the basis for automatic time and event annotation tasks in a number of research projects in recent years.
As in many areas of NLP, an open evaluation challenge in the area of temporal annotation will serve to drive research forward. The automatic identification of all temporal referring expressions, events and temporal relations within a text is the ultimate aim of research in this area. However, addressing this aim in a first evaluation challenge is likely to be too difficult, and a staged approach is likely to be more effective. We therefore propose an initial evaluation exercise based on three limited tasks that we believe are realistic, both from the perspective of assembling resources for development and testing and from the perspective of developing systems capable of addressing the tasks.
Given a set of test texts (DataSet1) for which (1) sentence boundaries are annotated, (2) all temporal expressions are annotated in accordance with TIMEX3, (3) the document creation time (DCT) is specially annotated, and (4) a list of root forms of event identifying terms (the Event Target List or ETL) is supplied, complete the following tasks
Participants will be supplied with a version of TimeBank (183 documents, approx. 2500 sentences) in which the TimeML annotations have been removed or modified so that the documents contain only the information to be supplied in the test corpus, plus the TLINK annotations that systems are asked to identify under the task definitions.
The test corpus will consist of a number of articles not currently included in TimeBank, which will be annotated in accordance with the schemes outlined above. For tasks A and B, it is intended that this should include at least 5 occurrences of each item in the ETL. For task C, we propose to annotate around 20-25 news articles (on the order of 200-250 sentences) drawn from sources similar to those used for TimeBank.
Evaluation Methodology
Tasks A, B and C can all be seen as classification tasks, where a given temporal link is assigned a relation type from the set BEFORE, AFTER, OVERLAP, BEFORE-OR-OVERLAP, OVERLAP-OR-AFTER or VAGUE. Precision and recall over these relation types are used as evaluation metrics.
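As an illustration of the metrics, the sketch below computes per-relation precision and recall from a gold labelling and a system labelling of temporal links. It assumes exact-match scoring, in which a disjunctive label such as BEFORE-OR-OVERLAP earns credit only when the gold label is identical; the official scorer may treat partial matches differently. The function name `precision_recall` and the link identifiers are hypothetical.

```python
from collections import Counter

LABELS = (
    "BEFORE", "AFTER", "OVERLAP",
    "BEFORE-OR-OVERLAP", "OVERLAP-OR-AFTER", "VAGUE",
)


def precision_recall(gold, predicted):
    """Return {label: (precision, recall)} under exact-match scoring.

    gold, predicted: dicts mapping a link id to a relation label.
    A system may leave links unlabelled, which lowers recall only.
    Assumption: a label that is never predicted (or never occurs in
    the gold data) gets a score of 0.0 instead of being undefined.
    """
    gold_count = Counter(gold.values())
    predicted_count = Counter(predicted.values())
    correct = Counter()
    for link_id, label in predicted.items():
        if gold.get(link_id) == label:
            correct[label] += 1

    scores = {}
    for label in LABELS:
        p = correct[label] / predicted_count[label] if predicted_count[label] else 0.0
        r = correct[label] / gold_count[label] if gold_count[label] else 0.0
        scores[label] = (p, r)
    return scores


# Hypothetical example: four gold links, three of them labelled by the system
gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP", "l4": "BEFORE"}
predicted = {"l1": "BEFORE", "l2": "BEFORE", "l3": "OVERLAP"}
for label, (p, r) in precision_recall(gold, predicted).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Keeping the counts per label makes it easy to see which relation types a system confuses, which a single accuracy figure would hide.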
Last modified January 12th, 2007