ROMIP: Russian Information Retrieval Evaluation Seminar


Fact Extraction Track

Fact extraction from news reports.

Overview

This track is dedicated to the problems related to fact extraction from texts. In 2006 the concrete tasks were:

  • proper noun extraction
  • extraction of named entities of a given type
  • extraction of facts of a given type

Task Rules

  1. Extract all the named entities:
    For each given text, the participating system must build a list of named entities.
    For each entity, the following information must be provided:
    • a list of references to occurrences of the entity in the text (offsets and lengths in bytes)
    • (optional) the type of the entity: person/organization/place-name/other
  2. Extract facts of the following types:
    • Who worked/works in the given organization?
    • Where did/does the given person work?
    • Who is the owner of the given organization?
    • What companies did/does the given person/organization own?
    Note: Company buyers, sellers, and shareholders are also accepted as owners.

    Participants must process the whole collection without using the results of the named entity extraction.

    A fact description must include the following information:
    • the fact type
    • a reference to the text fragment containing the fact description (offset and length; the length must not exceed 500 bytes)
    • two standardized names of the objects referenced in the fact
    • a reference to the entity in the text (offset from the beginning of the text fragment)
Participants may perform only the first task; the second one is optional.
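
The rules above give positions in bytes, not characters, which matters for Russian text in a multi-byte encoding. The following Python sketch shows one possible in-memory shape for entity and fact records and a few sanity checks. It is only an illustration: the track description does not define a submission format, and the class names, field names, the fact type label, and the UTF-8 encoding used here are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ENCODING = "utf-8"        # assumption: the collection encoding is not stated in the rules
MAX_FRAGMENT_BYTES = 500  # fragment length limit taken from the task rules


@dataclass
class Mention:
    offset: int  # byte offset from the beginning of the text
    length: int  # length in bytes


@dataclass
class Entity:
    mentions: List[Mention]
    entity_type: Optional[str] = None  # person / organization / place-name / other


@dataclass
class Fact:
    fact_type: str            # hypothetical label, e.g. "works_in"
    fragment_offset: int      # byte offset of the fragment in the text
    fragment_length: int      # fragment length in bytes
    names: Tuple[str, str]    # two standardized object names
    entity_offset: int        # byte offset from the beginning of the fragment


def find_mentions(text: str, surface: str) -> List[Mention]:
    """Find every occurrence of `surface` and report byte positions."""
    data = text.encode(ENCODING)
    needle = surface.encode(ENCODING)
    mentions = []
    start = data.find(needle)
    while start != -1:
        mentions.append(Mention(start, len(needle)))
        start = data.find(needle, start + len(needle))
    return mentions


def validate_fact(fact: Fact, text: str) -> List[str]:
    """Return a list of rule violations (empty if none were found)."""
    text_len = len(text.encode(ENCODING))
    problems = []
    if fact.fragment_length > MAX_FRAGMENT_BYTES:
        problems.append("fragment longer than 500 bytes")
    if fact.fragment_offset < 0 or fact.fragment_offset + fact.fragment_length > text_len:
        problems.append("fragment outside the text")
    if not 0 <= fact.entity_offset < fact.fragment_length:
        problems.append("entity offset outside the fragment")
    if len(fact.names) != 2:
        problems.append("exactly two standardized names are required")
    return problems


text = "Иван Петров работает в Газпроме. Петров ранее работал в Лукойле."
entity = Entity(find_mentions(text, "Петров"), entity_type="person")
fact = Fact("works_in", 0, 59, ("Иван Петров", "Газпром"), 9)
```

In this example the first mention of "Петров" starts at byte 9 although it starts at character 5, because each Cyrillic letter takes two bytes in UTF-8. With a single-byte encoding the two offsets would coincide.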

Evaluation Methodology

Evaluation is carried out in two stages:
  1. Proper nouns check
    A random subset of the news reports in the collection is selected. We then evaluate how well the participating systems extract the proper nouns found in this subset.

    Instructions for assessors: Is the given string a proper noun in the context of the given text fragment? If yes, is it an organization, a person, or a place?
    Possible answers: not a proper noun, organization, person, place, other proper noun.

  2. Facts check
    A number of proper nouns are selected (the selection procedure is not yet defined and will be discussed with the participants), and fact extraction for these objects is evaluated.

    Instructions for assessors: Does the given text fragment contain a description of a fact connecting the following objects: (A, B)? If yes, which fact type is it?
    Possible answers: not a fact, purchase, selling, ownership, belonging, other.
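
As an illustration of how the assessor answers from the first stage could be tallied, here is a minimal Python sketch. The track description does not define an evaluation measure, so the share of extracted strings confirmed as proper nouns computed below is an assumption made for this example only, as are the function name and the input shape (one answer per extracted string).

```python
from collections import Counter
from typing import Iterable, Tuple

# Answer set copied from the assessor instructions for the proper nouns check.
PROPER_NOUN_ANSWERS = (
    "not a proper noun",
    "organization",
    "person",
    "place",
    "other proper noun",
)


def summarize_judgments(judgments: Iterable[str]) -> Tuple[Counter, float]:
    """Count assessor answers and return the share judged to be proper nouns.

    The share is a hypothetical, precision-like figure; it is not a measure
    defined by the track rules.
    """
    answers = list(judgments)
    unknown = set(answers) - set(PROPER_NOUN_ANSWERS)
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    counts = Counter(answers)
    total = len(answers)
    confirmed = total - counts["not a proper noun"]
    share = confirmed / total if total else 0.0
    return counts, share


counts, share = summarize_judgments(
    ["person", "organization", "not a proper noun", "place"]
)
```

The same pattern would apply to the facts check by swapping in its answer set (not a fact, purchase, selling, ownership, belonging, other).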

Summary