ROMIP: Russian Information Retrieval Evaluation Seminar



Web Adhoc Track

Adhoc search in a Web collection


The purpose of this track is to evaluate ad hoc search methods on a Web collection. The dataset imitates Web documents and Web queries.

For this track the standard procedure is used.

Test Collection

The source dataset is the union of the BY.web and KM.RU collections.

Task Description for Participating Systems

Each participant is granted access to the BY.web and KM.RU collections and to the set of queries. The set contains 19627 queries and was formed as follows:

  • all queries from the ad hoc Web search tracks of previous ROMIP cycles
  • a selection of queries from the logs of the Yandex search engine.
    Selection procedure: start from the log of queries that returned at least one result document, remove all queries containing search operators and all "adult" queries, then select every 100th query.
  • a selection of queries from the logs of search engine
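
The selection procedure described for the Yandex log can be sketched as follows. This is only an illustration under stated assumptions: the operator pattern, the `is_adult` predicate, and the (query, result count) log layout are hypothetical, and the source does not say whether counting for "every 100th" starts before or after filtering (here it starts after).

```python
import re

# Hypothetical operator pattern: the source does not list which search
# operators were filtered, so this set is only an illustration.
OPERATOR_PATTERN = re.compile(r'["+|~()]|\b(?:site|url|title):', re.IGNORECASE)


def sample_queries(log, is_adult, step=100):
    """Yield every `step`-th query that survives the filters.

    `log` is an iterable of (query, result_count) pairs in log order.
    `is_adult` is a caller-supplied predicate; no classifier is implied.
    """
    kept = 0
    for query, result_count in log:
        if result_count < 1:
            continue  # keep only queries with at least one result document
        if OPERATOR_PATTERN.search(query):
            continue  # drop queries that use search operators
        if is_adult(query):
            continue  # drop "adult" queries
        kept += 1
        if kept % step == 0:
            yield query
```

For example, `list(sample_queries(log, is_adult))` would return the 100th, 200th, 300th, ... surviving queries in log order.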

The expected result for each query is an ordered list of document URLs. The maximum list size is 100 per query.
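
A participant might enforce the list-size limit with a small helper like the one below. It is a minimal sketch: the function name is hypothetical, and removing duplicate URLs is an assumption of this example, not a stated ROMIP requirement.

```python
MAX_RESULTS_PER_QUERY = 100  # limit stated in the task description


def prepare_run(ranked_urls, limit=MAX_RESULTS_PER_QUERY):
    """Return the ranked URLs, duplicates removed, cut to `limit` entries.

    dict.fromkeys keeps the first (highest-ranked) occurrence of each URL
    and preserves the original ranking order.
    """
    unique_in_order = list(dict.fromkeys(ranked_urls))
    return unique_in_order[:limit]
```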

If a participant cannot process both collections, the results may be based on a search in only one of them (i.e. either KM.RU or BY.web).

Evaluation Methodology

  • instructions for assessors:
    Assessors evaluate the relevance of a document to the query based on the extended description of the user's information need.
  • evaluation method: pooling (pool depth is 50)
  • relevance scale:
    • yes / probably yes / perhaps yes / no / impossible to evaluate
    • yes / no / impossible to evaluate
  • official metrics:
    • precision
    • recall
    • TREC 11-point precision/recall graph
    • bpref
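
The pooling step and the listed metrics can be illustrated with the sketch below. It rests on several assumptions: judgments have already been reduced to a binary relevant / non-relevant split (how the graded scale is collapsed is not specified here), documents marked "impossible to evaluate" are treated as unjudged, and bpref follows the formulation used in trec_eval, which may differ from the exact variant used by ROMIP. All function names are hypothetical.

```python
def build_pool(runs, depth=50):
    """Union of the top `depth` URLs from every submitted run for one query."""
    pool = set()
    for ranked_urls in runs:
        pool.update(ranked_urls[:depth])
    return pool


def precision_recall(ranked_urls, relevant):
    """Set-based precision and recall of one result list."""
    hits = sum(1 for url in ranked_urls if url in relevant)
    precision = hits / len(ranked_urls) if ranked_urls else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def eleven_point_precision(ranked_urls, relevant):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    points = []  # (recall, precision) measured at each relevant hit
    hits = 0
    for rank, url in enumerate(ranked_urls, start=1):
        if url in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    levels = [i / 10 for i in range(11)]
    # Interpolation: best precision at any recall >= the given level.
    return [max((p for r, p in points if r >= level), default=0.0)
            for level in levels]


def bpref(ranked_urls, relevant, nonrelevant):
    """bpref over judged documents only (trec_eval-style normalization)."""
    num_rel, num_nonrel = len(relevant), len(nonrelevant)
    if num_rel == 0:
        return 0.0
    nonrel_seen = 0
    total = 0.0
    for url in ranked_urls:
        if url in nonrelevant:
            nonrel_seen += 1
        elif url in relevant:
            if nonrel_seen == 0:
                total += 1.0
            else:
                total += 1.0 - min(nonrel_seen, num_rel) / min(num_rel, num_nonrel)
        # Unjudged documents are ignored entirely.
    return total / num_rel
```

Only documents in the pool are shown to assessors, so with a pool depth of 50 and result lists of up to 100 URLs, some returned documents remain unjudged. bpref is designed for that situation because it looks only at judged documents.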

Data Formats