ROMIP: Russian Information Retrieval Evaluation Seminar


Legal Classification Track

Overview

The purpose of this track is to evaluate methods of document classification on a collection of legal documents.

This track uses the standard procedure.

Test Collection

The source dataset is the legal document collection (2004).

Task Description for Participating Systems

Each participant is granted access to the training set and to a set of documents from the collection. The task is to assign topics from the training set to each document from the collection. Each document may be assigned from 0 to 5 topics.

The training set is a subset of the categories based on the Kodeks catalog. The training set documents are located in the legal_training.* archives.

Participants must classify every document from the collection: for each document, a list of topics must be specified, sorted in descending order of confidence.
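
The output constraints above (0 to 5 topics per document, most confident first) can be illustrated with a minimal Python sketch. The function name, the score dictionary, and the confidence threshold are hypothetical; the track does not prescribe how confidence is computed or cut off.

```python
from typing import Dict, List

MAX_TOPICS = 5  # upper bound stated in the task description


def select_topics(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return at most MAX_TOPICS topic ids, most confident first.

    `scores` maps a topic id to a classifier confidence. The threshold is an
    assumed cut-off for illustration, not part of the track rules.
    """
    confident = [(topic, s) for topic, s in scores.items() if s >= threshold]
    # Sort by descending confidence; ties are broken by topic id for determinism.
    confident.sort(key=lambda pair: (-pair[1], pair[0]))
    return [topic for topic, _ in confident[:MAX_TOPICS]]
```

A document whose scores all fall below the threshold gets an empty list, which is a valid answer because zero topics is allowed.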

Evaluation Methodology

  • A random subset of the categories is selected. The full Kodeks catalog, which was manually verified by experts, is used to evaluate the selected subset.
  • Official metrics:
    • precision
    • recall
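
As a rough illustration of the two metrics, the sketch below computes precision and recall for a single category from sets of document ids. The function name and the convention of returning 0.0 when a set is empty are assumptions; how ROMIP aggregates the values across the selected categories is not stated here.

```python
from typing import Set, Tuple


def precision_recall(assigned: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall for one category.

    `assigned` holds the documents a system put into the category;
    `relevant` holds the documents the reference catalog places there.
    Returning 0.0 for an empty set is an assumed convention.
    """
    true_positives = len(assigned & relevant)
    precision = true_positives / len(assigned) if assigned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if a system assigns three documents to a category and two of them are among the four documents the catalog lists for it, precision is 2/3 and recall is 1/2.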

Data Formats