Here is detailed information about the sixth cycle of ROMIP:
Results and ROMIP'2009 participants' reports are available in the Publications section.
- February 5, 2009
Official start of ROMIP'2009. The call for participation
- September 21, 2009
The ROMIP'2009 workshop took place in Petrozavodsk on September 16, 2009. It was co-located with RCDL'2009.
The agenda included 15 reports from participants and a round-table discussion about ROMIP's future.
- Mikhail Ageev (Moscow State University, Russia)
- Alexander Antonov (Galaktika-Zoom, Moscow, Russia)
- Pavel Braslavski (Yandex, Ekaterinburg, Russia)
- Maxim Gubin (Facebook, USA)
- Boris Dobrov (UIS RUSSIA, Moscow, Russia)
- Mikhail Kostin (Mail.Ru, Moscow, Russia)
- Igor Kuralenok (St. Petersburg State University, Russia)
- Igor Nekrestyanov (Oracle Corporation, Russia)
- Marina Nekrestyanova (RedAril, St. Petersburg, Russia)
- Vladimir Pleshko (RCO, Moscow, Russia)
- Ilya Segalovich (Yandex, Moscow, Russia)
- Vlad Shabanov (Vertical Search, Moscow, Russia)
- Natalia Vassilieva (HP Labs, St. Petersburg, Russia)
The ARE (Anchors and RElations) system extracts information from free text.
First, ARE extracts cue phrases (anchors); second, it evaluates the
relations between the anchors.
EventSupervisor is an experimental system for structuring a web news
stream. The basic idea of the system is statistical classification of
documents using features inherent both in the web news stream and in the
news items themselves.
"Galaktika-Zoom" is a text mining solution working with unstructured
The system includes proprietary tools for textual data repository
creation and management, full-text search, automatic structuring, and
data analysis tools based on linguistic, mathematical and statistical
An exploratory search engine based on a mix of classic algorithms and our own
original ones. We plan to test new ranking formulae during the seminar.
IFM2 is an experimental system for near-duplicate image detection. The system
combines bio-inspired computational attention principles with
methods. The main idea is to compute local image regions that are salient
under the attention model. These regions are represented with standard
scale-invariant descriptors such as SIFT, PCA-SIFT and SURF. The image is
described by the set of feature vectors of its salient regions.
Image comparison thus becomes a comparison of sets of local interest points.
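
As an illustration of comparing two sets of local descriptors, here is a minimal sketch. It is not the IFM2 implementation: it assumes descriptors are plain numeric vectors and uses a nearest-neighbour ratio test, a common matching heuristic for SIFT-like descriptors. The function name `set_similarity`, the `ratio` parameter and the toy 2-D vectors are all hypothetical; real SIFT or SURF descriptors have 64 or 128 dimensions.

```python
import math


def set_similarity(set_a, set_b, ratio=0.8):
    """Fraction of descriptors in set_a with a distinctive match in set_b.

    A descriptor counts as matched when its nearest neighbour in set_b is
    clearly closer than the second nearest one (ratio test).
    """
    if not set_a or len(set_b) < 2:
        return 0.0
    matched = 0
    for desc in set_a:
        dists = sorted(math.dist(desc, other) for other in set_b)
        if dists[0] < ratio * dists[1]:
            matched += 1
    return matched / len(set_a)


# Toy descriptor sets (assumed data, for illustration only).
image_a = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
near_duplicate = [(0.1, 0.0), (10.0, 0.2), (30.0, 30.0)]
unrelated = [(50.0, 50.0), (60.0, 60.0), (51.0, 50.0)]

score_dup = set_similarity(image_a, near_duplicate)
score_other = set_similarity(image_a, unrelated)
```

In this toy data the near-duplicate set scores higher than the unrelated one; a threshold on the score would then decide whether two images are treated as near-duplicates.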
KGCDA is a context-dependent annotation system based on a multifactor
model for scoring text fragments, with parameters optimised on a
training sample of documents.
For the CBIR track, we propose a solution to a modified task:
build and save textual annotations for all of the images in the task
and then search among the resulting annotations. For annotation we use
probabilistic methods. For the near-duplicate detection task, we
suggest an improvement of the method based on a multiscale
representation of the image. The idea is to analyze the signs of the
image gradient at several scales.
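
The gradient-sign idea can be sketched as follows. This is an assumed reading of the description, not the participants' actual method: the image is a grayscale grid of numbers, each coarser scale is obtained by averaging 2x2 blocks, and two images are compared by the fraction of positions where the gradient signs agree. All function names and the `scales` parameter are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def gradient_signs(img):
    """Signs of horizontal and vertical forward differences at one scale."""
    signs = []
    for y in range(len(img) - 1):
        for x in range(len(img[0]) - 1):
            signs.append(sign(img[y][x + 1] - img[y][x]))
            signs.append(sign(img[y + 1][x] - img[y][x]))
    return signs


def signature(img, scales=3):
    """Concatenate gradient signs over up to `scales` resolutions."""
    sig = []
    for _ in range(scales):
        sig.extend(gradient_signs(img))
        if len(img) < 4 or len(img[0]) < 4:
            break  # too small to downsample again
        img = downsample(img)
    return sig


def similarity(img_a, img_b):
    """Fraction of agreeing gradient signs; images must have equal size."""
    sig_a, sig_b = signature(img_a), signature(img_b)
    if not sig_a or len(sig_a) != len(sig_b):
        raise ValueError("images must be non-trivial and of equal size")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Toy 8x8 images (assumed data): a ramp, a brighter copy, a mirrored copy.
ramp = [[10 * x + y for x in range(8)] for y in range(8)]
brighter = [[v + 40 for v in row] for row in ramp]
mirrored = [list(reversed(row)) for row in ramp]

same = similarity(ramp, brighter)
flipped = similarity(ramp, mirrored)
```

Because only signs are kept, a uniform brightness shift leaves the signature unchanged, which is one reason such a representation may suit near-duplicate detection.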
mnoGoSearch is free open-source software for Unix-style operating
systems that provides search over a Web site or a group of sites.
mnoGoSearch is built on inverted index technology
and uses TF*IDF weights when ranking documents,
taking into account various additional parameters such
as word distance, section breakdown, stemmed word forms,
synonyms, and others.
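
To make the inverted index and TF*IDF ranking concrete, here is a minimal sketch. It is not mnoGoSearch code and ignores the additional parameters mentioned above (word distance, sections, stemming, synonyms); the function names, the whitespace tokenizer and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by the summed TF*IDF weight of the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term occurs in no document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Sample collection (assumed data).
docs = {
    "a": "search engine for a web site",
    "b": "web site search and web site indexing",
    "c": "image retrieval",
}
index = build_index(docs)
results = search(index, len(docs), "web indexing")
ranked_ids = [doc_id for doc_id, _ in results]
```

Here document "b" ranks above "a" because it contains "web" twice and is the only document containing the rarer term "indexing"; document "c" matches no query term and is not returned.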
The RCO team focuses on research in computational linguistics and on the
development of text analysis solutions for full-text databases,
data warehouses and BI systems. In the workshop we plan to run
several experiments on text categorization and document retrieval tasks.
Information retrieval system, version mod.2.5. The system is based on
traditional algorithms combined with our own developments.
Subject Search Sleuth (SSS)
Subject Search Sleuth (SSS) is a text search and annotation engine based on
the fast non-reconsidering full-text fuzzy pattern search algorithm
developed by Sergey Kryloff. The SSS algorithm supports cases when search
terms are absent, swapped or alternated with other terms in the answer.
Because it is based on the notion of a Q-Term (rather than a word, its
canonical form, or its stem), SSS is very flexible in supporting multiple
languages. The current version supports 40 languages, including Asian ones,
Arabic, Indonesian and Hebrew.