Branch Image is a real-time research search engine for image retrieval
and classification. Branch Image uses clustering in a space of image
features that have different weights. The weights are obtained from
subjective experiments. A number of high-level and low-level image features
have already been developed, and the research is ongoing.
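As an illustration of clustering in a weighted feature space, here is a minimal sketch, not the Branch Image implementation. It assumes a weighted Euclidean distance and a single nearest-centroid assignment step; the function names, the weights and the sample vectors are all hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def assign_to_clusters(vectors, centroids, weights):
    """Return, for each vector, the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda k: weighted_distance(v, centroids[k], weights))
        for v in vectors
    ]

# Hypothetical data: two features per image; the first feature is assumed
# to matter more to human observers, so it gets a larger weight.
weights = [4.0, 1.0]
centroids = [[0.0, 0.0], [1.0, 1.0]]
vectors = [[0.1, 0.9], [0.9, 0.1]]
labels = assign_to_clusters(vectors, centroids, weights)
print(labels)
```

With these weights the first feature dominates the assignment; with equal weights both sample vectors would be equidistant from the two centroids.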
A topic detection and tracking system based on a modified version of the CMU
TDT algorithm that takes into account some semantic and stylistic aspects.
"Galaktika-Zoom" is a text mining solution working with unstructured data.
The system includes proprietary tools for textual data repository creation
and management, automatic structuring, and data analysis tools based on
linguistic, mathematical and statistical
An experimental search engine that uses both standard and custom search
algorithms. At the seminar, we plan to test several relevance estimation
algorithms based on in-depth analysis of
IFM is an experimental system for near-duplicate image retrieval and detection.
The system is based on interest point detection methods such as Difference
of Gaussians, Laplacian of Gaussian and others.
The main idea is computation of local image features, which are robust to
changes due to
Instead of using low-level global features, the image is described by a set
of interest point feature vectors.
Thus the image comparison becomes a comparison of local interest point sets.
For this task to be solved, scalable indexing and retrieval methods are
The goal of the research is to compare and generalize existing methods.
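The idea of comparing two images as sets of local descriptors can be sketched as follows. This is a brute-force illustration under stated assumptions, not the IFM method: it assumes Euclidean distance between descriptors and a nearest-neighbour ratio test, and the names `count_matches`, `set_similarity` and the `ratio` threshold are hypothetical. A scalable system would replace the exhaustive search with an index.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def count_matches(set_a, set_b, ratio=0.8):
    """Count descriptors in set_a whose nearest neighbour in set_b is
    clearly closer than the second-nearest one (ratio test)."""
    matches = 0
    for d in set_a:
        dists = sorted(euclidean(d, e) for e in set_b)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches

def set_similarity(set_a, set_b, ratio=0.8):
    """Fraction of descriptors in set_a with a confident match in set_b."""
    if not set_a:
        return 0.0
    return count_matches(set_a, set_b, ratio) / len(set_a)

# Hypothetical 2-D descriptors; real interest point descriptors are much longer.
image_a = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]]
image_b = [[0.1, 0.0], [10.0, 10.1], [20.0, 20.0]]
print(count_matches(image_a, image_b))
```

The third descriptor of `image_a` lies almost midway between two candidates in `image_b`, so the ratio test rejects it as ambiguous.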
ImSim is a near-duplicate detection system developed by HPLabs.
Our approach involves a descriptor extraction step for every image, followed
by a hashing algorithm. Image descriptors are calculated from texture and
color features for regions of interest (ROI) built around key points. The SIFT
algorithm is used to find key points in the picture. The texture of an ROI is
described by smoothed and normalized gray-scale intensity levels. A color
histogram is used as the color feature.
Matching of image descriptors is performed using the Locality Sensitive
Hashing (LSH) approach. Every descriptor is mapped to a hash: the closer the
descriptors are to each other in cosine distance, the higher the probability
that their hashes are identical. The descriptors we use are designed to work
well with the cosine similarity measure.
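One standard LSH family for cosine similarity is random hyperplane hashing; the description above does not say which family ImSim uses, so the following is only a sketch under that assumption. Each bit records on which side of a random hyperplane the descriptor lies, and two vectors at angle θ agree on any single bit with probability 1 − θ/π. The names `make_hyperplanes` and `lsh_hash`, the dimension and the bit count are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_hash(vector, hyperplanes):
    """Build an integer hash: one bit per hyperplane, set to 1 when the
    vector lies on the non-negative side of that hyperplane."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits

planes = make_hyperplanes(dim=4, n_bits=8)
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]      # same direction as a: cosine similarity 1
c = [-1.0, -2.0, -3.0, -4.0]  # opposite direction: cosine similarity -1

print(lsh_hash(a, planes) == lsh_hash(b, planes))
print(lsh_hash(a, planes) == lsh_hash(c, planes))
```

Because only the sign of each dot product is kept, the hash ignores vector length and depends on direction alone, which is what makes it suitable for cosine similarity.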
The main idea of the proposed content-based image retrieval method is to
transform the source image into a special form. The core representation of an
input image is the so-called Matrix of Brightness Variation. To compare
images, a special similarity measure is introduced: a weighted pseudometric
that involves the signs of the partial derivatives of the brightness function
of the color image components. The proposed approach can be used both for
content-based image retrieval and for near-duplicate detection.
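The exact definition of the Matrix of Brightness Variation and of the measure is not given here, so the following is only a hypothetical illustration of a sign-based weighted pseudometric. It assumes a single brightness channel, forward differences as the partial derivatives, and a weighted fraction of disagreeing signs as the distance; `variation_signs`, `sign_distance` and the weights `wx`, `wy` are invented names.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def variation_signs(image):
    """Signs of horizontal and vertical forward differences of a 2-D
    brightness grid given as a list of equal-length rows."""
    rows, cols = len(image), len(image[0])
    dx = [[sign(image[r][c + 1] - image[r][c]) for c in range(cols - 1)]
          for r in range(rows)]
    dy = [[sign(image[r + 1][c] - image[r][c]) for c in range(cols)]
          for r in range(rows - 1)]
    return dx, dy

def sign_distance(img_a, img_b, wx=1.0, wy=1.0):
    """Weighted fraction of positions where the derivative signs of two
    same-sized images (at least 2x2) disagree."""
    dx_a, dy_a = variation_signs(img_a)
    dx_b, dy_b = variation_signs(img_b)
    mism_x = sum(p != q for ra, rb in zip(dx_a, dx_b) for p, q in zip(ra, rb))
    mism_y = sum(p != q for ra, rb in zip(dy_a, dy_b) for p, q in zip(ra, rb))
    total_x = sum(len(r) for r in dx_a)
    total_y = sum(len(r) for r in dy_a)
    return (wx * mism_x + wy * mism_y) / (wx * total_x + wy * total_y)

# Hypothetical 3x3 brightness grids.
base = [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
brighter = [[v + 100 for v in row] for row in base]  # uniform brightness shift
mirrored = [row[::-1] for row in base]               # horizontal flip

print(sign_distance(base, brighter))
print(sign_distance(base, mirrored))
```

Two different images can have distance zero (here, a uniformly brightened copy), which is why such a measure is a pseudometric rather than a metric, and also why it could tolerate the global brightness changes typical of near-duplicates.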
An open-source search engine that uses a database as its repository.
A context-dependent classification and search system based on representing
the text corpus as an associative semantic network.
The RCO team focuses on research in computational linguistics and the
development of text analysis solutions for full-text databases,
data warehouses and BI systems. At the workshop we plan to run
several experiments on text categorization and news clustering tasks.
A library and a set of test utilities developed
for experiments in the areas of data compression, optimal indexing,
statistical modeling, and machine learning.
Information retrieval system, version mod.2. The system is based on
traditional algorithms combined with our own developments.
A research project aimed at creating and evaluating a recurrent
thematic Web search system.
Subject Search Sleuth (SSS)
Subject Search Sleuth (SSS) is a text search and annotation engine based on
the fast non-reconsidering full-text fuzzy pattern search algorithm
developed by Sergey Kryloff. The SSS algorithm handles cases where search
terms are absent, swapped or alternated with other terms in the answer.
Because it is based on the notion of a Q-Term (instead of words, their
canonical forms or stems), SSS is very flexible in supporting multiple
languages. The current version supports 40 languages, including some Asian
languages, Arabic, Indonesian and Hebrew.