This comprehensive library set provides a search results clustering engine that features several clustering algorithms.

Carrot2 is a document clustering workbench that organizes search results into thematic groups. It does not crawl or index websites itself, but it is a suitable tool for testing clustering algorithms on Web search results or on data that you provide.

Powerful organizer and search engine

Carrot2 Workbench is written in Java and lets you run searches and organize the results into topics. It does this automatically with its built-in algorithms, without relying on external knowledge such as taxonomies or pre-classified content.
The program offers two document clustering algorithms, Suffix Tree Clustering (STC) and Lingo, for clustering search results. It can also fetch data from search engines that expose an API, such as Microsoft Bing or PubMed.
Lucene, Apache Solr and ElasticSearch are also supported as document sources, and they can stand in for the crawler and indexer that Carrot2 itself does not provide.
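
To give a feel for what phrase-based clustering of search results means, here is a minimal Java sketch. It is not Carrot2's STC or Lingo implementation and uses no Carrot2 classes: it only groups snippets that share a two-word phrase, which loosely resembles the first step of a phrase-based method. The class name PhraseGrouping, the method baseClusters and the minDocs parameter are hypothetical names chosen for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy phrase-based grouping; NOT Carrot2's STC or Lingo implementation. */
class PhraseGrouping {

    /**
     * Maps each two-word phrase to the indices of the snippets that contain it,
     * keeping only phrases shared by at least minDocs snippets.
     */
    static Map<String, Set<Integer>> baseClusters(List<String> snippets, int minDocs) {
        Map<String, Set<Integer>> byPhrase = new LinkedHashMap<>();
        for (int i = 0; i < snippets.size(); i++) {
            // Lowercase and split on anything that is not a letter or digit.
            String[] words = snippets.get(i).toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
            for (int w = 0; w + 1 < words.length; w++) {
                if (words[w].isEmpty()) {
                    continue; // a leading separator yields one empty token
                }
                String phrase = words[w] + " " + words[w + 1];
                byPhrase.computeIfAbsent(phrase, k -> new TreeSet<>()).add(i);
            }
        }
        // Drop phrases that do not link enough snippets together.
        byPhrase.values().removeIf(docs -> docs.size() < minDocs);
        return byPhrase;
    }

    public static void main(String[] args) {
        List<String> snippets = List.of(
            "Data mining techniques for web search",
            "Introduction to data mining",
            "Web search engines explained",
            "Cooking with carrots");
        // Prints:
        //   data mining -> [0, 1]
        //   web search -> [0, 2]
        baseClusters(snippets, 2).forEach((phrase, docs) ->
            System.out.println(phrase + " -> " + docs));
    }
}
```

The shared phrase doubles as a readable group label, which is the main appeal of phrase-based approaches for search results. A real algorithm would go further, for example by merging overlapping groups and ranking them.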

Easily control and preview the results of your search

Carrot2 does not include its own crawler or indexer, but it works with several projects that provide these features. For example, Nutch can be used for website crawling, and you can index and search your content with Lucene or Solr.
In other words, the program cannot crawl a website itself, but it can add search results clustering to an existing search engine.
Carrot2 Workbench can cluster your search results automatically, but it also allows you to configure the clustering settings manually and fine-tune the process.

Integrate search results clustering into other applications

Carrot2 works as a standalone program, but it can also be integrated into other Java-based applications to add clustering functions to them. Moreover, you can extend its functionality by pairing it with the supported crawling or indexing tools. Additionally, it can cluster a large number of documents, each several paragraphs long.
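
As a rough idea of how clustering can sit on top of an existing search feature in a Java application, the sketch below defines a small post-processing step that takes the hits your own engine returns and groups them. Every type here (Hit, ResultClusterer, HostClusterer) is hypothetical and written for this illustration; none of them come from Carrot2. The grouping by URL host is only a placeholder for the place where a real clustering library would be called.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical integration seam; none of these types come from Carrot2. */
class ClusteringIntegration {

    /** A search hit as returned by the host application's existing engine. */
    static class Hit {
        final String title;
        final String url;

        Hit(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    /** Post-processing step the application calls after running its own search. */
    interface ResultClusterer {
        Map<String, List<Hit>> cluster(List<Hit> hits);
    }

    /** Placeholder strategy: groups hits by URL host. A real clustering library would be wrapped here. */
    static class HostClusterer implements ResultClusterer {
        @Override
        public Map<String, List<Hit>> cluster(List<Hit> hits) {
            Map<String, List<Hit>> groups = new LinkedHashMap<>();
            for (Hit hit : hits) {
                // URI.create throws IllegalArgumentException on malformed URLs.
                String host = URI.create(hit.url).getHost();
                String label = (host == null) ? "(unknown)" : host;
                groups.computeIfAbsent(label, k -> new ArrayList<>()).add(hit);
            }
            return groups;
        }
    }

    public static void main(String[] args) {
        List<Hit> hits = List.of(
            new Hit("First page", "https://example.org/a"),
            new Hit("Second page", "https://example.com/b"),
            new Hit("Third page", "https://example.org/c"));
        ResultClusterer clusterer = new HostClusterer();
        clusterer.cluster(hits).forEach((label, group) ->
            System.out.println(label + ": " + group.size() + " hit(s)"));
    }
}
```

Keeping the clustering step behind an interface means the search code does not change when the grouping strategy is swapped for a more capable one.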
26.6 MB
BSD License
Created by: Stanislaw Osinski & Dawid Weiss