About

Egothor is an open-source, high-performance, full-featured text search engine written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform ones. It can be configured as a standalone engine, a metasearcher, or a peer-to-peer hub, and it can also be used as a library by an application that needs full-text search.

Key features of Egothor

  • Written in Java for cross-platform compatibility
  • New dynamization algorithm for fast index updating
  • Fully 64-bit kernel
  • Transactions (ACID)
  • Plagiarism detection
  • Document revisions, Xdelta
  • Incremental updates
  • Queries can be evaluated in parallel
  • Recognizes the most common file formats: HTML, PDF, PS, and Microsoft's DOC and XLS
  • Based on the extended Boolean model, which can operate as either the vector model or the Boolean model
  • Universal stemmer that processes any language
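
To illustrate how an extended Boolean model can behave like either the vector or the Boolean model, here is a minimal sketch of the p-norm formulation commonly used in the information-retrieval literature. This page does not say which formulation Egothor uses, so the class name PNormSketch, its methods, and the parameter p are hypothetical and are not part of Egothor's API. Term weights are assumed to lie in [0, 1], and the weight array is assumed to be non-empty.

```java
// Hypothetical sketch of p-norm scoring for the extended Boolean model.
// This is NOT Egothor's API; names and structure are assumptions for illustration.
final class PNormSketch {

    // OR-query similarity: ((w1^p + ... + wn^p) / n)^(1/p)
    static double or(double[] weights, double p) {
        double sum = 0.0;
        for (double w : weights) {
            sum += Math.pow(w, p);
        }
        return Math.pow(sum / weights.length, 1.0 / p);
    }

    // AND-query similarity: 1 - (((1-w1)^p + ... + (1-wn)^p) / n)^(1/p)
    static double and(double[] weights, double p) {
        double sum = 0.0;
        for (double w : weights) {
            sum += Math.pow(1.0 - w, p);
        }
        return 1.0 - Math.pow(sum / weights.length, 1.0 / p);
    }

    public static void main(String[] args) {
        // A document that contains one query term (weight 1) but not the other (weight 0).
        double[] w = {0.0, 1.0};

        // p = 1: AND and OR both reduce to the average weight (vector-like ranking).
        System.out.println("p=1   OR:  " + or(w, 1.0));
        System.out.println("p=1   AND: " + and(w, 1.0));

        // Large p: OR approaches max(w) and AND approaches min(w) (strict Boolean behavior).
        System.out.println("p=100 OR:  " + or(w, 100.0));
        System.out.println("p=100 AND: " + and(w, 100.0));
    }
}
```

With p = 1 the two operators give the same score, so the ranking depends only on the summed term weights, as in a simple vector model. As p grows, OR tends toward the largest weight and AND toward the smallest, which is the behavior of the strict Boolean model.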

History

Egothor v1 set out to solve the problem of index fragments that arises during index maintenance. It also introduced a universal stemmer able to process any language.

The next version (v2) was partially developed by students of the Faculty of Mathematics and Physics, Charles University in Prague. They implemented a couple of interesting components and tested the system in several specific configurations.