Zoie is a real-time search and indexing system built on Apache Lucene.
News: Zoie 2.0.0-rc2 is released (12/14/2009) - Compatible with Lucene 2.9.x.
Originally developed at LinkedIn.com.
Donated by LinkedIn.com on July 19, 2008.
Zoie is a mature open source project deployed on a large-scale, real-time consumer website: LinkedIn.com, where it handles millions of searches and hundreds of thousands of updates daily.
All Zoie releases go through extensive functional and performance testing at LinkedIn before being made public. All major versions are released after a trial period in the production environment.
In a real-time search/indexing system, a document becomes searchable as soon as it is added to the index. This is especially important for time-sensitive information such as news, job openings, and tweets.
This poses the following challenges, which Zoie addresses:
- Additions of documents must be made available to searchers immediately
- Indexing must not affect search performance
- Additions of documents must not fragment the index (which hurts search performance)
- Deletes and/or updates of documents must not affect search performance
- ...
Additional Zoie features:
- fast Lucene docid to uid mapping
- fast uid to Lucene docid mapping (reverse id mapping)
- custom MergePolicy to handle real-time updates
- partial delete expunge for enhancing search performance without full optimize
- balanced index segment management
- full JMX console for indexing management/monitoring
- ...
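To illustrate the two id mapping features listed above, here is a minimal sketch in plain Java. It is not Zoie's implementation or API: the class name DocIdUidMapperSketch, its methods, and the NOT_FOUND sentinel are hypothetical, and the sketch assumes that uids are unique long values and that Lucene docids within one index snapshot are dense integers starting at 0. Under those assumptions, the forward mapping is a plain array lookup and the reverse mapping is a hash lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these are not Zoie's actual classes.
final class DocIdUidMapperSketch {
    // Sentinel returned when a uid is not present in this index snapshot.
    static final int NOT_FOUND = -1;

    // Forward mapping: array position is the docid, value is the uid.
    private final long[] uidByDocId;

    // Reverse mapping: uid -> docid.
    private final Map<Long, Integer> docIdByUid = new HashMap<>();

    // uids[i] is assumed to be the application uid of the document with docid i.
    DocIdUidMapperSketch(long[] uids) {
        this.uidByDocId = uids.clone();
        for (int docId = 0; docId < uidByDocId.length; docId++) {
            docIdByUid.put(uidByDocId[docId], docId);
        }
    }

    // docid -> uid: a single array read.
    long getUid(int docId) {
        return uidByDocId[docId];
    }

    // uid -> docid: a hash lookup, or NOT_FOUND if the uid is unknown.
    int getDocId(long uid) {
        Integer docId = docIdByUid.get(uid);
        return docId == null ? NOT_FOUND : docId;
    }

    public static void main(String[] args) {
        DocIdUidMapperSketch mapper =
            new DocIdUidMapperSketch(new long[] {1001L, 1005L, 1002L});
        System.out.println(mapper.getUid(1));       // uid stored at docid 1
        System.out.println(mapper.getDocId(1002L)); // docid holding uid 1002
        System.out.println(mapper.getDocId(42L));   // unknown uid
    }
}
```

A reverse mapping of this kind is what lets an update or delete addressed by application uid be resolved to the docid that must be marked deleted. A production system would likely prefer a more compact structure than a boxed HashMap, and would rebuild the mapping whenever docids change (for example, after segment merges).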
Architecture Diagram: