Details
Kamikaze version 1.0.0 provides DocSet implementations built on several underlying document ID set representations for inverted lists in search engines. The currently supported implementations are:
- Integer Array representation: document set based on dynamic integer arrays.
- OpenBitSet representation: document set based on the OpenBitSet implementation from Lucene.
- P4Delta representation: document set for sorted integer segments, compressed using a variation of the P4Delta compression algorithm.
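To illustrate why sorted integer segments compress well, here is a minimal sketch of the delta (gap) step that delta-based schemes rely on: each sorted document ID is replaced by its distance from the previous one, which yields small numbers that need fewer bits. This is not Kamikaze's code, and it does not show the bit packing or exception handling of the P4Delta variation; the class name `DeltaSketch` and its methods are hypothetical and exist only for this example.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the Kamikaze API.
final class DeltaSketch {

    // Replace each sorted doc ID with its gap from the previous ID.
    // The first gap is measured from 0, so it equals the first ID.
    static int[] encode(int[] sortedIds) {
        int[] gaps = new int[sortedIds.length];
        int prev = 0;
        for (int i = 0; i < sortedIds.length; i++) {
            gaps[i] = sortedIds[i] - prev;
            prev = sortedIds[i];
        }
        return gaps;
    }

    // Rebuild the original doc IDs with a running sum over the gaps.
    static int[] decode(int[] gaps) {
        int[] ids = new int[gaps.length];
        int running = 0;
        for (int i = 0; i < gaps.length; i++) {
            running += gaps[i];
            ids[i] = running;
        }
        return ids;
    }

    public static void main(String[] args) {
        int[] ids = {1000, 1003, 1004, 1010};
        int[] gaps = encode(ids);
        System.out.println(Arrays.toString(gaps));            // [1000, 3, 1, 6]
        System.out.println(Arrays.equals(ids, decode(gaps))); // true
    }
}
```

After the first value, the gaps are far smaller than the IDs themselves, which is what a bit-packing stage can then exploit.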
The library also provides elementary set operations (AND, OR, NOT) on DocSets without materializing the final document set. This is extremely useful for large sorted integer segments, where building the full result would be costly.
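As a rough sketch of what "without materializing" can mean, the example below intersects two sorted document ID arrays lazily: each match is produced only when the caller asks for it, and no result array is ever allocated. It uses plain Java and is an assumption about the general technique, not Kamikaze's implementation; the class `LazyAndSketch` and its `and` method are hypothetical names.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

// Hypothetical illustration only; not part of the Kamikaze API.
final class LazyAndSketch {

    // Returns an iterator over the IDs present in both sorted arrays.
    // Matches are found on demand; nothing is copied into a result set.
    static PrimitiveIterator.OfInt and(int[] a, int[] b) {
        return new PrimitiveIterator.OfInt() {
            private int i = 0; // cursor into a
            private int j = 0; // cursor into b

            // Move both cursors forward until they point at equal IDs.
            private boolean advanceToMatch() {
                while (i < a.length && j < b.length) {
                    if (a[i] < b[j]) {
                        i++;
                    } else if (a[i] > b[j]) {
                        j++;
                    } else {
                        return true;
                    }
                }
                return false;
            }

            @Override
            public boolean hasNext() {
                return advanceToMatch();
            }

            @Override
            public int nextInt() {
                if (!advanceToMatch()) {
                    throw new NoSuchElementException();
                }
                int match = a[i];
                i++;
                j++;
                return match;
            }
        };
    }

    public static void main(String[] args) {
        int[] first = {1, 3, 5, 7, 9};
        int[] second = {3, 4, 5, 9, 12};
        PrimitiveIterator.OfInt it = and(first, second);
        while (it.hasNext()) {
            System.out.println(it.nextInt()); // prints 3, 5, 9 on separate lines
        }
    }
}
```

Because the iterator only keeps two cursors, its memory use stays constant regardless of how long the input lists are. OR and NOT can be sketched the same way by changing the rule for which cursor advances and when a value is emitted.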