Under the Hood in Apache Lucene 4.0

“One of the most significant changes in Lucene 4.0 is the full
switch to using bytes (UTF-8) in place of text strings for indexing
within the search engine library. This change has improved the
efficiency of a number of core processes: the ‘term dictionary’,
used as a core part of the index, can now be loaded up to 30 times
faster; it uses 10% of the memory; and search speeds are increased
by removing the need for string conversion.

“This switch to using bytes for indexing has also facilitated
one of the main goals for Lucene 4.0, which is ‘flexible indexing’.
The data structure for the index format can now be chosen and
loaded into Lucene as a pluggable codec. As such, optimised codecs
can be loaded to suit the indexing of individual datasets or even
individual fields.”
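
The two ideas in the excerpt, terms handled as UTF-8 bytes and an index format chosen per field, can be illustrated with a small self-contained sketch. It does not use Lucene's real classes: `PerFieldCodecSketch`, `PostingsEncoder`, `PLAIN` and `DELTA` are hypothetical names invented for this example, and the two "formats" (document ids stored unchanged versus stored as gaps) are assumptions chosen only to make the pluggable choice visible.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class PerFieldCodecSketch {

    // Hypothetical stand-in for a pluggable format; NOT Lucene's actual Codec API.
    public interface PostingsEncoder {
        int[] encode(int[] sortedDocIds);
    }

    // Assumed format 1: store document ids unchanged.
    public static final PostingsEncoder PLAIN = ids -> ids.clone();

    // Assumed format 2: store the first id, then the gap to each following id.
    public static final PostingsEncoder DELTA = ids -> {
        int[] out = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            out[i] = (i == 0) ? ids[0] : ids[i] - ids[i - 1];
        }
        return out;
    };

    private final Map<String, PostingsEncoder> perField = new HashMap<>();
    private final PostingsEncoder fallback;

    public PerFieldCodecSketch(PostingsEncoder fallback) {
        this.fallback = fallback;
    }

    // Choose a format for one field; unregistered fields use the fallback.
    public void register(String field, PostingsEncoder encoder) {
        perField.put(field, encoder);
    }

    public PostingsEncoder forField(String field) {
        return perField.getOrDefault(field, fallback);
    }

    // Terms are converted to UTF-8 once and then ordered bytewise,
    // so no String comparison happens during sorting.
    public static byte[][] sortedTermBytes(String... terms) {
        byte[][] out = new byte[terms.length][];
        for (int i = 0; i < terms.length; i++) {
            out[i] = terms[i].getBytes(StandardCharsets.UTF_8);
        }
        Arrays.sort(out, (a, b) -> Arrays.compareUnsigned(a, b));
        return out;
    }

    public static void main(String[] args) {
        PerFieldCodecSketch codecs = new PerFieldCodecSketch(PLAIN);
        codecs.register("body", DELTA);
        int[] docs = {3, 7, 8, 20};
        System.out.println(Arrays.toString(codecs.forField("body").encode(docs))); // [3, 4, 1, 12]
        System.out.println(Arrays.toString(codecs.forField("id").encode(docs)));   // [3, 7, 8, 20]
    }
}
```

In this sketch the "body" field gets the gap-based format while every other field falls back to the plain one, which mirrors the per-field choice the excerpt describes. `Arrays.compareUnsigned` requires Java 9 or later.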
