Under the Hood in Apache Lucene 4.0 | Linux Today

Written By
Web Webster
Aug 23, 2011

“One of the most significant changes in Lucene 4.0 is the full
switch to using bytes (UTF8) in place of text strings for indexing
within the search engine library. This change has improved the
efficiency of a number of core processes: the ‘term dictionary’,
used as a core part of the index, can now be loaded up to 30 times
faster; it uses 10% of the memory; and search speeds are increased
by removing the need for string conversion.
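
As a rough illustration of the byte-based approach described above, the following sketch keeps a small sorted term dictionary keyed by UTF-8 bytes and compares terms as raw unsigned bytes, so no string decoding happens during lookup. It uses only the Java standard library (Java 9 or later for `Arrays.compareUnsigned`); the class `ByteTermDemo` and its methods are hypothetical names chosen for this example and are not Lucene's API or its actual term dictionary structure.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical demo class, not part of Lucene.
class ByteTermDemo {
    // Order terms by unsigned byte value, so lookups never decode to String.
    static final Comparator<byte[]> UNSIGNED_BYTES =
            (a, b) -> Arrays.compareUnsigned(a, b);

    // Encode a token once, at indexing time.
    static byte[] utf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Build a sorted map from term bytes to the number of occurrences.
    static TreeMap<byte[], Integer> buildDictionary(String... tokens) {
        TreeMap<byte[], Integer> dict = new TreeMap<>(UNSIGNED_BYTES);
        for (String token : tokens) {
            dict.merge(utf8(token), 1, Integer::sum);
        }
        return dict;
    }

    public static void main(String[] args) {
        TreeMap<byte[], Integer> dict =
                buildDictionary("index", "codec", "index", "term");
        // The lookup compares byte arrays directly.
        System.out.println(dict.get(utf8("index"))); // prints 2
        // Decoding to text is only needed for display.
        for (byte[] term : dict.keySet()) {
            System.out.println(new String(term, StandardCharsets.UTF_8));
        }
    }
}
```

The sketch only shows the idea of treating terms as opaque bytes. It makes no attempt to reproduce the load-time or memory figures quoted above.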

“This switch to using bytes for indexing has also facilitated
one of the main goals for Lucene 4.0, which is ‘flexible indexing’.
The data structure for the index format can now be chosen and
loaded into Lucene as a pluggable codec. As such, optimised codecs
can be loaded to suit the indexing of individual datasets or even
individual fields.”
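
The idea of choosing an index format per field can be sketched, under loud assumptions, as a small registry that maps field names to interchangeable encoders. Everything here (`TermFormat`, `PerFieldFormats`, and the two toy formats) is hypothetical and exists only to illustrate the pluggable design; it is not Lucene's codec API, and the encodings are deliberately trivial.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: how one field's terms are turned into stored bytes.
interface TermFormat {
    String name();
    byte[] encode(String term);
}

// Hypothetical registry that picks a format per field, with a fallback.
class PerFieldFormats {
    private final Map<String, TermFormat> byField = new HashMap<>();
    private final TermFormat fallback;

    PerFieldFormats(TermFormat fallback) {
        this.fallback = fallback;
    }

    // Plug in a format for one specific field.
    PerFieldFormats register(String field, TermFormat format) {
        byField.put(field, format);
        return this;
    }

    // Fields without a registered format use the fallback.
    TermFormat forField(String field) {
        return byField.getOrDefault(field, fallback);
    }

    // Toy format 1: the raw UTF-8 bytes of the term.
    static TermFormat plainUtf8() {
        return new TermFormat() {
            public String name() { return "plain-utf8"; }
            public byte[] encode(String term) {
                return term.getBytes(StandardCharsets.UTF_8);
            }
        };
    }

    // Toy format 2: one length byte followed by the UTF-8 bytes.
    static TermFormat lengthPrefixed() {
        return new TermFormat() {
            public String name() { return "length-prefixed"; }
            public byte[] encode(String term) {
                byte[] raw = term.getBytes(StandardCharsets.UTF_8);
                if (raw.length > 255) {
                    throw new IllegalArgumentException("term too long for toy format");
                }
                byte[] out = new byte[raw.length + 1];
                out[0] = (byte) raw.length;
                System.arraycopy(raw, 0, out, 1, raw.length);
                return out;
            }
        };
    }

    public static void main(String[] args) {
        PerFieldFormats formats = new PerFieldFormats(plainUtf8())
                .register("id", lengthPrefixed());
        System.out.println(formats.forField("id").name());   // prints length-prefixed
        System.out.println(formats.forField("body").name()); // prints plain-utf8
    }
}
```

The design point is that the caller asks for a format by field name and never depends on a concrete encoding, which is what lets a specialised format be swapped in for a single field without touching the rest.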



