Linux Today: Linux News On Internet Time.

Search Structured LDAP Data With a Vector-Space Engine

Sep 24, 2007, 04:30
(Other stories by Nathan Harrington)

[ Thanks to An Anonymous Reader for this link. ]

"Articles describing vector-space searching usually begin with a description of vector spaces and how to project a specified query into a term space. Let's work backward, instead, with the following example: With a specified query of Nathen, we want to match data entries of Nathan and Jonathan, in that order. Existing approaches might involve building a regular expression based on stems of a word, or metaphone, and other linguistic derivatives of a search term. In our case, effective search results can be obtained by creating a vector for each letter in a word and returning results based on the closest match in vector space. In this case, the Nathan result will be printed first because it has five of six letters (vectors) in common, and Jonathan will be printed second because it only has five of eight letters in common.

"The code and descriptions in this article are a highly simplified view of vector spaces and how to search them effectively..."
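
As a rough illustration of the idea in the excerpt, the sketch below treats each letter as one dimension, builds a letter-count vector per word, and ranks entries by closeness to the query. Cosine similarity is an assumption made here for the "closest match in vector space" step, since the excerpt does not name a measure, and the function names (letter_vector, cosine, rank) are hypothetical rather than taken from the original article.

```python
from collections import Counter
from math import sqrt


def letter_vector(word):
    """Map a word to a vector of letter counts (one dimension per letter)."""
    return Counter(word.lower())


def cosine(a, b):
    """Cosine similarity between two letter-count vectors (assumed measure)."""
    dot = sum(a[ch] * b[ch] for ch in a)  # Counter returns 0 for missing letters
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, entries):
    """Return entries sharing at least one letter with the query, best match first."""
    q = letter_vector(query)
    scored = [(cosine(q, letter_vector(e)), e) for e in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored if score > 0]


if __name__ == "__main__":
    # "Bob" shares no letters with the query, so it is dropped from the results.
    for name in rank("Nathen", ["Jonathan", "Bob", "Nathan"]):
        print(name)
```

With the query Nathen, this ordering places Nathan ahead of Jonathan, matching the order described in the excerpt: both share the same letters with the query, but Jonathan's extra letters lengthen its vector and lower its score.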

