Linux Today: Linux News On Internet Time.

Search Structured LDAP Data With a Vector-Space Engine

Sep 24, 2007, 04:30
(Other stories by Nathan Harrington)

[ Thanks to An Anonymous Reader for this link. ]

"Articles describing vector-space searching usually begin with a description of vector spaces and how to project a specified query into a term space. Let's work backward, instead, with the following example: With a specified query of Nathen, we want to match data entries of Nathan and Jonathan, in that order. Existing approaches might involve building a regular expression based on stems of a word, or metaphone, and other linguistic derivatives of a search term. In our case, effective search results can be obtained by creating a vector for each letter in a word and returning results based on the closest match in vector space. In this case, the Nathan result will be printed first because it has five of six letters (vectors) in common, and Jonathan will be printed second because it only has five of eight letters in common.

"The code and descriptions in this article are a highly simplified view of vector spaces and how to search them effectively..."
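
The letter-vector idea in the excerpt can be illustrated with a short sketch. This is not the article's code: it assumes each word is represented as a vector of letter counts and that closeness is measured by cosine similarity, which is one common choice. The function names (letter_vector, cosine, search) and the sample entries are hypothetical.

```python
import math
from collections import Counter


def letter_vector(word):
    """Represent a word as a sparse vector of letter counts (case-insensitive)."""
    return Counter(ch for ch in word.lower() if ch.isalpha())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    # Only letters present in both vectors contribute to the dot product.
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, entries):
    """Return (score, entry) pairs, closest match in letter space first."""
    q = letter_vector(query)
    scored = [(cosine(q, letter_vector(entry)), entry) for entry in entries]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical data entries; "Nathen" is the misspelled query from the excerpt.
    for score, entry in search("Nathen", ["Jonathan", "Bob", "Nathan"]):
        print(f"{score:.3f}  {entry}")
```

With these assumptions, the misspelled query Nathen ranks Nathan ahead of Jonathan, because Jonathan's extra letters lengthen its vector without adding overlap with the query. Note that letter counts ignore letter order, so anagrams of an entry would score identically to it.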