---

Search Structured LDAP Data With a Vector-Space Engine

[ Thanks to An Anonymous Reader for
this link. ]

“Articles describing vector-space searching usually begin with a
description of vector spaces and how to project a specified query
into a term space. Let’s work backward, instead, with the following
example: With a specified query of Nathen, we want to match data
entries of Nathan and Jonathan, in that order. Existing approaches
might involve building a regular expression based on stems of a
word, or metaphone, and other linguistic derivatives of a search
term. In our case, effective search results can be obtained by
creating a vector for each letter in a word and returning results
based on the closest match in vector space. In this case, the
Nathan result will be printed first because it has five of six
letters (vectors) in common, and Jonathan will be printed second
because it only has five of eight letters in common.

“The code and descriptions in this article are a highly
simplified view of vector spaces and how to search them
effectively…”
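
As a rough illustration of the idea in the excerpt, here is a minimal Python sketch. It is not the article's code: it assumes each word is reduced to a vector of letter counts and that "closest" means highest cosine similarity, and the names letter_vector, cosine, and rank are made up for this example.

```python
from collections import Counter
from math import sqrt


def letter_vector(word):
    """Map a word to a sparse vector of letter counts (one dimension per letter)."""
    return Counter(ch for ch in word.lower() if ch.isalpha())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    # Counter returns 0 for missing letters, so only u's letters need visiting.
    dot = sum(count * v[ch] for ch, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, entries):
    """Return (score, entry) pairs, best match first."""
    q = letter_vector(query)
    scored = [(cosine(q, letter_vector(entry)), entry) for entry in entries]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical data entries; a real setup would pull these from LDAP.
    for score, name in rank("Nathen", ["Jonathan", "Maria", "Nathan"]):
        print(f"{name}: {score:.3f}")
```

With these assumptions, the query Nathen ranks Nathan ahead of Jonathan, matching the order described in the excerpt. Jonathan shares the same letters with the query but carries extra ones, which lengthens its vector and lowers its score.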


Complete Story