"Your humble writer knows a little bit about a lot of things;
but despite writing a fair amount about text processing (a book,
for example), linguistic processing is a relatively novel area for
me. Forgive me if I stumble through my explanations of the quite
remarkable Natural Language Toolkit (NLTK), a wonderful tool for
teaching, and working in, computational linguistics using Python.
Computational linguistics, moreover, is closely related to the
fields of artificial intelligence, language/speech recognition,
translation, and grammar checking.

"It is natural to think of NLTK as a stacked series of layers
that build on each other. Readers familiar with lexing and parsing
of artificial languages (like, say, Python) will not have too much
of a leap to understand the similar -- but deeper -- layers
involved in natural language modeling. While NLTK comes with a
number of corpora that have been pre-processed (often manually) to
various degrees, conceptually each layer relies on the processing
in the adjacent lower layer. Tokenization comes first; then words
are tagged; then groups of words are parsed into grammatical
elements, like noun phrases or sentences (according to one of
several techniques, each with advantages and drawbacks); finally
sentences or other grammatical units can be classified..."
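
The layering described in the excerpt can be illustrated with a rough sketch. This one deliberately uses only the Python standard library rather than NLTK itself, so nothing here should be read as NLTK's API. The function names, the `TOY_LEXICON` lookup table, the single noun-phrase rule, and the punctuation-based classifier are all assumptions made up for the illustration. The only point is to show each layer consuming the output of the layer below it.

```python
import re

# Layer 1: tokenization. Split raw text into word and punctuation tokens.
def tokenize(text):
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)

# Layer 2: tagging. Attach a part-of-speech tag to each token.
# TOY_LEXICON is a hypothetical hand-made table, not an NLTK resource.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "quick": "ADJ", "lazy": "ADJ",
    "fox": "NOUN", "dog": "NOUN",
    "jumps": "VERB", "over": "PREP",
}

def tag(tokens):
    return [(tok, TOY_LEXICON.get(tok.lower(), "UNK")) for tok in tokens]

# Layer 3: chunking. Group tagged words into noun phrases with one toy rule:
# an optional determiner, any number of adjectives, then a single noun.
def chunk_noun_phrases(tagged):
    phrases = []
    i = 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "DET":
            j += 1
        while j < len(tagged) and tagged[j][1] == "ADJ":
            j += 1
        if j < len(tagged) and tagged[j][1] == "NOUN":
            phrases.append(" ".join(tok for tok, _ in tagged[i:j + 1]))
            i = j + 1
        else:
            i += 1
    return phrases

# Layer 4: classification. Label the whole unit; this toy version only
# looks at the final token of the tagged sentence.
def classify(tagged):
    return "question" if tagged and tagged[-1][0] == "?" else "statement"

sentence = "The quick fox jumps over the lazy dog."
tokens = tokenize(sentence)                 # layer 1
tagged = tag(tokens)                        # layer 2 builds on layer 1
noun_phrases = chunk_noun_phrases(tagged)   # layer 3 builds on layer 2
label = classify(tagged)                    # layer 4 labels the sentence

print(tokens)
print(tagged)
print(noun_phrases)
print(label)
```

A real toolkit replaces each toy layer with something far more capable (trained taggers, several parsing strategies, statistical classifiers), but the hand-off between stages has the same shape.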