LinuxNewbie.org: Text Processing Pipelines NHF

Written By
Web Webster
Jul 25, 2000

[ Thanks to Sensei for this link. ]

“Sure the command line is evil, but mastering it will unlock
the powers of a Unix box that remain unrealized under modern
graphical user interfaces. This article details the
construction of text processing pipelines, using ordinary GNU
utilities, to accomplish a few fairly challenging tasks.”

“Suppose that for whatever reason, one is interested in the word
usage of a piece of text, perhaps from an article such as this one,
downloaded from the Net. One might want to know what word is most
frequently used while ignoring all the non-words, like variable
names in source code or other random bits of junk. Perhaps a
ranking of word frequency is required. Should one resort to writing
a special word counting program in Perl? Here’s how to do it using
a few GNU text utilities and the assistance of that great resource
/usr/dict/words.”
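
A rough sketch of the kind of pipeline the excerpt describes might look like the following. It is not the article's own pipeline, which is in the complete story. The function name word_freq, the sample sentence, and the tiny stand-in dictionary are assumptions made for illustration. On many current systems the word list lives at /usr/share/dict/words rather than /usr/dict/words.

```shell
#!/bin/sh
# Sketch only: word_freq is a hypothetical helper, not taken from the article.
# Usage: word_freq TEXT_FILE DICTIONARY_FILE
word_freq() {
    tr -cs 'A-Za-z' '\n' < "$1" |   # every run of non-letters becomes one newline
        tr 'A-Z' 'a-z' |            # fold to lower case so "The" and "the" match
        grep -Fxf "$2" |            # keep only lines that are exactly a dictionary word
        sort | uniq -c |            # count identical words
        sort -k1,1nr -k2            # highest count first, ties in alphabetical order
}

# Demo with a made-up sentence and a tiny stand-in dictionary (both assumptions).
demo_dir=$(mktemp -d)
printf 'The cat saw the dog. The dog ran; xyzzy foo_bar!\n' > "$demo_dir/text"
printf 'cat\ndog\nran\nsaw\nthe\n' > "$demo_dir/dict"
word_freq "$demo_dir/text" "$demo_dir/dict"
rm -r "$demo_dir"
```

In the demo, "xyzzy", "foo" and "bar" are dropped because they are not in the stand-in dictionary, and the remaining words are ranked by count. To try it on real text, pass a system word list as the second argument.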

“First we begin by breaking up the sentences in the text file so
that there is no more than one word per line. The “tr” tool is
useful here. This tool translates files one character at a time.
For example here is the essential USENET tool rot13 using “tr”
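
The excerpt stops before its example. As a hedged illustration, the following shows one common way to write rot13 with tr, which may differ from the article's version, along with the one-word-per-line step the paragraph describes. The function names rot13 and one_word_per_line are hypothetical.

```shell
#!/bin/sh
# rot13 (hypothetical helper): replace each letter with the one 13 places
# further along the alphabet, wrapping around; other characters pass through.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# one_word_per_line (hypothetical helper): -c complements the set of letters,
# so every non-letter is mapped to a newline, and -s squeezes each run of
# newlines into one.
one_word_per_line() {
    tr -cs 'A-Za-z' '\n'
}

echo 'Hello, World' | rot13      # prints: Uryyb, Jbeyq
echo 'Uryyb, Jbeyq' | rot13      # prints: Hello, World
echo 'one word, per line' | one_word_per_line
```

Applying rot13 twice returns the original text, because 13 is half of the 26-letter alphabet.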


Complete Story

