Speaking UNIX: Peering into pipes
Nov 04, 2009, 15:03
(Other stories by Martin Streicher)
[ Thanks to An Anonymous Reader for
this link. ]
"One of the cleverest and most powerful innovations in
UNIX is the shell. It's more efficient than a GUI, and you can
write scripts to automate many tasks. Better yet, the pipe operator
assembles ad hoc programs right at the command line. The pipe
chains commands in sequence, where the output of an earlier command
becomes the input of a subsequent command.
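
To make the chaining concrete, here is a minimal sketch of an ad hoc pipeline. The sample sentence and the function name top_words are illustrative assumptions, not taken from the article; every command used is a standard UNIX utility.

```shell
# Report the most frequent word on stdin.
# Each stage reads the previous stage's stdout on its own stdin.
top_words() {
  tr -cs '[:alpha:]' '\n' |       # split the text into one word per line
    tr '[:upper:]' '[:lower:]' |  # normalize case
    sort |                        # bring identical words together
    uniq -c |                     # count each run of identical words
    sort -rn |                    # highest count first
    head -n 1                     # keep only the top entry
}

printf 'the cat and the hat and the bat\n' | top_words
```

No temporary files or script are needed; the shell wires the six commands together on the spot.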
"But the pipe has one major drawback: It's something of a black
box. If you string commands together, the only evidence of progress
is the output that the last command in the series generates. Yes,
you can interject tee in the sequence, and you can watch an output
file grow with tail, but those solutions work best once, lest the
standard output (stdout) and standard error (stderr) of multiple
phases commingle. Further, both solutions are crude indicators and
likely mask how much computation each step requires.
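
The tee approach mentioned above might look like the following sketch. The stages (seq, grep, wc) and the variable names are placeholders chosen for illustration, and seq is assumed to be available, as it is on most Linux and Mac OS X systems.

```shell
# tee copies the intermediate stream to a file while passing it downstream.
tmp=$(mktemp)

# Middle stage output is saved to $tmp; the last stage counts the lines.
count=$(seq 1 1000 | grep 7 | tee "$tmp" | wc -l | tr -d ' ')

# Afterward, the saved copy shows what flowed between the stages.
captured=$(wc -l < "$tmp" | tr -d ' ')
echo "final stage saw $count lines; tee captured $captured lines"

rm -f "$tmp"
```

While a long pipeline runs, tail -f on the saved file in a second terminal shows it grow. As the excerpt notes, that reveals that data is moving but little about how fast or how much work each stage does.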
"Of course, you could deconstruct a complex sequence into
multiple individual steps, each with its own interim output file.
And indeed, if you want to verify results at each interval,
decomposition is ideal. Write a script, produce one data file for
each step, use a data file between each pair of steps as input, and
collect the final file as the ultimate result. However, such a
practice is not well suited to the impromptu nature of the command
line.
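
The decomposition described above can be sketched as a short script. The file names and sample data are hypothetical; the point is only that each step writes an interim file that the next step reads.

```shell
# Work in a scratch directory so interim files are easy to inspect.
workdir=$(mktemp -d)
printf 'pear\napple\npear\nfig\n' > "$workdir/input.txt"

# Step 1: sort the raw data.
sort "$workdir/input.txt" > "$workdir/step1.sorted"

# Step 2: count duplicates, reading the file produced by step 1.
uniq -c "$workdir/step1.sorted" > "$workdir/step2.counted"

# Step 3: rank by count; this file is the final result.
sort -rn "$workdir/step2.counted" > "$workdir/result.txt"

cat "$workdir/result.txt"
# rm -rf "$workdir"   # clean up once the interim files have been checked
```

Every interim file can be verified on its own, at the cost of far more typing than the equivalent one-line pipeline.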
"What's needed is a progress meter that you can embed in the
command line to measure throughput. Ideally, the meter could be
repeated to benchmark each step—and because the sky's the
limit, the tool would be open source and portable to multiple UNIX
variants, such as Linux® and Mac OS X."
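
As a rough illustration of the idea, the sketch below defines a hypothetical helper named meter that can be dropped between any two stages. It is an assumption made for this example, not the tool the article goes on to describe: it passes data through unchanged and reports a line count on stderr only when the stream ends, so it shows volume per stage rather than live throughput.

```shell
# meter LABEL: copy stdin to stdout; at end of input, report on stderr
# how many lines passed through this point in the pipeline.
meter() {
  awk -v label="$1" '
    { print; n++ }
    END { printf "%s: %d lines\n", label, n > "/dev/stderr" }
  '
}

# The same meter is repeated to measure two different stages.
seq 1 5000 | meter generated | grep 7 | meter matched | wc -l
```

Because the reports go to stderr, they do not mix with the data flowing through stdout, which is what allows the meter to be repeated at several points in one command line.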