[ Thanks to An Anonymous Reader for
this link. ]
“With configuration, installation, and the use of
Hadoop in single- and multi-node architectures under your belt, you
can now turn to the task of developing applications within the
Hadoop infrastructure. This final article in the series explores
the Hadoop APIs and data flow and demonstrates their use with a
simple mapper and reducer application.

“The first two articles of this series focused on the
installation and configuration of Hadoop for single- and multinode
clusters. This final article explores programming in
Hadoop—in particular, the development of a map and a reduce
application within the Ruby language. I chose Ruby, because first,
it’s an awesome object-oriented scripting language that you should
know, and second, you’ll find numerous references in the Resources
section for tutorials addressing both the Java™ and Python
languages. Through this exploration of MapReduce programming, I
also introduce you to the streaming application programming
interface (API). This API provides the means to develop
applications in languages other than the Java language.

“Let’s begin with a short introduction to map and reduce (from
the functional perspective), and then take a deeper dive into the
Hadoop programming model and its architecture and elements that
carve, distribute, and manage the work.”
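
For readers who have not met map and reduce from the functional side, here is a minimal Ruby sketch. It is not taken from the linked article and does not use Hadoop. The word list and variable names are made up for illustration, and everything runs in memory on one machine.

```ruby
# Illustrative only: functional map and reduce on an in-memory list.
# The sample words and variable names are assumptions, not from the article.
words = %w[hadoop ruby hadoop map reduce map hadoop]

# map: transform each element independently into a [key, value] pair.
pairs = words.map { |word| [word, 1] }

# reduce: fold all pairs into one result, summing the values per key.
counts = pairs.reduce(Hash.new(0)) do |totals, (word, n)|
  totals[word] += n
  totals
end

puts counts.inspect
```

Because the map step touches each element on its own, that kind of work can in principle be split across many machines, which is the property the article's Hadoop discussion builds on.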