Linux Today: Linux News On Internet Time.
Distributed data processing with Hadoop - Part-3: App Build

Jul 21, 2010, 07:39
By M. Tim Jones

[ Thanks to An Anonymous Reader for this link. ]

"With configuration, installation, and the use of Hadoop in single- and multi-node architectures under your belt, you can now turn to the task of developing applications within the Hadoop infrastructure. This final article in the series explores the Hadoop APIs and data flow and demonstrates their use with a simple mapper and reducer application.

"The first two articles of this series focused on the installation and configuration of Hadoop for single- and multinode clusters. This final article explores programming in Hadoop—in particular, the development of a map and a reduce application within the Ruby language. I chose Ruby, because first, it's an awesome object-oriented scripting language that you should know, and second, you'll find numerous references in the Resources section for tutorials addressing both the Java™ and Python languages. Through this exploration of MapReduce programming, I also introduce you to the streaming application programming interface (API). This API provides the means to develop applications in languages other than the Java language.

"Let's begin with a short introduction to map and reduce (from the functional perspective), and then take a deeper dive into the Hadoop programming model and its architecture and elements that carve, distribute, and manage the work."

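To make the excerpt's idea of map and reduce "from the functional perspective" concrete, here is a minimal Ruby sketch. It is not the code from the article: the helper names `map_line`, `reduce_lines`, and `word_count` are hypothetical, and the word-count task is an assumed example. It only imitates the general contract of Hadoop streaming, in which a mapper emits tab-separated key/value lines and a reducer receives them sorted by key.

```ruby
# Functional view: map transforms every element, reduce folds them into one value.
squares = [1, 2, 3, 4].map { |n| n * n }          # [1, 4, 9, 16]
total   = squares.reduce(0) { |sum, n| sum + n }  # 30

# Mapper step (hypothetical helper): one input line in,
# one "word<TAB>1" record out per word.
def map_line(line)
  line.split.map { |word| "#{word.downcase}\t1" }
end

# Reducer step (hypothetical helper): consumes key-sorted records
# and sums the counts for each word.
def reduce_lines(sorted_records)
  counts = Hash.new(0)
  sorted_records.each do |record|
    word, count = record.chomp.split("\t")
    counts[word] += count.to_i
  end
  counts
end

# Local stand-in for the framework: map every line, sort the
# intermediate records by key, then reduce.
def word_count(lines)
  mapped = lines.flat_map { |line| map_line(line) }
  reduce_lines(mapped.sort)
end
```

In a real streaming job the two steps would typically live in separate scripts that read standard input and write standard output, with the framework handling the sort between them. Here `word_count` simply chains the steps in one process so the data flow is easy to follow.
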