Facebook has open-sourced its Corona scheduling component for Hadoop, which the company calls “the next version of Map-Reduce”. Facebook runs its own fork of Apache Hadoop, which is optimised for the massive scale of its operations.
The current Hadoop implementation of MapReduce uses a single job tracker, which causes scaling issues with very large data sets. The Apache Hadoop developers have been building their own next-generation MapReduce, called YARN, which Facebook engineers evaluated but discounted because of the highly customised nature of the company’s deployment of Hadoop and HDFS.

