Spark: Open Source Superstar Rewrites Future of Big Data

Part of the trick is that Spark can store data in the memory subsystems of the thousands of servers it pulls together. Hadoop stores its data on good old-fashioned hard disks, and grabbing data from memory takes far less time than reading it from disk. But Spark is also what you might call a Swiss Army knife of Big Data analytics tools, says Reynold Xin, one of the Berkeley researchers who works on the project. Hadoop is often used in tandem with sister data analysis tools (tools that let you rapidly examine “real-time” data such as tweets or ask questions of data via the familiar SQL query language), but Spark lets you do all this from a single piece of software.
