Apache Spark sorts 100 TB of data in world-record 23 minutes

Databricks participated in the Sort Benchmark and set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-byte records. The team used Apache Spark on 207 EC2 virtual machines and sorted 100 TB of data in 23 minutes.
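To put those figures in perspective, the short sketch below works out the record count and the implied aggregate and per-machine throughput. It is a back-of-the-envelope calculation, not part of the benchmark itself. It assumes decimal terabytes (1 TB = 10^12 bytes) and treats the reported 23 minutes as an exact elapsed time, so the results are approximate. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope throughput for the reported sort run.
# Assumptions (not from the benchmark report): decimal units
# (1 TB = 10**12 bytes) and an elapsed time of exactly 23 minutes.

TB = 10**12  # bytes per decimal terabyte (assumed)
GB = 10**9   # bytes per decimal gigabyte (assumed)


def record_count(total_bytes: int, record_bytes: int) -> int:
    """Number of fixed-size records in the data set."""
    return total_bytes // record_bytes


def throughput_gb_per_s(total_bytes: int, minutes: float) -> float:
    """Aggregate sort rate in GB/s over the whole cluster."""
    return total_bytes / (minutes * 60) / GB


def per_machine_gb_per_s(total_bytes: int, minutes: float, machines: int) -> float:
    """Average sort rate per machine in GB/s."""
    return throughput_gb_per_s(total_bytes, minutes) / machines


if __name__ == "__main__":
    total = 100 * TB
    print("records:", record_count(total, 100))
    print("cluster GB/s:", round(throughput_gb_per_s(total, 23), 2))
    print("per-machine GB/s:", round(per_machine_gb_per_s(total, 23, 207), 2))
```

Under these assumptions the cluster sustains roughly 72 GB/s overall, or about 0.35 GB/s per machine.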
