Do you have a data warehouse that stores more than 300 petabytes of data, and do you struggle with query latency? Few companies have data at Facebook’s scale, but slow queries against a data warehouse are often a serious productivity issue even at smaller sizes. Facebook has been developing a solution to that problem, which it calls Presto, and today it offered that solution to the open source community.
The release is available under the Apache 2.0 license.
“Facebook’s warehouse data is stored in a few large Hadoop/HDFS-based clusters,” writes Martin Traverso, software engineer at Facebook, in a blog post Wednesday.
“Hadoop MapReduce and Hive are designed for large-scale, reliable computation, and are optimized for overall system throughput. But as our warehouse grew to petabyte scale and our needs evolved, it became clear that we needed an interactive system optimized for low query latency.”