When Cloudera became the first vendor to ship and support Apache Spark in February 2014, Spark was already well on its way toward becoming the framework of choice for faster batch processing, machine learning, advanced analytics, and stream processing. Today, many Cloudera customers have begun moving these workloads from MapReduce to Spark in their production…
Author Archives: Jairam Ranganathan
Apache Spark in the Apache Hadoop Ecosystem
Categories: Open Source Software
At the recently concluded Spark Summit conference, Mike Olson spoke about the emergence of Apache Spark as a new standard for Hadoop data processing. As part of that, we announced an industry-wide collaboration with key organizations in the Hadoop community to migrate projects built on top of MapReduce to the Spark execution…
Apache Spark – Welcome to the CDH family
Categories: General
The neatest thing about being part of our market is the rapid rate of innovation we experience. Ideas from a variety of sources – industry, academia, and sometimes industry spawned from academia (in the case of our partner Databricks) – regularly become mainstream and create net new ways to interact with and analyze our data…