Pentaho and Cloudera – Strength in Partnership

By: Doug Moran, Co-Founder and Product Manager, Big Data at Pentaho

As the first analytics vendor to announce support for Hadoop in May 2010, we quickly became familiar with another rising vendor in the big data space that shared the same open source heritage: Cloudera. From sponsoring the first Hadoop World (and every Hadoop World/Strata after) to growing our customer bases together, Cloudera has been a Pentaho ally and friend. This early leadership and open platform approach enabled us to offer deep integration with Impala and YARN. We recently announced that Pentaho Business Analytics is now certified on Cloudera 5, and we couldn’t be more excited for the opportunities this will bring.


The strongest technology partnerships break through difficult limitations and figure out critical points of integration in order to make customer adoption easier. Together, Pentaho and Cloudera do just that.

Pentaho + Cloudera Make Coupons Passé

edo is a Cloudera and Pentaho customer leveraging Hadoop for a streamlined data refinery to better understand customer preferences and behaviors. The digital marketing startup connects brands with consumers by harnessing billions of daily data records related to customer conduct, synthesizing trends and delivering the personalized offers most likely to trigger a sale.

For edo, having lightning-fast data analytics systems is a matter of survival. The company has to keep innovating in a crowded social, local and mobile advertising market. But in early 2013, edo hit a wall due in part to the sheer amount of data it could access. The team couldn’t process data fast enough with its existing SQL database and, as a result, couldn’t get the right offers to the right people quickly enough. edo needed to do something different to overcome the data overload.

Working with Cloudera and Pentaho allowed edo to dramatically scale the amount of data that could be processed. Pentaho Business Analytics uses Cloudera Impala, Hive and HBase to streamline edo’s preparation and analytics processes, extracting, integrating and analyzing 25 million transactions a day consisting of over 50TB of data! Combining Cloudera and Pentaho has drastically reduced the time edo spends on data preparation and the overall analytics process, so managers can create real-time and ad-hoc reports for customers. As a result, edo cut the processing window from 29 hours to under four, all while growing the amount of data processed by 974%! This has been a key factor in customer preference and retention.

In the case of edo, Cloudera and Pentaho are proven better together. To find out more about how Pentaho can complement your Hadoop data and streamline business processes, go to




