The enterprise data hub has established itself as critical infrastructure in any big data architecture. This is due in large part to the business value customers derive from the rapidly expanding ecosystem of partner solutions that run against it.
That’s why it’s critical that our customers trust that the software running against their Cloudera deployment operates in the manner they expect, while taking advantage of the unique security, management, processing and analytics workloads delivered by Cloudera.
The Cloudera Connect partner certification program is the most comprehensive in the Hadoop market. It’s also the easiest for customers to understand. Can a particular BI tool leverage Impala for rapid queries? Does my ETL tool integrate with Cloudera Navigator for metadata management and governance? Can I monitor the health of the tools running against my cluster through Cloudera Manager?
Simply put, if a partner carries the Cloudera Certified logo, the answer to all these questions is “yes.”
A Cloudera Certified partner has demonstrated that their product has been used on a cluster that meets specific requirements for a given use case. That means before a partner can achieve a certification, they must prove that their software works in an environment similar to that of a large financial institution and interoperates seamlessly with components like Spark, Impala, Cloudera Manager (CM), Kafka, and Sentry. See our list of Cloudera Certified partners.
Like the Hadoop market in general, Cloudera is always evolving and maturing, and it’s no different with our certification requirements. Specifically, we have raised the bar for our partners on what constitutes a valid certification test environment to one that more closely aligns with what our customers expect. In particular:
1. Certification must be done on a multi-node cluster. Our customers’ experience shows us that the ability to demonstrate the integration in a “Quickstart” style virtual machine makes for an excellent showcase in a trade show booth, a pre-sales demo, or a proof of concept. However, it sidesteps the complications brought about by real production deployments, most notably around multi-tenancy and security, but also around a variety of other operational factors.
If a customer sees the Cloudera certification logo on a joint solution, they should take this to mean that the integration has seen the light of day on an actual cluster doing meaningful work.
2. Partners are required to show us an end-to-end functional test on a real-world style dataset that exercises every product integration point with Cloudera. This means that we’ve differentiated certification tests from QA tests. Our partners often come to us with automated QA test results showing their integration with our platform. While this is appreciated and definitely useful, our customers expect more with regard to certification.
Customers want to know that their staff will be able to sit down with the solution and be productive with it in a reasonable amount of time. Most of the time, this isn’t a tough requirement for a partner to meet, and they actually welcome the opportunity to show their product to us.
3. Secured environments are required. Cloudera is differentiated by the completeness of security that we offer. We’re aggressively delivering improved security across our entire stack. Before certifying any partner product, we need to make sure we understand how that product functions when the customer chooses to enable capabilities such as Kerberos security, Apache Sentry, or HDFS encryption. While all of this functionality is designed to be reasonably seamless, nothing beats the reassurance of knowing the partner product has actually been tested with each one where appropriate.
While it is impossible to cover every combination of a customer’s security requirements, we do require that basic authentication and authorization settings be configured in clusters used for certification. This gives customers the confidence that enabling our security features won’t disrupt their use of our partner products.
4. Products that require special builds or are not compatible with the components of Cloudera’s enterprise data hub (EDH) will not be certified. Few things in life are as disappointing as when a customer is sold a joint solution and then has to make tradeoffs between products in order to meet their needs. While most of our partners embrace the EDH message, certification is a chance to get them on the same page with respect to the simple requirements imposed by the EDH concept.
For example, a product that requires a different version or unique build of a component such as Spark or Kafka will not be certified. Likewise, a solution that precludes the use of Cloudera Manager in order to function will not be certified. In fact, we won’t certify partner products that preclude use of any component of our stack. Given the speed at which the ecosystem has evolved, this can be a little challenging, but we do our best to work with partners to remove obstructions.
5. Workloads must be complete. For hardware, virtualization, and cloud partners, we work to ensure the reference architectures have been exposed to full EDH workloads, including Spark, Solr, HBase, and Impala. We’re also starting to track the degree to which customer deployments match the reference architectures we recommend. This prevents customers from making an investment that inadvertently closes doors to new workloads in the future.
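For readers who think in checklists, the five requirements above can be sketched as a simple validation routine. This is an illustration only, not Cloudera's certification tooling: the class `CertificationSubmission`, its fields, and the function `unmet_requirements` are all hypothetical names, and the "at least two nodes" threshold is an assumed reading of "multi-node."

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CertificationSubmission:
    """Hypothetical summary of a partner's certification test environment."""
    node_count: int                  # requirement 1: multi-node cluster
    end_to_end_test_shown: bool      # requirement 2: functional test on real-world style data
    authentication_enabled: bool     # requirement 3: e.g. Kerberos
    authorization_enabled: bool      # requirement 3: e.g. Apache Sentry
    requires_special_builds: bool    # requirement 4: custom Spark/Kafka builds, etc.
    precluded_components: Tuple[str, ...] = ()  # requirement 4: components the product rules out
    full_workloads_exercised: bool = True       # requirement 5: hardware/virtualization/cloud partners


def unmet_requirements(s: CertificationSubmission) -> List[str]:
    """Return a description of each requirement the submission fails."""
    problems = []
    if s.node_count < 2:  # assumed threshold for "multi-node"
        problems.append("1: test environment must be a multi-node cluster")
    if not s.end_to_end_test_shown:
        problems.append("2: end-to-end functional test was not demonstrated")
    if not (s.authentication_enabled and s.authorization_enabled):
        problems.append("3: basic authentication and authorization must be configured")
    if s.requires_special_builds or s.precluded_components:
        problems.append("4: product must work with the standard EDH components")
    if not s.full_workloads_exercised:
        problems.append("5: full EDH workloads were not exercised")
    return problems


if __name__ == "__main__":
    # A single-node demo VM with security disabled fails requirements 1 and 3.
    demo_vm = CertificationSubmission(
        node_count=1,
        end_to_end_test_shown=True,
        authentication_enabled=False,
        authorization_enabled=False,
        requires_special_builds=False,
    )
    for problem in unmet_requirements(demo_vm):
        print(problem)
```

In practice, certification involves a review with the partner rather than a mechanical check; the sketch only makes explicit which conditions rule a submission out.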
These certification requirements have been in place long enough for us to observe how they benefit our customers. We are seeing higher quality integrations and more effective resolutions when issues arise with certified partner products. Our customers are happy because they’re able to realize value from their EDH investment far more quickly and can start to look beyond their initial set of use cases.
In our next post, we’ll discuss how we classify a partner product ecosystem that is staggeringly deep and wide, enabling our customers to select and deploy the solution that best matches their needs.