The real power in machine learning and analytics emerges when multiple analytics disciplines work together in concert, sharing data to answer more complex and more valuable questions. That’s what Cloudera SDX (Shared Data Experience) enables for our customers, and it’s why we’re so excited to introduce it today for Cloudera Altus. But before I tell you what Altus SDX is, let’s talk about why it’s important.
People are gravitating to the analytics services of the large public cloud providers because the “house-brand” offerings seem to be the easiest choice. But these offerings are not integrated. Instead, they have separate data stores and inconsistent (if any) frameworks for data governance, management, and security. Stitching them together adds cost, effort, and risk, and still yields a sub-optimal platform for multi-disciplinary, cloud-based analytics applications. Further, much of the value of cloud is for elastic workloads. If catalog metadata and business definitions live with transient compute resources, they are lost when those resources go away, requiring work to recreate later and making auditing impossible. Indeed, industry analyst firm Enterprise Management Associates (EMA) found that the number one barrier to cloud was increased complexity.
A shared data catalog and data context, including definitions, permissions, and governance, is offered as part of our Cloudera Altus platform to simplify configurations in the cloud. It makes it easy to consolidate all data into a single, well-defined, persistent repository in object storage, so the data context is clearly defined even for transient compute applications. It also enables safe and easy self-service for knowledge workers and more operational efficiency for IT. This is part of the Cloudera “Shared Data Experience”, or SDX for short. Altus SDX automates this, delivering multiple business benefits, including the ability to:
- Innovate around high-value, multi-disciplinary analytics applications – Most challenges occur when users bring disparate data services (ML & BI; batch & streaming) together, because these functions were not designed for convergence. This puts the burden on the users to determine how to unify complex workflows. Altus SDX enables companies to more easily build and deploy high-value applications for customer analytics, IoT, cyber-security, and more, and to nimbly run many distinct applications against shared data.
- Eliminate analytics services silos and drive operational efficiency in the public cloud – To drive big data initiatives, companies need a platform that scales, runs anywhere, enables self-service, and eliminates silos of redundant data and limited usability. Administrative and productivity burden increases when moving data and its definitions across systems, and this burden grows exponentially with the addition of more analytics workloads and tools. Management and troubleshooting then mean hunting around disparate environments. Altus SDX includes a shared metadata catalog that puts data in context. Because metadata is always associated with your data, you can open up self-service access to more diverse users and apps without those apps becoming data silos in the cloud. Risk and effort are greatly reduced.
- Enjoy safe self-service access to the data you need – End users struggle to discover what data is available in the cloud, so they might use the wrong data and get the wrong answers, or fail to find an answer at all. Analytics can be severely delayed by waiting on IT to provide access to the data users need. Altus SDX creates a common catalog of data, defining what is available and how it can be used by whom. This facilitates productivity and development efficiency by making all data safely accessible in one place.
You can check out Altus Data Engineering, which is GA on AWS and in beta on Azure, or join our beta for Altus Analytic Database on AWS, and power both with Altus SDX. Soon we’ll have Altus Data Science, too!
Source: “Charting the Expanding Horizons of Big Data”, Enterprise Management Associates