There is an infrastructure shift taking place within the enterprise. Increasingly, customers are evaluating cloud environments for deploying their big data solutions. Trends around streaming data and the Internet of Things have helped accelerate this shift, since more of that data tends to be generated and stored in the cloud already. Additionally, the provisioning speed and flexibility that the cloud provides align well with ever-shrinking IT budgets and top-down pressure for more results, faster. Cloud is certainly cementing its place within modern enterprise architectures, and Cloudera wants to ensure that you maintain all the benefits of the fastest, easiest, most secure Hadoop platform while embracing the agility that the cloud provides.
One of the most common use cases we see for Hadoop, regardless of deployment environment, is ETL/batch processing. With Hadoop, enterprises can process larger volumes of data of all types at a much faster rate than previously possible. A common feature of these workloads is that they tend to be short-running: once the data has been processed, the job ends. The difference between environments shows up at that point. When a batch workload ends in an on-premises deployment, the resources that were provisioned for it remain in place but are no longer being used, whereas in the cloud they can be released.
Cloudera Director is the tool that leading enterprises rely on to deploy and manage Hadoop clusters in cloud environments. For short-running, or transient, workloads, Cloudera Director automates cluster lifecycle management, so clusters can be spun up to run the job and terminated at completion, all without manual intervention. This means you pay only for the resources these workloads actually need. Additionally, if more compute power is needed for certain jobs, clusters can scale elastically through the Cloudera Director interface to support the added load. Finally, for added cost savings, enterprises can use the tool to take advantage of spot instances to scale out clusters at a much lower price point.
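To make the pay-only-for-what-you-use argument concrete, here is a rough back-of-the-envelope sketch in Python. It does not use Cloudera Director or any cloud API. The node count, hourly rate, and job schedule are hypothetical numbers chosen purely for illustration, and `ClusterPlan` and `monthly_costs` are names made up for this example.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximate number of hours in a month


@dataclass
class ClusterPlan:
    nodes: int
    hourly_rate_per_node: float  # hypothetical price in USD, not a real quote

    def cost(self, hours: float) -> float:
        """Cost of keeping every node in the cluster running for `hours`."""
        return self.nodes * self.hourly_rate_per_node * hours


def monthly_costs(plan: ClusterPlan, job_hours_per_run: float, runs_per_month: int):
    """Return (always_on_cost, transient_cost) for one month.

    always_on: the cluster stays provisioned all month.
    transient: the cluster exists only while a job is running.
    """
    always_on = plan.cost(HOURS_PER_MONTH)
    transient = plan.cost(job_hours_per_run * runs_per_month)
    return always_on, transient


# Hypothetical scenario: a 10-node cluster running a 2-hour ETL job once a day.
plan = ClusterPlan(nodes=10, hourly_rate_per_node=0.50)
always_on, transient = monthly_costs(plan, job_hours_per_run=2, runs_per_month=30)
print(f"Always-on cluster: ${always_on:,.2f} per month")
print(f"Transient cluster: ${transient:,.2f} per month")
```

The sketch ignores cluster startup time, storage, and data transfer charges, so treat it as a way to reason about the shape of the savings rather than as a pricing estimate.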
The demo below shows how Cloudera Director can lower costs for transient workloads and showcases our recently added support for Apache Hive and Apache Spark processing data on Amazon S3. Cloudera Director allows you to use the same powerful Hadoop tools no matter where your data lives. Check it out below:
To learn more about Hadoop in the cloud, visit Cloudera’s booth at Strata + Hadoop World San Jose this week and check out the session “Bringing the Apache Hadoop ecosystem to the Google Cloud Platform.”