Cloudera and VMware are working together to validate and performance test the leading Hadoop distribution, CDH, on the vSphere platform. In addition, the two companies are working to make it easier for customers to deploy Cloudera’s enterprise data hub offering in virtualized environments.
Cloudera Manager is a single interface that provides end-to-end system management and key enterprise features. It is the tool of choice for Hadoop administrators to deploy, configure, monitor, and manage an enterprise data hub. Cloudera Manager also exposes a rich set of APIs for streamlined integration with several of Cloudera’s ISV partners. VMware vCenter is the primary tool for vSphere/virtualization administrators to manage virtual machines and to monitor the hosts and the various resources, such as storage and networking, that those virtual machines consume.
Because the integration uses Cloudera Manager APIs behind the scenes, virtualization administrators can deploy an enterprise data hub (EDH) cluster on virtualized infrastructure without leaving vCenter. This blog explains how the two market-leading tools, VMware’s vCenter and Cloudera Manager, work in concert to cleanly install, configure, manage, and monitor your enterprise data hub deployments on VMware.
Building a set of virtualized Hadoop clusters is a process with two separate phases. First, you need a set of virtual machines with guest operating systems set up, networking applied, users created, and other appropriate services configured. Virtualization makes all of that easy through cloning, and vCenter gives you a friendly way of doing it. Additionally, the choice of where to place those virtual machines once they are fully configured is better made automatically by the vCenter placement algorithms. That configuration and VM-to-host placement intelligence is built into vSphere Big Data Extensions (BDE). BDE also takes care of cloning the right sizes and numbers of virtual machines for you. It makes the process much easier to repeat where multiple clusters are needed, and it reduces the occurrence of human operator error.
In the second phase, given that set of virtual machines, you would use the installation facilities of Cloudera Manager to set up your CDH cluster in various modes on the virtual machines. As of the late 2014 release of VMware vSphere BDE, in addition to creating the virtual machines, BDE can call Cloudera Manager APIs directly to install and configure CDH clusters on those machines. This makes provisioning CDH on virtualized infrastructure seamless through vCenter, the tool of choice for the virtualization admin.
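This post does not say which Cloudera Manager endpoints BDE calls, so the following is only a rough sketch of what a client-side call to the Cloudera Manager REST API could look like. It assumes the documented cluster-creation endpoint (a POST to the clusters collection with an "items" list), HTTP basic authentication, and the default web port 7180. The host name, credentials, cluster name, API version string, and the helper name build_create_cluster_request are all hypothetical and chosen for illustration.

```python
import base64
import json
from urllib import request


def build_create_cluster_request(cm_host, cluster_name, user, password,
                                 cdh_version="CDH5", api_version="v6",
                                 port=7180):
    """Build (but do not send) a request that asks Cloudera Manager
    to register a new, empty cluster.

    All argument values are illustrative assumptions; adjust the API
    version and port to match the Cloudera Manager you actually run.
    """
    url = f"http://{cm_host}:{port}/api/{api_version}/clusters"

    # The clusters endpoint accepts a list of cluster definitions.
    payload = {"items": [{"name": cluster_name, "version": cdh_version}]}

    # Cloudera Manager's API uses HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    return request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

To send the request against a real Cloudera Manager instance, pass the returned object to urllib.request.urlopen. Creating the cluster entry is only the first step: hosts, services, and roles would still have to be added through further API calls, which is the kind of work the BDE integration is described as automating.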
This significantly shortens time to value for the IT operations staff who service Hadoop deployments on virtualized infrastructure. An architect, developer, QA tester, or other user comes to the administrator with a request, perhaps including a specification of the desired cluster. The vSphere administrator can now carry out the provisioning task for them using the VMware BDE integration with Cloudera Manager.
We’re very excited to see more of our key partners like VMware leveraging Cloudera Manager APIs to enable better Hadoop experiences for their users. To learn more about vSphere Big Data Extensions and the Cloudera Manager integration, visit http://www.vmware.com/bde.