Apache Hadoop had another banner year in 2015. Just when we thought it couldn’t get any better, the year showed us that the era of Hadoop is really just getting going, and there’s so much still to be done. We saw Apache Spark become a more prominent part of the stack, new storage capabilities like Apache Kudu (incubating) got added, and security got honed further. It’s hard to think of what can follow. But that’s the beautiful thing about Hadoop: the pace never lets up, the possibilities are seemingly endless, and as a result, we’re constantly amazed by what comes next.
So what can we expect in 2016? Obviously no one knows for certain, but here are a few things I anticipate we’ll see:
Hadoop will continue to disappear
Okay, I admit, this one isn’t very original given Mike Olson talked about it during his keynote at Strata+Hadoop World NYC last year, but it needs to be called out nonetheless. Whereas the initial focuses for Hadoop use cases have been cheap and deep storage and data processing, we continue to see enterprises large and small do truly transformative things with this platform today and the range of applications continues to grow.
Communications Service Providers (CSPs) like True and British Telecom (BT) are building applications that create an integrated 360-degree customer view across their businesses, reshaping how customer interaction is done. Financial services organizations like Northern Trust are improving the delivery of services and implementing real-time health monitoring for payment processing. And across industries, common applications will modernize. Countertack, for example, is now delivering innovative cybersecurity solutions addressing a growing concern for virtually every organization, regardless of industry.
Barriers to adoption will erode
At Cloudera, we were proud to have eclipsed 100 members in our Cloudera Academic Partnership (CAP) program, a big milestone in helping curate the Hadoop professionals we’ll need for the future. As time goes on, more and more people are entering the workplace with the skills needed to drive success with Hadoop. This is a big step in the right direction, but it’s not everything.
In addition to acquiring the necessary skills, perhaps more challenging is identifying and assessing the right fit for Hadoop in your organization. As 2015 came to a close, conversations had clearly shifted from “what” and “why” to “how”. Technology is just one part of the equation when it comes to re-architecting with Hadoop. Organizations realize that they also need to factor in people and process changes, and oftentimes, figuring out how to manage all the change is daunting.
In 2016 we’ll see a desire for education on the path to success. “What’s the right starting point today?” “How do I scale it once deployed?” and “How do I move from smaller departmental projects to larger enterprise-wide initiatives?” are all questions that need to be addressed. The technology is wicked cool but it’s only one piece of the puzzle. Adoption will accelerate as more and more organizations step back, assess where they are and where they’re trying to go, and develop a thoughtful strategy to make the journey as smooth as possible.
For our part, we’ll continue to simplify the product as much as possible. Take Cloudera Navigator Optimizer, for example, which provides visibility into workloads, and helps customers understand the ones that are best suited for deployment on Hadoop to reduce development time and improve performance. Incremental improvements like this go a long way to making adoption easier.
Cloud-y with a chance of IoT
Talk about buzzword density, right? Internet of Things (IoT) is arguably one of the most hyped topics since big data. Hype or not, it’s here to stay and it will only continue to pick up speed as the year progresses. Cloud is clearly not going away either. For Hadoop in particular, I expect cloud will continue to come to the fore as the gravity of data shifts from on premises to cloud.
In a recent webinar, we asked attendees how much of their data resides in cloud versus on premises. Perhaps not surprisingly, 45% of respondents, or roughly half of enterprises today, said all of their data lives on premises. By contrast, only 33% expect all of their data to reside in the cloud by 2020. I’d say we’re hard pressed to see a time when all data will reside in the cloud, but if our findings in this survey tell us anything, it’s that there is clearly a shift to a hybrid cloud model for data, and we anticipate the same for Hadoop deployment. In a separate webinar on tips for success in production with Amazon Web Services (AWS), 54% of surveyed respondents said “perceived Hadoop complexity” was a barrier to cloud adoption. That’s why we’ve invested in simplifying the cloud experience through tools like Cloudera Director, and will continue to focus on making the entire platform more consumable on an ongoing basis.
IoT is, well, one of those topics we’re going to hear a lot about. Cloudera customers have long been doing very interesting things in the area of IoT. Vivint, for example, is delivering consumer IoT solutions using Cloudera, enabling the connected home and delivering an improved service experience. On the other hand, Omneo is pushing the boundaries in industrial IoT, optimizing the supply chain in real time and delivering $15-20M USD in savings along the way. Be it improved customer experience or data-driven products, expect IoT to be a central part of virtually every conversation in the foreseeable future.
So tell me, what do you think?