In March of this year, Cloudera and Intel partnered to evaluate the state of big data adoption and to take the pulse of the modern-day challenges associated with big data system implementation and the adjacent data center transformation. Over the years, key shifts in technology have had implications for data center architecture, which have motivated companies to periodically modernize their entire environment. One of the most pronounced is the rise of open-source software, which no longer couples application software to a rigid set of hardware requirements. Apache Hadoop is a strong departure from traditional database architectures in that it scales horizontally and can run on commodity server hardware. Implementing Hadoop is more than just rolling in an appliance and putting some thought into integration points; it requires a look at the systems you run today and a strategy for how Hadoop can complement, extend, or even replace traditional solutions.
Cloudera and Intel jointly commissioned Unisphere Research, a division of Information Today, Inc., to survey IT and corporate line-of-business managers involved in or responsible for data center operations. The 319 respondents came from companies spanning a wide range of sizes and industries. Respondents were asked a series of questions about how they use big data technology today or plan to use it in the future. The full research is available here; below are some of my favorite highlights.
- Security and data governance are major challenges presented by big data growth. At its inception, many users leveraged Hadoop for data sources like clickstream, web logs, and social media feeds. These were easy first endeavours because they were often publicly available data sets that didn’t require much thought around governance or securing access from bad actors. However, as Hadoop’s capabilities grew over the years and organizations began building out large-scale architectures to house all their data in Hadoop, security and governance became a necessity. They became even more of a concern when mixing trusted and untrusted data. To make an enterprise data hub a reality, we have to ensure that compliance concerns are met and that we have enhanced visibility into what people are doing with the data. So it is not surprising that 60% of respondents said that security and governance were top concerns with big data growth. Cloudera has been focused for years on addressing many of the complex issues facing enterprise security. This often goes beyond just making sure that data is encrypted and access controls are in place; users also need good visibility into who is using their data, and they need lineage and auditing capabilities in the event that something unfortunate does occur. Customers choose Cloudera because tools like Cloudera Navigator are the most widely used big data governance tools on the market, running in some of the most demanding scenarios.
- Accessing data stuck in silos is the most challenging element in creating an impactful data pool for analytics. Walling off data and limiting access is great for security; there is no more surefire way to keep data safe than creating real physical separation. But this has its drawbacks. Limited data access means limited insight. How do we expect our analysts and data scientists to build the best models on limited data? Not only is this cumbersome for analysts, but managing multiple systems also creates operational complexity and lowers ROI. So for the data management professional, it is not surprising that 38% of respondents viewed data being stuck in silos as a pervasive issue. A key benefit of a data hub architecture is having all data in one place, with the ability to expand or restrict access to data down to the row/column level. Without this core consideration, we are simply adding more data to already constrained, disparate analysis environments.
- Users want to move from descriptive to predictive analytics. Perfunctory reporting and rigid dashboard building have had their moment in the sun, but the allure is quickly fading. How can a modern company that needs to shift based on data within days or minutes be constrained to looking at data from last week? That simply is not powerful enough to make data a strategic asset. Big data has played a major role in advancing new analytics and forward-thinking models. That is why many organizations are already using Hadoop to move toward more predictive analytic capabilities. Let’s start by looking at what companies are doing today. While 36% have moved to incorporating advanced analytics, 60% are still operating on descriptive analytic outputs. Another great data point here is that a move to advanced analytics is a major consideration for those considering implementing Hadoop, shown below.
- Clear alignment with business goals is the key factor in moving forward. Understanding the technology is one aspect of making an integral shift in your information architecture, but it’s not the only one. You can partner with a vendor, engage a systems integrator, or hire the right people to ensure you are successful with the technology. A much more complex challenge is changing the culture of the organization, not only to be successful with new technology but also to ensure that you have the right minds thinking the same way about how data is supposed to fuel the business. Gartner predicted that by 2017 only 50% or fewer of organizations will have made the cultural adjustments to benefit from big data. This is why nearly the same percentage of respondents to our survey stated that clear alignment with a specific business strategy was a determining factor for considering modernization. Even the best intentions with technology need business and executive buy-in in order to be successful. At Cloudera, we try to educate our users on how to best build a data strategy, assemble the right teams, and define the best criteria for determining success. At the end of the day, you are trying to accomplish a business goal, so your use of technology should map directly back to that goal.
Cloudera and Intel are excited to share these results straight from the battlefields with you in our full report. We imagine that as time passes, the conversation around “why big data?” will slowly fade into the background in favor of more strategic conversations on driving value from data. If you are like many of the users in our survey and are still unsure of many aspects of your data strategy, then we encourage you to reach out to our experts. We have helped some of the most forward-thinking organizations in this space start their journey, and it’s never too early to understand how you can begin to drive your business with strategic data.