Apache Hadoop – The Data Management Platform for IoT

IoT – It’s all about the Data

Billions of devices, from cars, homes, airplanes, apparel, and parking meters to factories, oil rigs, heavy machinery, and wearables, will be connected to the internet. More importantly, they will be interconnected, enabling businesses to work smarter, faster, and more profitably. According to recent research from IDC, about 32 billion things will be connected by 2020, helping enterprises drive efficiencies and launch new products and services.

While most of the attention up until now has been on the things or objects, the Internet of Things (IoT) isn’t really about the things themselves, or about connecting them to the Internet. IoT is going to be all about data. With 30+ billion things connected, IoT will drive an explosion of data that will need to be stored, processed, analyzed, and served, in some cases in real time, to drive business value. More importantly, the success of IoT deployments will depend on the ability of organizations to gain insights from all of this data in order to drive efficiencies and improve customer experience.

IoT Data – A different paradigm for Enterprises

The volume, variety, and inherent characteristics of data generated by IoT and connected devices will challenge traditional data management approaches and methodologies. Key characteristics of IoT data include:

  • Massive volumes of intermittent data streams – millions of events per minute
  • Predominantly time-series data
  • Varied data sources – from sensor readings to live video streams
  • Diverse data structures & schemas based on the sources
  • Comes in streams or batches
  • Some of it may be perishable – value of data decreases over time
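
To make these characteristics a little more concrete, here is a minimal Python sketch of time-stamped events from two sources with different fields, plus an illustrative way to express "perishable" data as a weight that decays with age. The `Event` class, the field names, the sample values, and the 60-second half-life are all assumptions made up for illustration; none of them come from a Hadoop or Cloudera API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """One time-stamped reading; the payload schema varies by source."""
    source: str
    timestamp: float  # seconds since epoch
    payload: Dict[str, Any] = field(default_factory=dict)


def freshness(event: Event, now: float, half_life_s: float = 60.0) -> float:
    """Illustrative weight in (0, 1] that halves every half_life_s seconds."""
    age = max(0.0, now - event.timestamp)
    return 0.5 ** (age / half_life_s)


# Two hypothetical sources with different payload structures.
events = [
    Event("vehicle-7", 1030.0, {"lat": 37.35, "lon": -121.95, "speed_kmh": 64}),
    Event("thermostat-1", 1000.0, {"temp_c": 21.5}),
]

# Time-series data is naturally ordered by timestamp.
events.sort(key=lambda e: e.timestamp)

# Older events receive a lower weight than newer ones.
weights = [freshness(e, now=1060.0) for e in events]
```

In this sketch an event that is one half-life old gets a weight of 0.5, so a consumer could choose to prioritize or discard readings based on that weight.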

Given this complexity, organizations will need to fundamentally re-think their data management strategy and will need a platform that is optimized for the scale and complexity that IoT presents.

Hadoop as the Data Platform for IoT

Given the characteristics of IoT data streams, leading organizations around the globe are increasingly adopting Apache Hadoop and Cloudera as the standard data management platform for storing, managing, processing and, more importantly, driving analytics from all of their data.

Key attributes that make Hadoop well suited to IoT data management and analytics include:

  • Flexible Data Ingest: Easily ingests data from multiple data sources and supports both batch and real-time data ingest from sensors using tools such as Apache Kafka and Apache Flume
  • Handle Data Variety: Effectively handles multiple IoT data types, structures, and schemas – from intermittent sensor readings of temperature and pressure to real-time location data or streaming live video feeds

[Figure: Architecture slide]

  • Flexible & Scalable Data Processing Platform: Scales easily and efficiently with data growth, so an enterprise can keep storing data as volumes increase. More importantly, the platform enables you to combine IoT/sensor data with other internal and external data sources to drive deeper business insights
  • Deployment Flexibility: Deploy the platform on-premises, in the cloud, or in a hybrid environment based on the needs of your business, while still benefiting from centralized management
  • Fundamentally Secure: Security is central to IoT. With Cloudera, organizations can take advantage of the only compliance-ready Hadoop platform, with multiple layers of security and industry-leading security tools
  • Fast Analytics: Open up this data to self-service business intelligence and analytics with tools like Apache Impala (incubating) and integrations with leading BI partner tools
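
As a rough illustration of two ideas from the list above, handling the same readings either one at a time or in batches, and combining sensor data with another internal data source, here is a small pure-Python sketch. It does not use Kafka, Flume, or Impala. The function names, the `assets` lookup table, and the sample readings are hypothetical and exist only for this example.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

Reading = Dict[str, object]


def micro_batches(stream: Iterable[Reading], size: int) -> Iterator[List[Reading]]:
    """Group a stream of readings into lists of at most `size` items."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def enrich(reading: Reading, assets: Dict[str, Dict[str, str]]) -> Reading:
    """Merge a sensor reading with reference data keyed by device id."""
    extra = assets.get(str(reading["device_id"]), {})
    return {**reading, **extra}


# Hypothetical reference data from another internal system.
assets = {"pump-1": {"site": "plant-a"}, "pump-2": {"site": "plant-b"}}

# Hypothetical sensor readings arriving as a stream.
readings = [
    {"device_id": "pump-1", "pressure_kpa": 310},
    {"device_id": "pump-2", "pressure_kpa": 295},
    {"device_id": "pump-1", "pressure_kpa": 312},
]

batches = list(micro_batches(readings, size=2))
enriched = [enrich(r, assets) for batch in batches for r in batch]
```

A batch size of 1 approximates per-event processing, while a larger size trades latency for fewer, bigger units of work. On a real cluster the ingest and query tools named above would take over these roles.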

Today, a number of leading organizations, including automotive manufacturers, utilities, industrial automation companies, insurers, healthcare organizations, and telecom and technology leaders, are adopting Hadoop and Cloudera Enterprise as their data management platform. They use it to power some of the most compelling IoT use cases: connected vehicles & telemetry, connected homes, predictive maintenance & industrial IoT, smart cities, usage-based insurance, and healthcare IoT.

Connect with Cloudera @ IoT World


To learn more about the Cloudera Enterprise Hadoop platform and hear about IoT customer use cases, connect with us at the Internet of Things World event in Santa Clara, May 10-12, 2016, at Booth #603.


Related Links:
Learn more about how Cloudera powers the Internet of Things
Discover more about the Internet of Things World Event

