Open Source and Its Influence on the Apache Hadoop Ecosystem

By any measure, the Apache Hadoop ecosystem, less than 10 years old, is astoundingly successful. Few other open source platforms have been adopted so rapidly and so widely. Historically, only Linux rivals it in sheer gravitational influence on users and vendors.

But don’t take the reasons for Hadoop’s success for granted. They include:

  • All ecosystem components that store or process data are open source under a permissive license.
  • The platform is collaboratively developed and maintained in a manner that guarantees a level playing field for all contributors, not just those with commercial motives and money to spend.
  • The platform matures rapidly because it is continually refreshed with new components that arise from a vibrant, organic innovation process (see the previous point). The successful ones become de facto standards by virtue of widespread, grassroots adoption by users and then subsequent “ratification” by the commercial ecosystem (for example, Apache Spark, Impala, and Apache Kafka became multivendor standards in this manner).

Essentially, the story of Hadoop, and most likely its future, is one of continuing innovation from the ground up and from the edges in, without centralized control. And that’s the best possible arrangement for users and customers.

To learn more about this story, view a new four-part conversation between Tony Baer, Principal Analyst at Ovum Research, and Doug Cutting, Chief Architect at Cloudera and co-founder of Apache Hadoop. Watch Part 1 below, then click through to watch Parts 2-4:

Part 4: The Future of Open Source and Hadoop

In these conversations, you’ll learn how and why open source, open standards, and organic innovation have influenced the development and adoption of Hadoop.

To learn more about open source, open standards, and the Hadoop ecosystem, go to

