The Apache Hadoop community recently released version 3.0.0 GA, the third major release in Hadoop’s 10-year history at the Apache Software Foundation. We covered earlier releases like 3.0.0-alpha1 and 3.0.0-alpha2 on the Cloudera Engineering blog, and 3.0.0 GA is bigger and better than ever. General availability (GA) marks a point of quality and stability for the release series that indicates it’s ready for broader use.
To recap, some of the major new features include:
- HDFS Erasure Coding, which lowers storage costs by up to 2x compared with the default 3x replication.
- YARN resource types, which allow scheduling for user-defined resources like GPUs, software licenses, and locally-attached storage.
- YARN Timeline Service v2, which improves the scalability, reliability, and usability of the existing Timeline Service.
- Improved support for cloud storage systems like S3 (with S3Guard), Microsoft Azure Data Lake, and Aliyun OSS.
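To make the erasure coding savings concrete, here is a rough back-of-the-envelope sketch. It assumes a Reed-Solomon layout with 6 data units and 3 parity units, compared against 3x replication; the function names and the 600 TB dataset size are hypothetical and chosen only for illustration.

```python
def raw_storage_replicated(logical_tb, replicas=3):
    """Raw capacity needed when every block is stored `replicas` times."""
    return logical_tb * replicas


def raw_storage_erasure_coded(logical_tb, data_units=6, parity_units=3):
    """Raw capacity needed when each group of `data_units` blocks
    carries `parity_units` extra parity blocks (assumed RS layout)."""
    return logical_tb * (data_units + parity_units) / data_units


logical_tb = 600  # hypothetical dataset size, in TB

replicated_tb = raw_storage_replicated(logical_tb)
erasure_coded_tb = raw_storage_erasure_coded(logical_tb)
reduction = replicated_tb / erasure_coded_tb

print(f"3x replication: {replicated_tb} TB raw")
print(f"Erasure coding: {erasure_coded_tb} TB raw")
print(f"Reduction:      {reduction}x")
```

Under these assumptions, raw capacity drops from three times the logical data size to one and a half times, which is where the "up to 2x" figure comes from. Actual savings depend on the erasure coding policy chosen and on file sizes.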
Congratulations to the Apache Hadoop community on this major milestone for the project!
Andrew Wang is a Software Engineer at Cloudera, an Apache Hadoop PMC member, and the release manager for 3.0.0.