Get Trained on Apache Spark with New Comprehensive Learning Paths

Cloudera has been providing world-class Apache Hadoop training for developers and administrators since 2009, and we have always striven to make sure that our courses reflect customers’ real-world needs. We don’t teach concepts in isolation; we teach them in such a way that they can immediately be applied to real-world problems. That focus and quality have helped to ensure that we are, year after year, the leading provider of Hadoop training – with over 40,000 trained – and have provided our customers with the foundations they need to be successful in their Hadoop deployments.

We recently announced the One Platform initiative, our plan to make Apache Spark the de facto data processing framework for Hadoop. As Mike Olson wrote, we believe Spark will supplant MapReduce – and, indeed, over 150 of our customers are already using it for a wide range of workloads. Spark is easier for developers to learn than MapReduce and was designed to address some of its shortcomings.

In support of that initiative, we’re pleased to announce an expanded set of Spark-based training courses. We were the first company to offer multi-day, structured Spark training, which draws many hundreds of students every month, and we are the only company whose Spark training emphasizes that Spark is part of a complete solution rather than something that exists in a vacuum. We have now expanded our Spark offerings into a full developer learning path designed to ensure that our customers can create production-ready big data applications as quickly as possible.

To successfully create big data applications, developers need to understand how Spark fits in as part of the whole Hadoop environment: how to get data into the cluster, how to prepare it for analysis with tools such as Impala and Hive and, of course, how to process it with Spark. That’s how we designed our new Developer learning path: it is, quite simply, the fastest way to go from a standing start to being able to create complex, real-world big data applications.

The Developer learning path starts with Developer Training for Spark and Hadoop I, where we cover data ingestion using tools such as Apache Sqoop and Apache Flume, data modeling (choosing the right data format, partitioning data appropriately and so on), and data processing with Spark. Students who complete this four-day course will be ready to start building applications, even if they were not at all familiar with Hadoop beforehand.

The learning path continues with Developer Training for Spark and Hadoop II: Advanced Techniques. This course is designed for people who have taken the first course, gained some real-world experience, and are now ready to take the next step with a deep dive into more advanced tools and techniques. Covering subjects such as Spark Streaming and Apache Kafka, this second course will prepare developers for the more complex challenges they’ll face while creating large-scale production applications.

Of course, many developers – particularly those who have attended our MapReduce Developer Training course – are already familiar with much of the Hadoop ecosystem and just want to learn how to develop Spark applications. For those people, our Developer Training for Spark course is the right choice. This course concentrates solely on how to develop Spark and Spark Streaming applications on a YARN-enabled CDH cluster.

For those who want to take their learning further still and use Spark for machine learning and analytics on massive data sets, we are also announcing our Data Science at Scale with Spark and Hadoop course, which will include coverage of the popular MLlib machine learning library.

As the first Hadoop distribution to ship and support Spark, Cloudera has unprecedented experience running Spark in production – powering a wide range of use cases across all industries at scale. This experience translates into expert Spark trainers and courses grounded in real-world practice, so students always have access to the latest innovations and to applicable, comprehensive training.

At Cloudera, we believe that success with Hadoop starts with getting the right training. And you’ll find the world’s best training in our new, Spark-based courses.

