Processing and Data Retrieval in a Hadoop and Spark Environment
In this course, you will characterize Hive, Drill, Impala, and JAQL-like query languages; describe Pig and Pig Latin for creating MapReduce jobs; load and inspect data in Apache Spark; and create a Spark application. You will also use Flume to collect, aggregate, and move streaming data.
You'll Walk Away With
- The ability to use MapReduce to process unstructured data
- An understanding of how Spark fits into the big data application stack
- The confidence to build and launch a standalone Spark application
- Practical experience loading data with Pig and building a data flow on that data to illustrate extraction, transformation, and loading (ETL)
- The skills to use Flume to collect, aggregate, and move streaming data into the Hadoop Distributed File System (HDFS)