Introduction to Big Data and the Hadoop Ecosystem
This course provides an introduction to big data, the Hadoop framework for distributed storage and processing, and the so-called Hadoop ecosystem of products used for querying, analyzing, and scripting work. The course first covers the installation of virtual machine software and prebuilt Hadoop systems. It then discusses earlier database models, such as network and relational databases, whose shortcomings led to the development of Hadoop. The course further investigates the features of Hadoop that are used for in-memory databases. You will learn basic SQL, because the most recent Hadoop query tools use dialects of SQL for querying and analysis. The course also examines what big data is, introduces the Hadoop components, and describes examples of the uses of Hadoop.
You'll Walk Away With
- The ability to contrast the characteristics of Hadoop HBase with those of the earlier network and hierarchical databases
- The skills to use the set of Linux commands that are applicable to the Hadoop ecosystem
- The confidence to run queries and perform basic analysis using SQL-based tools for Hadoop, such as Hive and Drill
- The capacity to describe MapReduce, the primary processing technique of Hadoop
- Knowledge of the function and use of the primary tools in the Hadoop ecosystem