Impala Introduction & Essentials - 2-Day Bootcamp

Impala: An Open Source SQL Engine for Hadoop is a course for individuals who want to understand the basic concepts of Impala, a Massively Parallel Processing (MPP) SQL query engine that runs on Apache Hadoop. On completing this course, learners will be able to explain the role of Impala in the Big Data ecosystem.

The course focuses on the basics of Impala. It also provides an overview of Impala's performance relative to other popular SQL-on-Hadoop systems.

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Impala ecosystem and learn to:

  • Describe Impala and its role in the Hadoop ecosystem

  • Explain how to query data using Impala SQL

  • Discuss partitioning of Impala tables and explain its benefits

  • List the factors affecting the performance of Impala

  • Describe the complete flow of SQL query execution in Impala

Outline:

 

1. An Introduction to Impala

 

An Overview of Impala

What is Impala?

The benefits of Impala

Exploratory Business Intelligence

Impala Installation

Starting and Stopping Impala

Data Storage

Managing Metadata 

Controlling Access to Data 

Impala Shell Commands and Interface

 

2. Querying with Hive and Impala

Querying with Hive and Impala

SQL Language Statements

DDL Statements

CREATE DATABASE

CREATE TABLE

Internal and External Tables

Loading Data into Impala Tables

ALTER TABLE

DROP TABLE

DROP DATABASE

DESCRIBE Statement

EXPLAIN Statement

SHOW TABLES Statement

INSERT Statement

SELECT Statement

Data Types

Operators

Functions

CREATE VIEW in Impala

Hive and Impala Query Syntax

 

3. Data Storage and File Format

About Data Storage and File Formats

Partitioning Tables

SQL Statements for Partitioned Tables

File Format and Performance Considerations

Choosing the File Type and Compression Technique

 

4. Working with Impala

Working with Impala

Impala Architecture

What is the Impala Daemon?

About the Impala Statestore

Impala Catalog Service

Query Execution Flow in Impala

User-Defined Functions

Hive UDFs with Impala

Improving Impala Performance