Spark Batch Processing - Caching, DataFrame, Dataset, SparkSQL, and Bucketing in Iceberg (Day 2 Lab)

Week 4: Batch Pipelines with Apache Spark V2
63 mins
SQL, Data Modeling, ETL/ELT, Apache Spark, Docker, Apache Iceberg

Description

In this Spark lab session, we cover caching, the DataFrame and Dataset APIs, and SparkSQL, then explore how bucketing works in Iceberg for efficient data management.