YoVDO

Columnar Data Formats Enabling Serverless Data Analysis at Scale

Offered By: Devoxx via YouTube

Tags

Devoxx Courses
Data Collection Courses
Big Data Analytics Courses
Data Classification Courses
Serverless Computing Courses

Course Description

Overview

This 52-minute conference talk from Devoxx covers how columnar data formats and serverless computing support data analysis at scale. It explains how the Parquet and ORC formats improve query performance and lower costs in analytics workloads, and how combining columnar storage with serverless platforms such as AWS Lambda can simplify big data analytics, data collection, and ETL orchestration while reducing total cost of ownership. The talk also covers common data challenges, database strategy, and columnar data layouts; use cases for traditional data warehouses and data lakes; and supporting components such as data catalogs, data classification, and partitioning. It also discusses how these technologies affect long-running queries and big data processing, and how dark data can be put to use for analysis.

Syllabus

Intro
Data challenges
Big data
Database strategy
Collect phase
Columnar data
Columnar layout
Why it matters
Long queries
Big data guys
Use cases
Traditional data warehouse
Data lake
Components
Service
Dark data
Data catalog
Data classification
Schema similarity
Partitions
Spectrum
Statistics


Taught by

Devoxx

Related Courses

Introduction to Cloud Infrastructure Technologies
Linux Foundation via edX
Cloud Computing
Indian Institute of Technology, Kharagpur via Swayam
Elastic Cloud Infrastructure: Containers and Services en Español
Google Cloud via Coursera
Kyma – A Flexible Way to Connect and Extend Applications
SAP Learning
Modernize Infrastructure and Applications with Google Cloud
Google Cloud via Coursera