YoVDO

The Free Lunch is Over: Dealing with Unstructured Data in the Era of LLMs

Offered By: The ASF via YouTube

Tags

Unstructured Data Courses, Apache Spark Courses, Image Processing Courses, Audio Processing Courses, Data Engineering Courses

Course Description

Overview

Explore the evolving landscape of unstructured data processing in this 37-minute conference talk from The Apache Software Foundation. Delve into the challenges and opportunities presented by recent advancements in Machine Learning and Large Language Models (LLMs) for Data Engineers. Learn how to process audio, images, and text using vector embeddings, and discover the requirements for building unstructured data-based pipelines. Gain insights into leveraging open-source tools and the Apache data engineering stack, including Apache Spark, Lucene, and Tika, to meet the growing demand for natural language processing of unstructured data sources. Speaker Ismaël Mejía, a Senior Cloud Advocate at Microsoft and Apache Software Foundation member, shares his expertise on adapting to these changes and maintaining unstructured data sources effectively.

Syllabus

The free lunch is over, we have to ‘really’ deal with unstructured data


Taught by

The ASF

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera