The Free Lunch is Over: Dealing with Unstructured Data in the Era of LLMs
Offered By: The ASF via YouTube
Course Description
Overview
Explore the evolving landscape of unstructured data processing in this 37-minute conference talk from The Apache Software Foundation. Delve into the challenges and opportunities presented by recent advancements in Machine Learning and Large Language Models (LLMs) for Data Engineers. Learn how to process audio, images, and text using vector embeddings, and discover the requirements for building unstructured data-based pipelines. Gain insights into leveraging open-source tools and the Apache data engineering stack, including Apache Spark, Lucene, and Tika, to meet the growing demand for natural language processing of unstructured data sources. Speaker Ismaël Mejía, a Senior Cloud Advocate at Microsoft and Apache Software Foundation member, shares his expertise on adapting to these changes and maintaining unstructured data sources effectively.
Syllabus
The free lunch is over, we have to ‘really’ deal with unstructured data
Taught by
The ASF
Related Courses
Recolección y exploración de datos
Tecnológico de Monterrey via Coursera
Applied Machine Learning
Microsoft via edX
Creating an Analytical Dataset
Udacity
NoSQL Database Systems
Arizona State University via Coursera
Foundations of mining non-structured medical data
EIT Digital via Coursera