YoVDO

State-of-the-Art Retrieval Augmented Generation at Scale in Spark NLP

Offered By: Databricks via YouTube

Tags

Retrieval Augmented Generation Courses, LangChain Courses, Vector Databases Courses, Data Normalization Courses, Text Embedding Courses

Course Description

Overview

Explore a 33-minute conference talk on scaling Retrieval Augmented Generation (RAG) systems using Spark NLP. Learn how to overcome the challenges of processing large document sets and complex pipelines when moving from proof of concept to production. Discover techniques for efficiently scaling pre-processing pipelines, handling multimodal inputs, document segmentation, and data normalization. Understand how to compute text embeddings faster than with Hugging Face and load them into vector databases. Explore post-processing modules such as reranking, filtering, expansion, and keyword extraction that require no additional libraries. Gain insights on integrating with LangChain and Haystack. Ideal for data scientists building production-grade LLM systems, this talk by David Talby and Veysel Kocaman of John Snow Labs offers practical solutions for improving RAG performance at scale.

Syllabus

State-of-the-Art Retrieval Augmented Generation at Scale in Spark NLP


Taught by

Databricks

Related Courses

Compare time series predictions of COVID-19 deaths
Coursera Project Network via Coursera
Connecting Systems and Machines to AWS for Industrial Manufacturing
Amazon Web Services via AWS Skill Builder
Data Warehouse Fundamentals
IBM via Coursera
Getting Started with Teradata
LearnQuest via Coursera
Preprocessing Unstructured Data for LLM Applications
DeepLearning.AI via Coursera