YoVDO

State-of-the-Art Retrieval Augmented Generation at Scale in Spark NLP

Offered By: Databricks via YouTube

Tags

Retrieval Augmented Generation Courses
LangChain Courses
Vector Databases Courses
Data Normalization Courses
Text Embedding Courses

Course Description

Overview

Explore a 33-minute conference talk on scaling Retrieval Augmented Generation (RAG) systems with Spark NLP. Learn how to overcome the challenges of processing large document sets and running complex pipelines when moving from proof of concept to production. Discover techniques for efficiently scaling pre-processing pipelines, including handling multimodal inputs, document segmentation, and data normalization. Understand how to calculate text embeddings faster than with Hugging Face and load them into vector databases. Explore post-processing modules such as reranking, filtering, expansion, and keyword extraction that require no additional libraries. Gain insights on integrating with LangChain and Haystack. Ideal for data scientists building production-grade LLM systems, this talk by David Talby and Veysel Kocaman of John Snow Labs offers practical solutions for improving RAG performance at scale.

Syllabus

State-of-the-Art Retrieval Augmented Generation at Scale in Spark NLP


Taught by

Databricks

Related Courses

Prompt Templates for GPT-3.5 and Other LLMs - LangChain
James Briggs via YouTube
Getting Started with GPT-3 vs. Open Source LLMs - LangChain
James Briggs via YouTube
Chatbot Memory for Chat-GPT, Davinci + Other LLMs - LangChain
James Briggs via YouTube
Chat in LangChain
James Briggs via YouTube
LangChain Data Loaders, Tokenizers, Chunking, and Datasets - Data Prep
James Briggs via YouTube