YoVDO

Building a Production Scale, Totally Private, OSS RAG Pipeline with DBRX, Spark, and LanceDB

Offered By: Databricks via YouTube

Tags

Retrieval Augmented Generation (RAG) Courses
MLOps Courses
Vector Databases Courses
Generative AI Courses
Data Security Courses

Course Description

Overview

Discover how to construct a production-scale, fully private, open-source RAG pipeline using DBRX, Spark, and LanceDB in this 22-minute conference talk. Learn about the challenges enterprises face when implementing AI in production, particularly regarding data security and the need to use external services for LLMs, embedding models, and vector databases. Explore how the latest release of DBRX offers a breakthrough in open-source model quality, giving enterprises a viable option for high-quality, self-hosted generative AI responses. Gain insights into LanceDB, an open-source solution that enables real-time serving for billion-scale embedding datasets with lower resource requirements than alternatives. Understand how LanceDB uses the Lance columnar format for data storage, allowing large-scale updates to be written quickly via Lance's Spark DataSource. See how the same dataset can serve both offline analytics and online serving in LanceDB for AI retrieval in RAG, agents, and more. Learn about LanceDB's embedding function registry and its ability to target custom embedding models served from MLflow without sending data off-premises. Explore how combining Spark, DBRX, and LanceDB enables a completely private generative AI pipeline within the lakehouse environment.

Syllabus

Building a Production Scale, Totally Private, OSS RAG Pipeline with DBRX, Spark, and LanceDB


Taught by

Databricks

Related Courses

Machine Learning Operations (MLOps): Getting Started
Google Cloud via Coursera
Design and Implementation of Machine Learning Systems
Higher School of Economics via Coursera
Demystifying Machine Learning Operations (MLOps)
Pluralsight
Machine Learning Engineer with Microsoft Azure
Microsoft via Udacity
Machine Learning Engineering for Production (MLOps)
DeepLearning.AI via Coursera