Superfast RAG with Llama 3 and Groq - Implementing a Retrieval-Augmented Generation Pipeline
Offered By: James Briggs via YouTube
Course Description
Overview
Explore a 17-minute video tutorial on implementing a Retrieval-Augmented Generation (RAG) pipeline using Meta's Llama 3 70B model via the Groq API, an open-source e5 encoder, and the Pinecone vector database. Learn how to leverage Groq's Language Processing Units (LPUs) for ultra-fast LLM inference, set up Llama 3 in Python, initialize e5 for embeddings, and use Pinecone for efficient retrieval. Discover why title and content are concatenated before embedding, test RAG retrieval performance, and generate answers with Llama 3 70B. Gain insight into why Groq matters for AI applications, and use the provided code repository for hands-on practice. An illustrative sketch of the Groq call follows, and sketches for the remaining stages appear after the syllabus below.
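As a starting point, here is a minimal sketch of calling Llama 3 70B through Groq's official Python client. It is illustrative rather than the video's exact code: the model identifier "llama3-70b-8192" was Groq's name for Llama 3 70B at the time of the video, and reading the key from the GROQ_API_KEY environment variable is an assumption.

    import os
    from groq import Groq

    # The client can also pick up GROQ_API_KEY from the environment on its
    # own; it is passed explicitly here for clarity.
    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    # Groq's LPU-backed inference is what makes a 70B-parameter chat
    # completion like this return unusually fast.
    response = client.chat.completions.create(
        model="llama3-70b-8192",  # check Groq's model list; names change
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "In one sentence, what is RAG?"},
        ],
    )
    print(response.choices[0].message.content)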
Syllabus
Groq and Llama 3 for RAG
Llama 3 in Python
Initializing e5 for Embeddings
Using Pinecone for RAG
Why We Concatenate Title and Content
Testing RAG Retrieval Performance
Initializing Connection to the Groq API
Generating RAG Answers with Llama 3 70B
Final Points on Why Groq Matters
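To make the syllabus steps concrete, the three sketches below walk through embedding, retrieval, and answer generation under stated assumptions; they are illustrative, not the video's code. First, initializing e5 for embeddings: the checkpoint "intfloat/e5-base-v2" and the sentence-transformers library are assumptions, and the corpus is a toy example. The "passage:"/"query:" prefixes are part of how e5 models were trained, and concatenating title and content gives the encoder more context per chunk, which is the rationale the video discusses.

    from sentence_transformers import SentenceTransformer

    # Assumed checkpoint; the video uses an open-source e5 encoder,
    # which may be a different size (e.g. e5-large).
    encoder = SentenceTransformer("intfloat/e5-base-v2")

    # Toy corpus for illustration only.
    titles = ["Groq LPU", "Llama 3"]
    contents = [
        "Groq's Language Processing Unit runs LLM inference at very high speed.",
        "Llama 3 70B is Meta's open-weights large language model.",
    ]

    # e5 expects "passage: " on documents and "query: " on questions;
    # title and content are joined into a single passage string.
    doc_vectors = encoder.encode(
        [f"passage: {t}\n{c}" for t, c in zip(titles, contents)],
        normalize_embeddings=True,
    )
    query_vec = encoder.encode("query: What is an LPU?", normalize_embeddings=True)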
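Next, using Pinecone for retrieval, continuing from the embedding sketch above. The index name "llama-3-rag", the dimension (768 matches e5-base; e5-large would be 1024), and the serverless cloud/region settings are all assumptions.

    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
    index_name = "llama-3-rag"  # hypothetical index name

    # Create the index once; the dimension must match the encoder's output.
    if index_name not in pc.list_indexes().names():
        pc.create_index(
            name=index_name,
            dimension=768,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )
    index = pc.Index(index_name)

    # Store the raw title and content as metadata so retrieved matches
    # can be fed back into the prompt later.
    index.upsert(vectors=[
        {
            "id": f"doc-{i}",
            "values": vec.tolist(),
            "metadata": {"title": t, "content": c},
        }
        for i, (vec, t, c) in enumerate(zip(doc_vectors, titles, contents))
    ])

    # Retrieve the most similar passages for the query embedding.
    results = index.query(vector=query_vec.tolist(), top_k=3, include_metadata=True)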
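Finally, generating a RAG answer with Llama 3 70B, reusing the Groq client from the first sketch and the Pinecone results from the previous one. The generate_answer helper and its prompt template are hypothetical; the idea is simply to place the retrieved passages in the system prompt ahead of the user's question.

    def generate_answer(question: str, results) -> str:
        # Hypothetical helper: build a context block from Pinecone matches.
        context = "\n---\n".join(
            f"{m.metadata['title']}\n{m.metadata['content']}"
            for m in results.matches
        )
        response = client.chat.completions.create(
            model="llama3-70b-8192",
            messages=[
                {
                    "role": "system",
                    "content": "Answer using only the context below.\n\n"
                               f"CONTEXT:\n{context}",
                },
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(generate_answer("What is an LPU?", results))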
Taught by
James Briggs
Related Courses
Metadata Filtering for Vector Search - Latest Filter Tech (James Briggs via YouTube)
Cohere vs. OpenAI Embeddings - Multilingual Search (James Briggs via YouTube)
Building the Future with LLMs, LangChain, & Pinecone (Pinecone via YouTube)
Supercharging Semantic Search with Pinecone and Cohere (Pinecone via YouTube)
Preventing Déjà Vu - Vector Similarity Search for Security Alerts, with Expel and Pinecone (Pinecone via YouTube)