Superfast RAG with Llama 3 and Groq - Implementing a Retrieval-Augmented Generation Pipeline
Offered By: James Briggs via YouTube
Course Description
Overview
Explore a 17-minute video tutorial on implementing a Retrieval-Augmented Generation (RAG) pipeline using Meta's Llama 3 70B model via the Groq API, an open-source e5 encoder, and the Pinecone vector database. Learn how to leverage Groq's Language Processing Units (LPUs) for ultra-fast LLM inference, set up Llama 3 in Python, initialize e5 for embeddings, and use Pinecone for efficient retrieval. Discover why title and content are concatenated before embedding, test RAG retrieval performance, and generate answers with Llama 3 70B. Gain insight into why Groq matters for AI applications, and use the provided code repository for hands-on practice. An illustrative sketch of the Groq call follows, and sketches for the remaining stages appear after the syllabus below.
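As a starting point, here is a minimal sketch of calling Llama 3 70B through Groq's official Python client. It is illustrative rather than the video's exact code: the model identifier "llama3-70b-8192" was Groq's name for Llama 3 70B at the time of the video, and reading the key from the GROQ_API_KEY environment variable is an assumption.

    import os
    from groq import Groq

    # The client can also pick up GROQ_API_KEY from the environment on its
    # own; it is passed explicitly here for clarity.
    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    # Groq's LPU-backed inference is what makes a 70B-parameter chat
    # completion like this return unusually fast.
    response = client.chat.completions.create(
        model="llama3-70b-8192",  # check Groq's model list; names change
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "In one sentence, what is RAG?"},
        ],
    )
    print(response.choices[0].message.content)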
Syllabus
Groq and Llama 3 for RAG
Llama 3 in Python
Initializing e5 for Embeddings
Using Pinecone for RAG
Why We Concatenate Title and Content
Testing RAG Retrieval Performance
Initializing Connection to the Groq API
Generating RAG Answers with Llama 3 70B
Final Points on Why Groq Matters
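To make the syllabus steps concrete, the three sketches below walk through embedding, retrieval, and answer generation under stated assumptions; they are illustrative, not the video's code. First, initializing e5 for embeddings: the checkpoint "intfloat/e5-base-v2" and the sentence-transformers library are assumptions, and the corpus is a toy example. The "passage:"/"query:" prefixes are part of how e5 models were trained, and concatenating title and content gives the encoder more context per chunk, which is the rationale the video discusses.

    from sentence_transformers import SentenceTransformer

    # Assumed checkpoint; the video uses an open-source e5 encoder,
    # which may be a different size (e.g. e5-large).
    encoder = SentenceTransformer("intfloat/e5-base-v2")

    # Toy corpus for illustration only.
    titles = ["Groq LPU", "Llama 3"]
    contents = [
        "Groq's Language Processing Unit runs LLM inference at very high speed.",
        "Llama 3 70B is Meta's open-weights large language model.",
    ]

    # e5 expects "passage: " on documents and "query: " on questions;
    # title and content are joined into a single passage string.
    doc_vectors = encoder.encode(
        [f"passage: {t}\n{c}" for t, c in zip(titles, contents)],
        normalize_embeddings=True,
    )
    query_vec = encoder.encode("query: What is an LPU?", normalize_embeddings=True)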
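Next, using Pinecone for retrieval, continuing from the embedding sketch above. The index name "llama-3-rag", the dimension (768 matches e5-base; e5-large would be 1024), and the serverless cloud/region settings are all assumptions.

    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
    index_name = "llama-3-rag"  # hypothetical index name

    # Create the index once; the dimension must match the encoder's output.
    if index_name not in pc.list_indexes().names():
        pc.create_index(
            name=index_name,
            dimension=768,
            metric="cosine",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),
        )
    index = pc.Index(index_name)

    # Store the raw title and content as metadata so retrieved matches
    # can be fed back into the prompt later.
    index.upsert(vectors=[
        {
            "id": f"doc-{i}",
            "values": vec.tolist(),
            "metadata": {"title": t, "content": c},
        }
        for i, (vec, t, c) in enumerate(zip(doc_vectors, titles, contents))
    ])

    # Retrieve the most similar passages for the query embedding.
    results = index.query(vector=query_vec.tolist(), top_k=3, include_metadata=True)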
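Finally, generating a RAG answer with Llama 3 70B, reusing the Groq client from the first sketch and the Pinecone results from the previous one. The generate_answer helper and its prompt template are hypothetical; the idea is simply to place the retrieved passages in the system prompt ahead of the user's question.

    def generate_answer(question: str, results) -> str:
        # Hypothetical helper: build a context block from Pinecone matches.
        context = "\n---\n".join(
            f"{m.metadata['title']}\n{m.metadata['content']}"
            for m in results.matches
        )
        response = client.chat.completions.create(
            model="llama3-70b-8192",
            messages=[
                {
                    "role": "system",
                    "content": "Answer using only the context below.\n\n"
                               f"CONTEXT:\n{context}",
                },
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(generate_answer("What is an LPU?", results))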
Taught by
James Briggs
Related Courses
Metadata Filtering for Vector Search - Latest Filter Tech (James Briggs via YouTube)
Cohere vs. OpenAI Embeddings - Multilingual Search (James Briggs via YouTube)
Building the Future with LLMs, LangChain, & Pinecone (Pinecone via YouTube)
Supercharging Semantic Search with Pinecone and Cohere (Pinecone via YouTube)
Preventing Déjà Vu - Vector Similarity Search for Security Alerts, with Expel and Pinecone (Pinecone via YouTube)