YoVDO

The Magic of Multilingual Search with Pinecone Serverless and Inference

Offered By: Pinecone via YouTube

Tags

Pinecone Courses, Vector Databases Courses, Language Models Courses, Semantic Search Courses

Course Description

Overview

Explore the intricacies of multilingual search and learn how to use Pinecone Inference and Pinecone Serverless to build multilingual applications in this 54-minute talk by Arjun Patel, Developer Advocate at Pinecone. See how multilingual embedding models work, what multilingual models and Pinecone Inference offer search applications, and follow a practical language-learning demo covering both cross-lingual and monolingual search. Topics include vector embeddings, LLMs, XLM-RoBERTa, Multilingual E5, and the basics of vector search with Pinecone. The presentation also covers advanced concepts such as weakly supervised contrastive pretraining, supervised fine-tuning, and handling cultural nuance in multilingual semantic search. It closes with a Q&A session on evaluation methods, out-of-domain languages, and the challenges of low-resource languages.
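
To make the difference between cross-lingual and monolingual search concrete, here is a minimal sketch that is not taken from the talk. It assumes sentences have already been embedded by a multilingual model; the three-dimensional vectors, the `CORPUS` list, and the `search` helper are all hypothetical stand-ins chosen for illustration, and no Pinecone API is called.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-picked "embeddings": sentences with the same meaning
# sit close together regardless of language, which is the property a
# multilingual embedding model is trained to produce.
CORPUS = [
    {"text": "The cat sleeps on the sofa.", "lang": "en", "vec": [0.90, 0.10, 0.00]},
    {"text": "El gato duerme en el sofá.", "lang": "es", "vec": [0.88, 0.15, 0.05]},
    {"text": "I am learning to cook.", "lang": "en", "vec": [0.10, 0.90, 0.20]},
    {"text": "Estoy aprendiendo a cocinar.", "lang": "es", "vec": [0.12, 0.85, 0.25]},
]

def search(query_vec, top_k=2, lang=None):
    """Rank documents by cosine similarity to the query vector.

    lang=None searches every language (cross-lingual);
    passing a language code restricts results to it (monolingual).
    """
    candidates = [d for d in CORPUS if lang is None or d["lang"] == lang]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [(d["text"], d["lang"]) for d in ranked[:top_k]]

# Stand-in for the embedding of an English query about a napping cat.
query = [0.92, 0.08, 0.02]

cross_lingual = search(query)                     # any language may match
monolingual = search(query, top_k=1, lang="es")   # Spanish results only
```

In a real application the language restriction would more likely be expressed as a metadata filter on the vector index than as a Python list comprehension; the sketch only shows that the same query vector serves both search modes.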

Syllabus

- Introduction
- Tatoeba and Multilingual Semantic Search
- What is Multilingual Semantic Search?
- Applications of Multilingual Semantic Search
- How do we achieve multilingual semantic search?
- A Crash Course in LLMs
- What are Vectors and Vector Embeddings?
- Distributional Hypothesis
- What are LLMs anyway?
- How does XLM-RoBERTa work?
- XLM-R: Big Multilingual Datasets
- XLM-R: Tokenization
- XLM-R: Masked Language Modeling
- Getting Doc embeddings
- Why XLM-R Isn't Enough
- Multilingual E5 for Multilingual Search Embeddings
- mE5: Training Data
- mE5: Weakly Supervised Contrastive Pretraining
- mE5: Supervised Finetuning and Dataset Distribution
- Basics of Vector Search with Pinecone
- Using Pinecone Inference
- Querying with Pinecone
- Demo Time: Language Learning with Multilingual Semantic Search
- Demo Architecture
- Live walkthrough of Notebook
- Embedding with Pinecone Inference
- Batch Embedding and Upsertion
- Query Embeddings and Cross-Lingual Search
- Tips and Tricks for Multilingual Semantic Search
- Q&A Time
- Evaluating Semantic Search
- Language Embedding Theory
- What happens for Out of Domain Languages? Transfer Theory
- Why isn't Translation Sufficient?
- Handling Negation in Queries
- Handling Cultural Nuance
- Low Resource Languages


Taught by

Pinecone

Related Courses

Metadata Filtering for Vector Search - Latest Filter Tech
James Briggs via YouTube
Cohere vs. OpenAI Embeddings - Multilingual Search
James Briggs via YouTube
Building the Future with LLMs, LangChain, & Pinecone
Pinecone via YouTube
Supercharging Semantic Search with Pinecone and Cohere
Pinecone via YouTube
Preventing Déjà Vu - Vector Similarity Search for Security Alerts, with Expel and Pinecone
Pinecone via YouTube