Medical Search Engine with SPLADE + Sentence Transformers in Python
Offered By: James Briggs via YouTube
Course Description
Overview
Learn how to build a medical search engine in Python using hybrid search, which combines dense vectors from a sentence transformer with sparse vectors from SPLADE for medical question answering. The dense vectors capture semantic similarity, while the sparse vectors enable exact matching and keyword search. SPLADE is a learned sparse embedding method reported to outperform BM25; it reduces the vocabulary mismatch problem by expanding a text with related terms that do not appear in it verbatim. Follow a practical demo that pairs SPLADE with a sentence transformer model trained on MS MARCO, both implemented via Hugging Face transformers. The search component uses the Pinecone vector database, which supports SPLADE vectors natively. Topics include data preprocessing, creating dense and sparse vector embeddings, preparing data for Pinecone, creating a sparse-dense index, and making hybrid search queries.
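As a rough illustration of how dense and sparse vectors can be combined into one relevance score, here is a minimal sketch in plain Python. It assumes a convex combination of two dot products weighted by a hypothetical `alpha` parameter; the function names and the toy vectors are illustrative only and are not taken from the video.

```python
from typing import Dict, Sequence


def dense_dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two dense vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("dense vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller vector; only shared dimensions contribute.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b[i] for i, w in a.items() if i in b)


def hybrid_score(
    q_dense: Sequence[float],
    q_sparse: Dict[int, float],
    d_dense: Sequence[float],
    d_sparse: Dict[int, float],
    alpha: float = 0.5,
) -> float:
    """Weighted sum of dense and sparse similarity (assumed scheme).

    alpha = 1.0 gives a purely dense (semantic) score,
    alpha = 0.0 gives a purely sparse (keyword-like) score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * dense_dot(q_dense, d_dense) + (1.0 - alpha) * sparse_dot(
        q_sparse, d_sparse
    )


# Toy example: 0.5 * 0.5 + 0.5 * 2.0 = 1.25
score = hybrid_score([1.0, 0.0], {3: 2.0}, [0.5, 0.5], {3: 1.0, 7: 4.0}, alpha=0.5)
print(score)
```

In a real system the vector database computes this score during retrieval; the sketch only shows why a document can rank well through either semantic closeness or shared terms.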
Syllabus
Hybrid search for the medical field
Hybrid search process
Prerequisites and Installs
PubMed QA data preprocessing step
Creating dense vectors with sentence-transformers
Creating sparse vector embeddings with SPLADE
Preparing sparse-dense format for Pinecone
Creating the Pinecone sparse-dense index
Making hybrid search queries
Final thoughts on sparse-dense with SPLADE
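The "preparing sparse-dense format" step in the syllabus can be sketched as follows. This assumes the record layout commonly used for Pinecone sparse-dense upserts, with a dense `values` list and a `sparse_values` dictionary holding parallel `indices` and `values` lists; that layout is stated here as an assumption and should be checked against the current Pinecone documentation. The helper names `to_sparse_values` and `make_record` are hypothetical.

```python
from typing import Any, Dict, List, Optional, Sequence


def to_sparse_values(weights: Dict[int, float]) -> Dict[str, List]:
    """Convert {token_id: weight} into parallel indices/values lists.

    Zero weights are dropped, since a sparse vector stores only
    non-zero dimensions. Indices are sorted for reproducible output.
    """
    items = sorted((i, w) for i, w in weights.items() if w != 0.0)
    return {
        "indices": [i for i, _ in items],
        "values": [float(w) for _, w in items],
    }


def make_record(
    doc_id: Any,
    dense: Sequence[float],
    sparse_weights: Dict[int, float],
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one sparse-dense record (assumed layout, see lead-in)."""
    return {
        "id": str(doc_id),
        "values": [float(x) for x in dense],
        "sparse_values": to_sparse_values(sparse_weights),
        "metadata": metadata or {},
    }


# Toy example with made-up token ids and weights.
record = make_record(
    doc_id=7,
    dense=[0.1, 0.2],
    sparse_weights={2054: 1.5, 1012: 0.0, 17: 0.3},
    metadata={"context": "example passage"},
)
print(record["sparse_values"])
```

With a SPLADE model, the `sparse_weights` dictionary would hold the non-zero vocabulary positions of the model output for a passage, and the dense list would come from the sentence transformer.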
Taught by
James Briggs
Related Courses
Semantic Search for AI - Testing Out Qdrant Neural Search (David Shapiro ~ AI via YouTube)
How to Use OpenAI Whisper to Fix YouTube Search (James Briggs via YouTube)
Spotify's Podcast Search Explained (James Briggs via YouTube)
Is GPL the Future of Sentence Transformers - Generative Pseudo-Labeling Deep Dive (James Briggs via YouTube)
Train Sentence Transformers by Generating Queries - GenQ (James Briggs via YouTube)