Vision Language Models and PDFs: What You See Is What You Search - Haystack EU 2024
Offered By: OpenSource Connections via YouTube
Course Description
Overview
This conference talk from Haystack EU 2024 presents an approach to retrieving information from complex PDF documents. Vision Language Models (VLMs) can replace the traditional multi-step pipeline of text extraction, OCR, layout analysis, chunking, and embedding. The talk introduces ColPali, a retrieval model that embeds entire PDF pages, including text, figures, and charts, which improves retrieval quality and simplifies extraction and indexing. It also covers how to represent ColPali in Vespa and the model's strong performance on the Visual Document Retrieval (ViDoRe) Benchmark. The speaker, Jo Kristian Bergum, Chief Scientist at Vespa.ai, draws on two decades of experience building and deploying search and recommender systems.
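The overview does not say how a page embedded by ColPali is matched against a query. ColPali follows the ColBERT-style late-interaction pattern: a page is stored as many vectors (roughly one per image patch), a query as one vector per token, and the score is the sum, over query vectors, of the best-matching page vector (MaxSim). The following is a minimal sketch of that scoring step only, in plain Python. The function names (`max_sim`, `rank_pages`), the page ids, and the two-dimensional toy vectors are illustrative assumptions, not output of the actual model and not the Vespa representation discussed in the talk.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors: Sequence[Vector], page_vectors: Sequence[Vector]) -> float:
    """Late-interaction (MaxSim) score.

    For each query vector, take its highest dot product against any page
    vector, then sum those maxima over all query vectors.
    """
    return sum(max(dot(q, p) for p in page_vectors) for q in query_vectors)


def rank_pages(
    query_vectors: Sequence[Vector],
    pages: Dict[str, List[Vector]],
) -> List[Tuple[str, float]]:
    """Score every page against the query and sort best-first."""
    scored = [(page_id, max_sim(query_vectors, vecs)) for page_id, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (made up for illustration): 2 query-token vectors, 2 pages with
# 2 patch vectors each. Real embeddings have many more vectors and dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_1": [[0.9, 0.1], [0.2, 0.8]],
    "page_2": [[0.5, 0.5], [0.4, 0.3]],
}

ranking = rank_pages(query, pages)
for page_id, score in ranking:
    print(f"{page_id}: {score:.2f}")
```

In this toy case `page_1` ranks first because each query vector finds a close match among its patch vectors. A production system would run the same computation over approximate or compressed vectors inside the search engine rather than in a Python loop.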
Syllabus
Haystack EU 2024 - Jo Kristian Bergum: What You See Is What You Search: Vision Language Models & PDFs
Taught by
OpenSource Connections
Related Courses
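Semantic Web Technologies
Offered By: openHPI
Fundamentals of Information Retrieval (أساسيات استرجاع المعلومات)
Offered By: Rwaq (رواق)
gacco Special Feature: Expanding the gacco Learning Style with Evernote (《gacco特別企画》Evernoteで広がるgaccoの学びスタイル) (ga038)
Offered By: University of Tokyo via gacco
The Semantic Web: Tools for the Effective Publication and Extraction of Information on the Web (La Web Semántica: Herramientas para la publicación y extracción efectiva de información en la Web)
Offered By: Pontificia Universidad Católica de Chile via Coursera
Fast Learning (快速学习)
Offered By: University of Science and Technology of China via Coursera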