Takeaways from the SCALE 2024 Workshop on Video-based Event Retrieval

Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube

Tags

Information Retrieval Courses
Computer Vision Courses
Speech Recognition Courses
Optical Character Recognition Courses

Course Description

Overview

In this talk, Reno Kriz and Kate Sanders of the Center for Language & Speech Processing at Johns Hopkins University present takeaways from the SCALE 2024 Workshop on Video-based Event Retrieval. The talk examines the challenges of multilingual, event-centric video retrieval, with a focus on user-generated content from non-professional sources. It covers the development of MultiVENT 2.0, a large-scale video retrieval dataset, and the work of five sub-teams on improving models for modalities including vision, optical character recognition (OCR), audio, and text. It presents three primary findings: extracting specific text from videos is important, LLM summarization helps with noisy text outputs, and fusing multiple modalities is necessary for optimal performance. The talk also touches on the evolving landscape of event analysis and retrieval in the age of user-generated content.

Syllabus

Reno Kriz and Kate Sanders: Takeaways from the SCALE 2024 Workshop on Video-based Event Retrieval


Taught by

Center for Language & Speech Processing (CLSP), JHU

Related Courses

Semantic Web Technologies
openHPI
أساسيات استرجاع المعلومات
Rwaq (رواق)
《gacco特別企画》Evernoteで広がるgaccoの学びスタイル (ga038)
University of Tokyo via gacco
La Web Semántica: Herramientas para la publicación y extracción efectiva de información en la Web
Pontificia Universidad Católica de Chile via Coursera
快速学习
University of Science and Technology of China via Coursera