Continuous Data Pipeline for Real-Time Benchmarking and Data Set Augmentation

Offered By: Data Council via YouTube

Course Description

Overview


Explore a 15-minute conference talk from Data Council on building continuous data pipelines for real-time benchmarking and dataset augmentation. Learn how to generate datasets and implement real-time precision/recall splits to detect data shifts, prioritize data collection, and retrain models. Discover the importance of curating representative datasets for accurate ML systems and monitoring post-deployment metrics. Gain insights into addressing data shifts in unstructured language models and leveraging open-source APIs and annotation tools to streamline processes. Presented by Ivan Aguilar, a data scientist at Teleskope, this talk covers topics such as the problem statement, usual approaches, open-source data APIs, task overview, annotations overview, and final thoughts on improving ML model performance through effective data management strategies.
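The idea of tracking precision and recall over time to spot data shift can be illustrated with a minimal sketch. This is not the speaker's implementation: the function names, the fixed window size, the binary labels, and the tolerance-based drop rule are all assumptions made for illustration. It splits a stream of labeled predictions into consecutive windows, computes precision and recall per window, and flags windows whose metrics fall noticeably below a baseline.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class WindowMetrics:
    start: int        # index of the first example in the window
    precision: float
    recall: float


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Binary precision and recall; returns 0.0 when a denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def windowed_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], window_size: int
) -> List[WindowMetrics]:
    """Compute precision/recall over consecutive, non-overlapping windows."""
    results = []
    for start in range(0, len(y_true), window_size):
        end = start + window_size
        p, r = precision_recall(y_true[start:end], y_pred[start:end])
        results.append(WindowMetrics(start, p, r))
    return results


def flag_shifted_windows(
    metrics: List[WindowMetrics],
    baseline_precision: float,
    baseline_recall: float,
    tolerance: float = 0.1,  # hypothetical threshold, tune per task
) -> List[int]:
    """Return start indices of windows whose precision or recall dropped
    more than `tolerance` below the baseline."""
    return [
        m.start
        for m in metrics
        if baseline_precision - m.precision > tolerance
        or baseline_recall - m.recall > tolerance
    ]


if __name__ == "__main__":
    # Toy stream: the model is accurate on the first window, worse on the second.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    metrics = windowed_metrics(y_true, y_pred, window_size=4)
    print(flag_shifted_windows(metrics, baseline_precision=1.0, baseline_recall=1.0))
```

In a setup like this, flagged windows would be natural candidates for annotation and for inclusion in the next retraining set, which is one way to prioritize data collection.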

Syllabus

Intro
Why is this a problem?
Usual Approaches
Open Source Data APIs
Task Overview
Annotations Overview
Final Thoughts

Taught by

Data Council
