YoVDO

Continuous Data Pipeline for Real-Time Benchmarking and Data Set Augmentation

Offered By: Data Council via YouTube

Tags

Data Pipelines Courses Machine Learning Courses

Course Description

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore a 15-minute conference talk from Data Council on building continuous data pipelines for real-time benchmarking and dataset augmentation. Learn how to generate datasets and implement real-time precision/recall splits to detect data shifts, prioritize data collection, and retrain models. Discover the importance of curating representative datasets for accurate ML systems and monitoring post-deployment metrics. Gain insights into addressing data shifts in unstructured language models and leveraging open-source APIs and annotation tools to streamline processes. Presented by Ivan Aguilar, a data scientist at Teleskope, this talk covers topics such as the problem statement, usual approaches, open-source data APIs, task overview, annotations overview, and final thoughts on improving ML model performance through effective data management strategies.

Syllabus

Intro
Why is this a problem?
Usual Approaches
Open Source Data API's
Task Overview
Annotations Overview
Final Thoughts


Taught by

Data Council

Related Courses

Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Data Analysis with Python
IBM via Coursera
Intro to TensorFlow 日本語版
Google Cloud via Coursera
TensorFlow on Google Cloud - Français
Google Cloud via Coursera
Freedom of Data with SAP Data Hub
SAP Learning