ML Data Version Control and Reproducibility at Scale
Offered By: Linux Foundation via YouTube
Course Description
Overview
Explore data version control and reproducibility techniques for large-scale machine learning in this 38-minute talk by Einat Orr from Treeverse. Learn how to overcome challenges in ML data management, including reproducibility constraints and inefficient data transfer. Discover open-source tools for versioning data locally and best practices for working with data in the cloud without copying it. Gain insights into training models at scale using an OSS stack including LangChain, TensorFlow, PyTorch, and Keras. Acquire practical methods to improve data management when developing and iterating on ML models, specifically tailored for modern computer vision research.
Syllabus
ML Data Version Control and Reproducibility at Scale - Einat Orr, Treeverse
Taught by
Linux Foundation
Related Courses
Données et services numériques, dans le nuage et ailleurs - Certificat informatique et internet via France Université Numérique
Introduction to Digital Curation - University College London via Independent
Excel Avanzado - Miríadax
SAP Business Warehouse powered by SAP HANA - SAP Learning
Programming Mobile Applications for Android Handheld Systems: Part 2 - University of Maryland, College Park via Coursera