Let's Talk About Raw Documents - Extracting Structured Data for ML Pipelines
Offered By: MLOps.community via YouTube
Course Description
Overview
Syllabus
- Introduction to Crag Wolfe
- Agenda
- Unstructured.io introduction
- The open-source community
- The goal
- Rapidly build custom preprocessing API
- Staging
- Demo
- Developer quick start
- SEC Filing Section Pipeline
- Section 1: Pulling in Raw Documents
- Section 2: Reading the Document
- Section 3: Custom Partitioning Bricks
- Section 4: Cleaning Bricks
- Section 5: Staging Bricks
- Section 6: Define the Pipeline API
- SEC Sentiment Analysis Model notebook
- Stage for transformers
- Training a summarization model with Unstructured + Argilla + Huggingface
- Crag's previous engineering experience
- Deciding what to tackle next
- Editing documents
- Scaling issues
- Moving out of NLP
- Wrap up
Taught by
MLOps.community