
On-Device Speech Models Optimization and Deployment for Mobile Hardware

Offered By: tinyML via YouTube

Tags

TensorFlow Courses, Quantization Courses, Model Optimization Courses, Self-Attention Courses

Course Description

Overview

Explore on-device speech model optimization and deployment in this tinyML Summit 2022 presentation. Dive into the challenges of real-time execution on mobile hardware, focusing on latency and memory footprint constraints. Learn about streaming-aware model design using functional and subclass TensorFlow APIs, and discover various quantization techniques including post-training quantization and quantization-aware training. Compare the pros and cons of different approaches and understand selection criteria based on specific ML problems. Examine benchmarks of popular speech processing model topologies, including residual convolutional and transformer neural networks, as demonstrated on mobile devices. Gain insights into local self-attention, multi-head self-attention, and real-world model implementations to enhance your understanding of efficient on-device speech processing.
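As a rough illustration of the quantization topics named above, the sketch below shows the quantize-then-dequantize round trip that "fake quantization" simulates during quantization-aware training. It is a minimal pure-Python example and is not taken from the presentation: the function names (`quant_params`, `quantize`, `dequantize`, `fake_quantize`) and the choice of per-tensor affine int8 quantization are assumptions made for illustration only.

```python
# Minimal sketch of affine (asymmetric) int8 quantization.
# All names here are hypothetical; this is not code from the talk.

def quant_params(values, num_bits=8):
    """Derive a per-tensor scale and zero point from the observed value range."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point, qmin, qmax

def quantize(values, scale, zero_point, qmin, qmax):
    """Map floats to integers, clamping to the representable range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    """Map integers back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]

def fake_quantize(values, num_bits=8):
    """Quantize and immediately dequantize, exposing the rounding error
    that a model would see during quantization-aware training."""
    scale, zero_point, qmin, qmax = quant_params(values, num_bits)
    return dequantize(quantize(values, scale, zero_point, qmin, qmax), scale, zero_point)

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
approx = fake_quantize(weights)
```

In post-training quantization the range would be estimated once from a trained model (or calibration data); in quantization-aware training a round trip like `fake_quantize` is applied in the forward pass so the model can adapt to the rounding error.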

Syllabus

Introduction
Agenda
Hardware detector
Streaming
Subclass API
Edge cases
Quantization
Post-training Quantization
Fake Quantization
Native Quantization
Observations
Local Self-attention
Multi-Head Self-attention
Real Model
Models
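
For readers unfamiliar with the "Local Self-attention" item in the syllabus, the sketch below shows one common way to restrict attention to a fixed window around each time step, which bounds per-frame compute and memory in a streaming setting. It is a pure-Python illustration under stated assumptions, not the presenter's implementation: the helper names (`local_attention_mask`, `masked_softmax`) and the window parameters `left` and `right` are hypothetical.

```python
import math

# Minimal sketch of a local (windowed) self-attention mask.
# All names and parameters are hypothetical; this is not code from the talk.

def local_attention_mask(seq_len, left, right=0):
    """mask[i][j] is True when query step i may attend to key step j,
    i.e. j lies within `left` past steps and `right` future steps of i."""
    return [[(i - left) <= j <= (i + right) for j in range(seq_len)]
            for i in range(seq_len)]

def masked_softmax(scores, mask_row):
    """Softmax over one row of attention scores, with masked positions set to 0."""
    kept = [s for s, m in zip(scores, mask_row) if m]
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]

mask = local_attention_mask(seq_len=5, left=2)    # causal window of 3 steps
attn_row = masked_softmax([0.0] * 5, mask[3])     # uniform scores for step 3
```

With `right=0` the window is causal, so each step depends only on the current and previous frames; a nonzero `right` would trade added latency for future context.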


Taught by

tinyML

Related Courses

3D-печать для всех и каждого
Tomsk State University via Coursera
Developing a Multidimensional Data Model
Microsoft via edX
Launching into Machine Learning 日本語版
Google Cloud via Coursera
Art and Science of Machine Learning 日本語版
Google Cloud via Coursera
Launching into Machine Learning auf Deutsch
Google Cloud via Coursera