YoVDO

VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores

Offered By: Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube

Tags

High Performance Computing Courses, Deep Learning Courses, Linear Algebra Courses, Transformers Courses, GPU Acceleration Courses

Course Description

Overview

This conference talk from the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) presents the V:N:M format, which enables execution of arbitrary N:M ratios on NVIDIA's Sparse Tensor Cores (SPTCs) and so overcomes the limitation of the current 2:4 format (two nonzero values in every group of four). The talk introduces Spatha, a high-performance sparse library designed to exploit the new format efficiently, achieving up to 37x speedup over cuBLAS. It also covers a second-order pruning technique that allows high sparsity ratios in modern transformers with minimal accuracy loss. Topics include GPU Tensor Cores, sparse formats, sparse linear algebra, and evaluation methods.
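As a rough illustration of the N:M idea mentioned in the overview, the sketch below prunes a row so that only N values survive in each group of M consecutive entries, and stores the survivors with their in-group positions. It is a minimal sketch under stated assumptions: it uses simple magnitude-based selection rather than the talk's second-order pruning, it does not reproduce the V:N:M layout or Spatha, and the names prune_n_m and to_dense are hypothetical.

```python
def prune_n_m(row, n=2, m=4):
    """Keep the n largest-magnitude values in each group of m entries.

    Returns (values, indices): the kept values and their positions
    inside each group. Hypothetical helper, for illustration only.
    """
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    values, indices = [], []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Rank positions by magnitude, keep the top n, restore position order.
        by_magnitude = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = sorted(by_magnitude[:n])
        values.extend(group[i] for i in keep)
        indices.extend(keep)
    return values, indices


def to_dense(values, indices, n=2, m=4):
    """Expand the compressed (values, indices) pair back to a dense row."""
    dense = [0.0] * (len(values) // n * m)
    for k, (value, index) in enumerate(zip(values, indices)):
        dense[(k // n) * m + index] = value
    return dense


row = [0.1, -0.9, 0.4, 0.05, 0.7, 0.2, -0.3, 0.0]
values, indices = prune_n_m(row, n=2, m=4)
print(values)   # [-0.9, 0.4, 0.7, -0.3]
print(indices)  # [1, 2, 0, 2]
print(to_dense(values, indices, n=2, m=4))
```

With n=2 and m=4 this corresponds to the 2:4 pattern; other ratios such as n=2, m=8 give higher sparsity, which is the kind of ratio the talk targets.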

Syllabus

Intro
GPU Tensor Cores
Sparse Formats
Sparse Linear Algebra
Second Order Pruning
Evaluation


Taught by

Scalable Parallel Computing Lab, SPCL @ ETH Zurich

Related Courses

Coding the Matrix: Linear Algebra through Computer Science Applications
Brown University via Coursera
Mathematical Methods for Quantitative Finance
University of Washington via Coursera
Introduction à la théorie de Galois
École normale supérieure via Coursera
Linear Algebra - Foundations to Frontiers
The University of Texas at Austin via edX
Massively Multivariable Open Online Calculus Course
Ohio State University via Coursera