
Non-Parametric Transformers - Paper Explained

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Neural Networks, Deep Learning, Attention Mechanisms

Course Description

Overview

Dive deep into the world of Non-Parametric Transformers with this comprehensive 46-minute video lecture. Explore the key concepts from the paper "Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning". Learn about the NPT architecture, its connections to BERT, Graph Neural Networks, and CNNs, and understand how it achieves impressive results on tabular data benchmarks. Discover how NPT learns underlying relational and causal mechanisms, and examine its ability to attend to similar vectors. Gain valuable insights into this innovative approach to machine learning through detailed explanations and visual aids.
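To make the paper's central idea more concrete before the syllabus: below is a minimal, illustrative NumPy sketch (not the authors' implementation) of single-head self-attention applied across the rows of a tabular dataset, so that whole datapoints attend to each other rather than tokens attending within one sequence. The projection matrices, dimensions, and function name are assumptions chosen for illustration only.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention_between_datapoints(X, d_k=16, seed=0):
        """Single-head self-attention where each *row* (datapoint) of X
        attends to every other row. X has shape (n_datapoints, d_features).
        The random projections stand in for learned weights."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        # Hypothetical query/key/value projections; in NPT these are learned.
        W_q = rng.standard_normal((d, d_k)) / np.sqrt(d)
        W_k = rng.standard_normal((d, d_k)) / np.sqrt(d)
        W_v = rng.standard_normal((d, d_k)) / np.sqrt(d)
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        # (n, n) attention matrix: entry (i, j) is how much datapoint i
        # attends to datapoint j across the whole dataset.
        A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
        return A @ V  # each output row mixes information from other rows

    # Toy usage: 8 datapoints with 4 features each.
    X = np.random.default_rng(42).standard_normal((8, 4))
    print(attention_between_datapoints(X).shape)  # (8, 16)

The point of the sketch is the shape of the attention matrix: it is datapoints-by-datapoints, which is what lets NPT share information between rows of a table instead of treating each input-output pair in isolation.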

Syllabus

Key ideas of the paper
Abstract
Note on k-NN non-parametric machine learning
Data and NPT setup explained
NPT loss is inspired by BERT
A high-level architecture overview
NPT jointly learns imputation and prediction
Architecture deep dive: input embeddings, etc.
More details on the stochastic masking loss
Connections to Graph Neural Networks and CNNs
NPT achieves great results on tabular data benchmarks
NPT learns the underlying relational, causal mechanisms
NPT does rely on other datapoints
NPT attends to similar vectors
Conclusions


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Machine Learning Techniques (機器學習技法)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Applied Problems in Data Analysis (Прикладные задачи анализа данных)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX