Non-Parametric Transformers - Paper Explained

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Course Description

Overview

This 46-minute video lecture walks through the paper "Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning", which introduces Non-Parametric Transformers (NPT). It covers the NPT architecture, its connections to BERT, Graph Neural Networks, and CNNs, and its results on tabular data benchmarks. It also examines how NPT learns underlying relational and causal mechanisms and how it attends to similar vectors, using detailed explanations and visual aids.
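The idea named in the paper's title, attention between datapoints, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single attention head, uses the raw feature vectors as queries, keys, and values (no learned projections), and the names `softmax`, `attend_between_datapoints`, and `dataset` are hypothetical. It only shows that each row of a dataset can be updated as a weighted mix of all rows, with more weight going to similar rows.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend_between_datapoints(rows):
    """Each row (datapoint) attends to every row of the dataset, itself included.

    Simplifying assumption: queries, keys, and values are the raw rows.
    """
    dim = len(rows[0])
    outputs, all_weights = [], []
    for query in rows:
        # Scaled dot-product score between this datapoint and every datapoint.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in rows
        ]
        weights = softmax(scores)
        # New representation: attention-weighted average of all datapoints.
        mixed = [
            sum(w * row[j] for w, row in zip(weights, rows))
            for j in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Toy dataset: rows 0 and 1 are similar, row 2 is different.
dataset = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attend_between_datapoints(dataset)
```

In this toy dataset the first row puts more attention weight on the second row than on the third, because the first two rows are more similar to each other.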

Syllabus

Key ideas of the paper
Abstract
Note on k-NN non-parametric machine learning
Data and NPT setup explained
NPT loss is inspired by BERT
A high-level architecture overview
NPT jointly learns imputation and prediction
Architecture deep dive: input embeddings, etc.
More details on the stochastic masking loss
Connections to Graph Neural Networks and CNNs
NPT achieves great results on tabular data benchmarks
NPT learns the underlying relational, causal mechanisms
NPT does rely on other datapoints
NPT attends to similar vectors
Conclusions

Taught by

Aleksa Gordić - The AI Epiphany
