Transformers Courses
Center for Language & Speech Processing (CLSP), JHU via YouTube

Do Pretrained Transformers Learn In-Context by Gradient Descent?
Center for Language & Speech Processing (CLSP), JHU via YouTube

The NLP Task Effectiveness of Long-Range Transformers
Center for Language & Speech Processing (CLSP), JHU via YouTube

Transformers in Time Series: A Survey
AI Institute at UofSC - #AIISC via YouTube

Knowledge Circuits in Pretrained Transformers Explained
Unify via YouTube

The Emergence of Essential Sparsity in Large Pre-trained Models
Unify via YouTube

Elastic Decision Transformer Explained
Unify via YouTube

Attention with Linear Biases Explained
Unify via YouTube

Flash Attention Explained - Algorithm, Applications, and Performance
Unify via YouTube

Drug Discovery Generative AI Using Tensor Network GPT or BERT
ChemicalQDevice via YouTube