AI Safety and Robustness: Recent Advances and Future Directions
Offered By: INSAIT Institute via YouTube
Course Description
Overview
Explore recent advances in AI safety and robustness in this 44-minute INSAIT Tech Series talk by Prof. Zico Kolter. Delve into the challenges of preventing undesirable outputs from large language models (LLMs) and learn about the built-in "guardrails" designed to enforce developer-specified policies. Discover how adversarial attacks have historically circumvented these safeguards and manipulated LLMs for unintended purposes. Examine the latest breakthroughs that have significantly improved the practical robustness of LLMs, including a recent competition where attackers failed to breach a deployed LLM over a month-long period. Gain insights into the current state of AI safety, ongoing challenges in the field, and the future prospects for developing safe AI systems.
Syllabus
INSAIT Tech Series: Prof. Zico Kolter - AI Safety & Robustness: Recent Advances & Future Directions
Taught by
INSAIT Institute
Related Courses
Machine Learning and Artificial Intelligence Security Risk: Categorizing Attacks and Failure Modes
LinkedIn Learning
How Apple Scans Your Phone and How to Evade It - NeuralHash CSAM Detection Algorithm Explained
Yannic Kilcher via YouTube
Deep Learning New Frontiers
Alexander Amini via YouTube
Deep Learning New Frontiers
Alexander Amini via YouTube
MIT 6.S191 - Deep Learning Limitations and New Frontiers
Alexander Amini via YouTube