YoVDO

AI Safety and Robustness: Recent Advances and Future Directions

Offered By: INSAIT Institute via YouTube

Tags

Cybersecurity Courses, Machine Learning Courses, AI Ethics Courses, Language Models Courses, Adversarial Attacks Courses

Course Description

Overview

Explore recent advances in AI safety and robustness in this 44-minute INSAIT Tech Series talk by Prof. Zico Kolter. Delve into the challenges of preventing undesirable outputs from large language models (LLMs) and learn about the built-in "guardrails" designed to enforce developer-specified policies. Discover how adversarial attacks have historically circumvented these safeguards and manipulated LLMs for unintended purposes. Examine the latest breakthroughs that have significantly improved the practical robustness of LLMs, including a recent competition where attackers failed to breach a deployed LLM over a month-long period. Gain insights into the current state of AI safety, ongoing challenges in the field, and the future prospects for developing safe AI systems.

Syllabus

INSAIT Tech Series: Prof. Zico Kolter - AI Safety & Robustness: Recent Advances & Future Directions


Taught by

INSAIT Institute

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent