Action Recognition, Temporal Localization and Detection in Videos
Offered By: University of Central Florida via YouTube
Course Description
Overview
Explore action recognition, temporal localization, and detection in trimmed and untrimmed videos through this 36-minute lecture by Rui Hou from the University of Central Florida. Dive into topics such as knowledge transfer to novel categories, comparison with semantic attributes, and experimental results on datasets like UCF101 and UCF-Sports. Learn about the pipeline for detecting actions, including the adaptation of Faster R-CNN from 2D to 3D, Tube Proposal Network, and Tube of Interest Max Pooling. Examine video action and object segmentation techniques, including encoder-decoder architectures, 3D Pyramid Pooling, and dilated convolution. Gain insights into the current limitations and future directions of research in this field.
Syllabus
Intro
Action Recognition
Temporal Action Localization
Outline
Knowledge Transfer to Novel Categories
Comparison with Semantic Attributes (THUMOS)
Experiment Results on UCF101
Pipeline Overview
Detecting Actions
Experimental Setup
Detection Examples
Generalizing Faster R-CNN from 2D to 3D
Tube Proposal Network
Tube of Interest Max Pooling
Experiment Results on UCF-Sports
Evaluation on YouTube Videos
Limitations
Video Action Segmentation -- Overview
Video Object Segmentation -- Overview
Video Object Segmentation -- Encoder
Video Object Segmentation -- 3D Pyramid Pooling
Video Object Segmentation -- Decoder
Dilated Convolution
Summary
Future Work
Taught by
UCF CRCV
Related Courses
Create Image Captioning Models - Deutsch (Google Cloud via Coursera)
Encoder-Decoder Architecture - Deutsch (Google Cloud via Coursera)
Advanced Chatbots with Deep Learning and Python (Packt via Coursera)
Machine Translation with Keras (DataCamp)
Natural Language Generation in Python (DataCamp)