YoVDO

Action Recognition, Temporal Localization and Detection in Videos

Offered By: University of Central Florida via YouTube

Tags

Computer Vision Courses
Machine Learning Courses
Encoder-Decoder Architecture Courses

Course Description

Overview

Explore action recognition, temporal localization, and detection in trimmed and untrimmed videos through this 36-minute lecture by Rui Hou from the University of Central Florida. Dive into topics such as knowledge transfer to novel categories, comparison with semantic attributes, and experimental results on datasets like UCF101 and UCF-Sports. Learn about the pipeline for detecting actions, including the adaptation of Faster R-CNN from 2D to 3D, Tube Proposal Network, and Tube of Interest Max Pooling. Examine video action and object segmentation techniques, including encoder-decoder architectures, 3D Pyramid Pooling, and dilated convolution. Gain insights into the current limitations and future directions of research in this field.
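As a rough illustration of one topic named above, dilated convolution, the following is a minimal sketch in plain Python. It is not taken from the lecture: the function names `dilated_conv1d` and `receptive_field` are hypothetical, and the example is 1D for brevity, whereas the lecture concerns video models. The idea shown is that spacing the kernel taps `dilation` samples apart widens the receptive field without adding weights.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution.

    Implemented as cross-correlation (no kernel flip), which is the
    convention most deep learning libraries use. Illustrative only.
    """
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel tap i reads the input sample i * dilation steps from start.
        acc = sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        out.append(acc)
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    w = [1, 1, 1]
    print(dilated_conv1d(x, w, dilation=1))  # taps are adjacent samples
    print(dilated_conv1d(x, w, dilation=2))  # taps skip every other sample
    print(receptive_field(3, 2))
```

With the same three weights, a dilation of 2 covers five input samples instead of three, which is why dilation is often used in segmentation networks to gather wider context at full resolution.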

Syllabus

Intro
Action Recognition
Temporal Action Localization
Outline
Knowledge Transfer to Novel Categories
Comparison with Semantic Attributes (THUMOS)
Experiment Results on UCF101
Pipeline Overview
Detecting Actions
Experimental Setup
Detection Examples
Generalizing Faster R-CNN from 2D to 3D
Tube Proposal Network
Tube of Interest Max Pooling
Experiment Results on UCF-Sports
Evaluation on YouTube Videos
Limitations
Video Action Segmentation -- Overview
Video Object Segmentation -- Overview
Video Object Segmentation -- Encoder
Video Object Segmentation -- 3D Pyramid Pooling
Video Object Segmentation -- Decoder
Dilated Convolution
Summary
Future Work


Taught by

UCF CRCV

Related Courses

Create Image Captioning Models - Deutsch
Google Cloud via Coursera
Encoder-Decoder Architecture - Deutsch
Google Cloud via Coursera
Advanced Chatbots with Deep Learning and Python
Packt via Coursera
Machine Translation with Keras
DataCamp
Natural Language Generation in Python
DataCamp