Discrete Inference and Learning in Artificial Vision
Offered By: École Centrale Paris via Coursera
Course Description
Overview
Artificial vision applications, such as object detection in natural images and automatic segmentation of medical acquisitions, rely on models that interpret the visual information provided to a computer. The model provides a compromise between the support given by the observations and the prior domain knowledge. This course is concerned with the two computational problems that arise when using such models in practice.
Inference (Energy Minimization):
Given a visual observation (for example, an image or an MRI scan), we are interested in estimating its most likely interpretation (i.e. the location of all the objects in the image, or the segments of the MRI scan) according to the model. While the problem cannot be solved optimally, we will describe state of the art approximate algorithms that provide very accurate solutions in practice. While the theoretical properties of the algorithms will be discussed briefly, the main emphasis will be on their application.
Learning
(Parameter Estimation):
Given a set of training samples consisting of inputs and their desired outputs, (for example, images and the location of the objects, or MRI scans and their segmentations) we would like to estimate a model that is suited to the task at hand. We will show how the problem of learning a model can be formulated as empirical risk minimization. Furthermore, we will present efficient algorithms for solving the corresponding optimization problem.
Syllabus
- Lecture 1: Introduction to artificial vision with discrete graphical models: In this lecture, the interdisciplinary nature of computational vision is briefly introduced along with its potential use in different application domains. Subsequently, the concept of discrete modeling of artificial vision tasks is introduced from theoretical view point along with short examples demonstrating the interest of such an approach in low, mid and high-level vision. Examples refer to blind image deconvolution, knowledge-based image segmentation, optical flow, graph matching, 2d-to-3d view-point invariant detection and modeling and grammar-driven image based reconstruction.
- Lecture
2: Reparameterization and dynamic programming: In this lecture, we provide a brief introduction to
undirected graphical models. We also provide a formal definition of the problem
of inference (specifically, energy minimization). We introduce the concept of
reparameterization, which forms the building block of all the inference
algorithms discussed in the course. We describe a simple inference algorithm
known as dynamic programming, which consists of a series of reparameterization.
We show how dynamic programming can be used to perform exact inference on
chains.
- Lecture 3: Maximum flow and minimum cut: In this lecture, we introduce the concept of functions on arcs of a directed graph. We focus on a special function known as the flow function. Associated with this function is the combinatorial optimization problem of computing the maximum flow of a directed graph. We also introduce the concept of a cut in a directed graph, and prove that the minimum cost cut is equivalent to the maximum flow. We describe a simple algorithm for solving the maximum flow, or equivalent the minimum cut, problem.
- Lecture
4: Minimum cut based inference: In this lecture, we show how the problem of inference for undirected
graphical models with two labels can be formulated as a minimum cut problem. We
characterize the energy function that can be minimized optimally using the
minimum cut problem. We show examples using the image segmentation and texture
synthesis problems, which can be formulated using two labels. We consider the
multi-label problem, and devise approximate algorithms for inference based on
the minimum cut algorithms. We show examples using the stereo reconstruction
and the image denoising problems.
- Lecture
5: Belief propagation: In this lecture we present the basic concepts of
message passing and belief propagation networks. The concept is initially
demonstrated using chains, extended to the case of trees and then eventually to
arbitrary graphs. The strengths and the limitations of such an optimization
framework are presented. The image completion and texture synthesis problems
are considered as examples to demonstrate the interest of such a family of
optimization algorithms.
- Lecture 6: Linear programing and duality: In this lecture, discrete inference is addressed through concepts coming from linear programming relaxations. In particular, we explain how a graph-optimization problem can be expressed as a linear programing one and then how one can take benefit of the duality theorem to develop efficient optimization methods. The problem of optical flow and its deformable registration variant in medical image analysis is considered as an example to demonstrate the interest of such optimization algorithms.
- Lecture
7: Dual decomposition and higher order graphs: In this lecture, we introduce the dual decomposition
framework for the optimization of low rank and higher order graphical models.
First, we demonstrate the concept of the method using a simple toy example and
then we extend to the most general optimization problem case. Three different
examples are considered in the context of higher order optimization, the
problem of linear mapping between images, the case of dense deformable graph
matching and the development of pose invariant object segmentation methods in
the context of medical imaging.
- Lecture 8: Parameter learning: In this lecture, we introduce two frameworks for estimating the parameters of a graphical model using fully supervised training data. The first framework maximizes the likelihood of the training data while regularizing the parameters. The second framework minimizes the empirical risk, as measured by a user-defined loss function, while regularizing the parameters. We provide a brief description of the algorithms required to solve the related optimization problems. We show the results obtained on standard machine learning datasets.
Taught by
Nikos Paragios and Pawan Kumar
Related Courses
2D image processingHigher School of Economics via Coursera 3D Reconstruction - Multiple Viewpoints
Columbia University via Coursera 3D Reconstruction - Single Viewpoint
Columbia University via Coursera AI-900: Microsoft Certified Azure AI Fundamentals
A Cloud Guru TensorFlow Developer Certificate Exam Prep
A Cloud Guru