YoVDO

Introduction to data analysis

Offered By: Saint Petersburg State University via Coursera

Tags

Data Analysis Courses, Statistics & Probability Courses, Machine Learning Courses, Descriptive Statistics Courses, Dimensionality Reduction Courses, Clustering Courses

Course Description

Overview

This course introduces the first steps in data analysis and covers the main concepts and processes that make up the discipline. Its main goal is to teach the mathematical and statistical foundations underlying the main ideas and approaches used in data science. This is achieved by setting and solving typical tasks that a data science researcher may face in their work. You will gain practical skills with data analysis tools used in many fields. You will become acquainted with the main tasks, methods, and basic algorithms of data analysis, as well as their areas of practical application, and you will learn how applied problems of data processing and analysis are solved. You will also be introduced to the main concepts of artificial neural networks and the ways they are trained.

Syllabus

  • Data and Big Data Analysis: Approaches, Functions and Software Tools
    • Module 1 explores the concept of data analysis and introduces its basic
      techniques. It discusses the concept of big data and its possible
      applications. It also considers how different approaches to data
      processing relate to one another and surveys basic software for data
      analysis. Some useful functions for data analysis are presented. The
      principles of big data processing are discussed, in particular the
      MapReduce model.
  • Basic Characteristics of Data. Distributions, Statistics and Regressions
    • Module 2 discusses descriptive statistics and exploratory data analysis.
      The main characteristics of data distributions are introduced, and their
      calculation is demonstrated with examples. Frequentist and Bayesian
      approaches to hypothesis testing are explained. The basic concepts of
      regression and correlation analysis are formulated, with a focus on
      linear methods.
  • Clustering and Dimensionality Reduction
    • Module 3 discusses the clustering problem and the algorithms for solving it.
      Hierarchical clustering, the k-means algorithm, and the CURE algorithm are
      explained. The behavior of these algorithms in non-Euclidean spaces is
      described. The module also covers dimensionality reduction, including the
      basic facts of singular value decomposition and its applications.
      It also considers principal component analysis and CUR decomposition,
      which are applicable to big data processing.
  • Machine Learning and Artificial Neural Networks
    • Module 4 discusses models and methods of machine learning. The perceptron
      model, how it works, and its advantages and disadvantages are discussed in
      detail. The basic support vector machine and its generalizations are
      considered. The module then covers artificial neural networks, their
      organization, and their training. It discusses the main features of deep
      neural networks, the problems that arise with such networks, and modern
      methods for overcoming them. Convolutional and recurrent neural networks
      are also considered.
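
As an illustration of the perceptron topic listed under Module 4, the following is a minimal sketch of the classic perceptron learning rule in Python. It is not taken from the course materials. The function names (`predict`, `train_perceptron`), the toy logical-AND dataset, the learning rate, and the epoch count are all assumptions chosen for the example.

```python
# Minimal perceptron sketch (illustrative only; names and data are assumptions,
# not part of the course materials).

def predict(weights, bias, x):
    """Return 1 if the weighted sum of inputs plus bias is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Fit weights and bias with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            # error is 0 for a correct prediction, +1 or -1 otherwise,
            # so the parameters only change on misclassified samples.
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy data: logical AND, which is linearly separable.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

if __name__ == "__main__":
    w, b = train_perceptron(AND_INPUTS, AND_LABELS)
    print([predict(w, b, x) for x in AND_INPUTS])  # prints [0, 0, 0, 1]
```

A single perceptron of this kind can only separate classes with a straight line (or hyperplane), so it would not converge on a problem such as XOR. This is one commonly cited disadvantage of the model.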

Taught by

Григорьев Юрий Александрович, Руднев Владимир Александрович, Андронов Иван Викторович, Яревский Евгений Александрович and Яковлев Сергей Леонидович

Related Courses

Accounting for Death in War: Separating Fact from Fiction
Royal Holloway, University of London via FutureLearn
Advanced Machine Learning
The Open University via FutureLearn
Advanced Statistics for Data Science
Johns Hopkins University via Coursera
農企業管理學 (Agribusiness Management)
National Taiwan University via Coursera
AI & Machine Learning
Arizona State University via Coursera