Machine Learning Development: a free four-semester program from the School of Data Analysis. Date: December 2, 2023.
This track is suitable for those who like to program and build services and applications used by thousands and millions of people.
Students learn to write efficient code and to build and optimize industrial data-driven systems as part of developing high-tech products based on machine learning.
Each student must successfully complete at least three courses per semester. For example, if the core program contains two courses, you need to take at least one of the special (elective) courses in addition.
Knowledge is assessed primarily through homework; exams and tests are held only in some subjects.
First semester
Mandatory
Algorithms and data structures, part 1
01 Complexity and models of computation. Amortized analysis (beginning)
02 Amortized analysis (end)
03 Merge sort and quicksort algorithms
04 Order statistics. Heaps (beginning)
05 Heaps (end)
06 Hashing
07 Search trees (beginning)
08 Search trees (continued)
09 Search trees (end). Disjoint set union
10 The RMQ and LCA problems
11 Data structures for geometric search
12 The dynamic connectivity problem in an undirected graph
C++ language training, part 1
C++ is a powerful language with a rich heritage. For those who have just set out to master it, it is very easy to get lost in the abundance of idioms and techniques accumulated over the past 30 years. The course teaches "Modern C++": the contemporary subset of the language (the C++11, C++14, and C++17 standards). A lot of attention is paid to tools and libraries, things that are not part of the language itself but without which it is impossible to build a large and complex project.
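To make the "Modern C++" theme more concrete, here is a minimal illustrative sketch, not taken from the course materials and assuming C++14 or later, using several features from the syllabus below: RAII, smart pointers, auto, range-based for, and standard algorithms.

```cpp
#include <algorithm>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// A small value type; memory is managed through RAII, no manual new/delete.
struct Student {
    std::string name;
    int completed_courses = 0;
};

int main() {
    // std::unique_ptr expresses single ownership; the vector is freed automatically.
    auto group = std::make_unique<std::vector<Student>>();
    group->push_back({"Alice", 3});
    group->push_back({"Bob", 2});

    // Range-based for with auto keeps iteration short and type-safe.
    for (const auto& s : *group) {
        std::cout << s.name << " finished " << s.completed_courses << " courses\n";
    }

    // A standard algorithm plus a lambda instead of a hand-written loop.
    const auto passed = std::count_if(group->begin(), group->end(),
                                      [](const Student& s) { return s.completed_courses >= 3; });
    std::cout << passed << " student(s) completed at least three courses\n";
    return 0;
}
```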
01 Introduction to C++.
02 Constants. Pointers and references. Passing arguments to a function.
03 Classes.
04 Dynamic memory management.
05 Variables, pointers, and references.
06 Memory management, smart pointers, RAII.
07 The standard template library.
08 Inheritance and virtual functions.
09 Error handling.
10 Design patterns.
11 Namespaces. Move semantics. Perfect forwarding.
12 Representation of structures and classes in memory. Data alignment. Pointers to class members/methods. Variadic templates.
Machine learning, part 1
01 Basic concepts and examples of applied problems
02 Metric classification methods
03 Logical classification methods and decision trees
04 Gradient methods for linear classification
05 Support vector machines
06 Multivariate linear regression
07 Nonlinear and nonparametric regression, non-standard loss functions
08 Time series forecasting
09 Bayesian classification methods
10 Logistic regression
11 Association rule mining
Second semester
Mandatory
Machine learning, part 2
01 Neural network methods for classification and regression
02 Ensemble methods for classification and regression
03 Model selection criteria and feature selection methods
04 Ranking
05 Reinforcement learning
06 Unsupervised learning
07 Semi-supervised learning
08 Collaborative filtering
09 Topic modeling
Electives
Algorithms and data structures, part 2
01 Breadth-first search. Depth-first search (beginning)
02 Depth-first search (continued)
03 Depth-first search (end). 2-connectivity
04 Finding shortest paths (beginning)
05 Finding shortest paths (continued)
06 Minimum spanning trees
07 Minimum cuts. Substring search (beginning)
08 Substring search (continued)
09 Substring search (end)
10 Suffix trees (beginning)
11 Suffix trees (end). Suffix arrays (beginning)
12 Suffix arrays (end)
13 Longest common substrings. Approximate substring search.
or
Python language
01 Language basics (part 1)
02 Language basics (part 2)
03 Object-oriented programming
04 Error handling
05 Code design and testing
06 Working with strings
07 Memory model
08 Functional programming
09 Library overview (part 1)
10 Library overview (part 2)
11 Parallel computing in Python
12 Advanced work with objects
or
C++ language training, part 2
The second part of the C++ course, covering advanced topics and language features.
01 Multi-threaded programming. Synchronizing threads with mutexes and condition variables (see the sketch after this list).
02 Atomic variables. The C++ memory model. Examples of lock-free data structures.
03 Advanced metaprogramming techniques in C++. Metafunctions, SFINAE, concepts.
04 Concurrent programming and interaction with the network.
05 LLVM architecture. Working with the C++ parse tree. Developing tools for analyzing C++ code.
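As an illustration of the first topic, synchronizing threads with a mutex and a condition variable, here is a minimal self-contained sketch of a common producer/consumer pattern; it is not taken from the course materials.

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

// One producer pushes work items; one consumer sleeps on a condition variable
// until an item is available or the producer reports that it is done.
int main() {
    std::queue<int> tasks;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    std::thread producer([&] {
        for (int i = 1; i <= 5; ++i) {
            {
                std::lock_guard<std::mutex> lock(m);
                tasks.push(i);
            }
            cv.notify_one();  // wake the consumer for each new task
        }
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;
        }
        cv.notify_one();
    });

    std::thread consumer([&] {
        std::unique_lock<std::mutex> lock(m);
        while (true) {
            // wait() releases the mutex while sleeping and re-acquires it on wake-up;
            // the predicate protects against spurious wake-ups.
            cv.wait(lock, [&] { return !tasks.empty() || done; });
            while (!tasks.empty()) {
                std::cout << "processed task " << tasks.front() << "\n";
                tasks.pop();
            }
            if (done) break;
        }
    });

    producer.join();
    consumer.join();
    return 0;
}
```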
Third semester
Electives
Natural Language Processing
"NLP (Natural Language Processing) is a subfield of AI that tries to teach a computer to understand and process raw natural-language data. Most of the information available today is unstructured text. As humans, we of course have no trouble understanding it (if it is in our native language), but we cannot process anywhere near as much data as a machine could. So how do you make a machine understand this data and, moreover, extract information from it? Several years ago, in her presidential address at the opening of ACL (one of the main, if not the most important, NLP conferences), Marti Hearst admitted that she could no longer give students her favorite exercise. Using HAL 9000 (one of the classic examples of artificial intelligence in science fiction), she would ask students what a machine could already do like HAL and what it could not do yet. Today this is no longer such a good exercise, since a computer can now do almost all of it. It is amazing how quickly the field is growing and how much has been achieved. In this course we will try to help you understand and feel what is happening in the field: which problems are being solved and how; how some statistical approaches (to which NLP courses were almost entirely devoted just a few years ago) get a new life and a new interpretation in neural networks, and which ones are gradually dying out. We will show that NLP is not a set of (problem, solution) pairs, but a set of general ideas that run through different problems and reflect common concepts. You will also learn what happens in practice and which approaches are more applicable when. This is what we do and what we love, and we are ready to share it with you :)"
01 https://lena-voita.github.io/nlp_course.html
02 https://github.com/yandexdataschool/nlp_course
or
Computer vision
"The course is devoted to methods and algorithms of computer vision, i.e. extracting information from images and videos. Let's look at the basics of image processing, image classification, image search by content, face recognition, image segmentation. Then we’ll talk about video processing and analysis algorithms. The last part of the course is devoted to 3D reconstruction. For most problems we will discuss existing neural network models. In the course we try to pay attention only to the most modern methods that are currently used in solving practical and research problems. The course is largely practical rather than theoretical. Therefore, all lectures are equipped with laboratory and homework, which allow you to try most of the methods discussed in practice. The work is performed in Python using various libraries."
01 Digital imaging and tone correction
02 Image processing basics
03 Image stitching
04 Image classification and similar-image search
05 Convolutional neural networks for classification and similar-image search
06 Object detection
07 Semantic segmentation
08 Style transfer and image synthesis
09 Video recognition
10 Sparse 3D reconstruction
11 Dense 3D reconstruction
12 Single-frame reconstruction, point clouds, parametric models
or
Bayesian methods in machine learning
01 The Bayesian approach to probability theory
02 Analytic Bayesian inference
03 Bayesian model selection
04 Automatic relevance determination
05 The relevance vector machine for classification
06 Probabilistic models with latent variables
07 Variational Bayesian inference
08 The Bayesian Gaussian mixture model
09 Markov chain Monte Carlo methods
10 Latent Dirichlet allocation
11 Gaussian processes for regression and classification
12 Nonparametric Bayesian methods
Fourth semester
Mandatory
ML Engineering Practice
The course consists of project work: developing ML projects in teams.
ML Research Practice
The course consists of work on team research projects in machine learning.
Recommended special courses
Deep learning
01 Course material
Reinforcement learning
01 Course material
Self Driving Cars
The course covers the core components of self-driving technology: localization, perception, prediction, behavioral level, and motion planning. For each component, the main approaches will be described. Additionally, students will become familiar with current market conditions and technological challenges.
01 Overview of the main components and sensors of a self-driving vehicle. Levels of autonomy. Drive by wire. Self-driving cars as a business product. Ways to evaluate progress in building self-driving cars. Localization basics: GNSS, wheel odometry, Bayesian filters.
02 Lidar localization methods: ICP, NDT, LOAM. Introduction to visual SLAM using ORB-SLAM as an example. Statement of the GraphSLAM problem. Reducing GraphSLAM to a nonlinear least-squares problem (see the sketch after this list). Choosing the right parameterization. Systems with special structure in GraphSLAM. The architectural approach: frontend and backend.
03 The recognition task in a self-driving car. Static and dynamic obstacles. Sensors for the recognition system. Representing static obstacles. Detecting static obstacles with lidar (VSCAN, neural network methods). Using lidar together with images to detect static obstacles (semantic image segmentation, depth completion). Stereo cameras and obtaining depth from an image. Stixel World.
04 Representing dynamic obstacles in a self-driving car. Neural network methods for 2D object detection. Detection based on a bird's-eye-view representation of the lidar point cloud. Using lidar together with images to detect dynamic obstacles. 3D car detection from images (3D box fitting, CAD models). Radar-based dynamic obstacle detection. Object tracking.
05 Vehicle motion models: rear-wheel, front-wheel. Path planning. The concept of configuration space. Graph-based methods for constructing trajectories. Jerk-minimizing trajectories. Optimization-based methods for constructing trajectories.
06 Speed planning in a dynamic environment. ST planning. Predicting the behavior of other road users.
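Since lecture 02 mentions reducing GraphSLAM to nonlinear least squares, here is a compact sketch of how that reduction is usually written; the notation follows a standard textbook formulation, not the course slides. With poses $x = (x_1, \dots, x_T)$, relative measurements $z_{ij}$ with information matrices $\Omega_{ij}$, and residuals $e_{ij}(x_i, x_j) = z_{ij} \ominus h(x_i, x_j)$, the MAP estimate under Gaussian noise is

$$x^{*} = \arg\max_{x} \prod_{(i,j)} p(z_{ij} \mid x_i, x_j) = \arg\min_{x} \sum_{(i,j)} e_{ij}(x_i, x_j)^{\top} \, \Omega_{ij} \, e_{ij}(x_i, x_j).$$

Linearizing each residual around the current estimate and iterating (Gauss-Newton or Levenberg-Marquardt) leads to a sparse system $H \, \Delta x = -b$, which is where the special structure of the problem and the choice of parameterization from the same lecture come in.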
Neuro-Bayesian methods
The course focuses on applications of Bayesian methods in deep learning. The lectures cover the use of probabilistic modeling to build generative models of data, the use of adversarial networks for approximate inference, modeling uncertainty in neural network parameters, and some open problems in deep learning.
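As a compact reminder of the objective behind the first lectures (the notation below is standard, not taken from the course): for a model $p(x, z \mid \theta)$ with latent variables $z$ and a variational approximation $q(z \mid \phi)$, variational inference maximizes the evidence lower bound

$$\log p(x \mid \theta) \;\ge\; \mathcal{L}(\theta, \phi) \;=\; \mathbb{E}_{q(z \mid \phi)}\big[\log p(x \mid z, \theta)\big] \;-\; \mathrm{KL}\big(q(z \mid \phi) \,\|\, p(z)\big).$$

Stochastic variational inference estimates the gradient of $\mathcal{L}$ from mini-batches; the doubly stochastic variant also samples $z$ from $q$, for example via the reparameterization trick $z = \mu_{\phi} + \sigma_{\phi} \odot \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, which is also the training objective of the variational autoencoder from lecture 03.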
01 Stochastic variational inference
02 Doubly stochastic variational inference
03 Variational autoencoders, normalizing flows for variational inference
04 Variance reduction methods in latent variable models
05 Density ratio estimation, with α-GAN as an example
06 Bayesian neural networks
07 Bayesian compression of neural networks
08 Semi-implicit variational inference