MLOps course from OTUS: 80,000 rubles, 5 months of training, November 30, 2023.
You will master all the necessary machine learning skills for streaming data and distributed environments. The program includes the necessary knowledge from the fields of Data Science and Data Engineering, which will allow you to process big data and write distributed algorithms in Spark.
You will practice the material of each module through homework. At the end of the training, you will complete a final project that brings together everything you have learned and can be added to your portfolio. It can be carried out as part of your work tasks on your own dataset or as a learning project based on data provided by OTUS.
Who is this course for?
For Machine Learning specialists or software engineers who want to learn how to work with big data. Such tasks typically arise in large IT companies with a large-scale digital product.
For Data Scientists who want to strengthen their skill set with engineering skills. After the course, you will be able to process data and bring ML solutions to production on your own.
To take the course, you will need basic data science skills. We suggest you look at the Map of Data Science courses at OTUS to find out the required level of training.
You will learn to:
- Use standard ML pipeline tools in a distributed environment;
- Develop your own blocks for ML pipelines;
- Adapt ML algorithms to distributed environments and big data tools;
- Use Spark, SparkML, Spark Streaming;
- Develop algorithms for streaming data preparation for machine learning;
- Ensure quality control at every stage of moving ML solutions into production.
Demand for specialists
The skills you will master are highly practical and promising. More and more digital products are appearing on the market whose development requires working with big data and stream processing. Specialists with this skill set and some work experience can already expect a salary of 270 thousand rubles. Another trend, the automation of training and validation, to some extent devalues the work of a classic Data Scientist: everything is moving toward the point where even a non-specialist can run fit-predict. As a result, those with even basic engineering skills are already at a premium.
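For readers unfamiliar with the term, fit-predict refers to the common two-step model interface: train on data with fit, then apply the trained model with predict. Below is a minimal sketch in plain Python. The class name, the one-feature linear model and the toy data are all hypothetical and are not taken from the course materials.

```python
class SimpleLinearRegression:
    """Hypothetical one-feature linear model with a fit/predict interface."""

    def fit(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        # Closed-form least-squares slope and intercept for a single feature.
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = cov_xy / var_x
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, xs):
        # Apply the fitted line to new inputs.
        return [self.intercept + self.slope * x for x in xs]


# Toy data generated from y = 2x + 1 (illustrative only).
model = SimpleLinearRegression().fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
predictions = model.predict([5.0, 6.0])
```

The two calls at the end are the whole workflow the paragraph refers to. The engineering work around them (data preparation, distribution, deployment, monitoring) is what this sketch leaves out.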
Course Features
Lots of practice working with data
Wide range of skills, from distributed ML and stream data processing to deployment in production
Current tools and technologies: Scala, Spark, Python, Docker
Live communication with experts via webinars and Slack chat
4 courses
Works on developing a Data Science team that provides machine-learning-based functionality for the company’s products and services. As a Data Scientist, he participated in the development of Kaspersky MLAD and MDR AI Analyst. As a C++ developer, he participated in the creation of MaxPatrol SIEM. He has taught computer science disciplines at MSTU GA for many years. Author of a series of talks on ML, C++, DS project management and team development. Member of the program committee of the C++ Russia conference. Program Manager.
8 courses
20+ years of experience in custom development projects in IT. Dozens of successful projects, including those under government contracts. Experience in the development and implementation of ERP systems, open-source solutions, and support for high-load applications. Teaches courses on Linux, Kuber, MLOps, DataOps, SolutionArchitect, IaC, SRE, and mentors the HighLoad course.
1 course
Specialist in big data and machine learning. He worked at Odnoklassniki.ru for 8 years, where he managed the OK Data Lab team (a laboratory for researchers in the field of big data and machine learning). Big data analysis at Odnoklassniki became a unique chance to combine theoretical training and a scientific foundation with the development of real, in-demand products. Since 2019, he has been working at Sberbank as Managing Director, leading the cluster that develops a platform for recommendation systems in the mass personalization division. He graduated from St. Petersburg State University in 2004, where he defended his PhD thesis on formal logical methods in 2007. He worked in outsourcing for almost 9 years without losing contact with the university and the academic community.
Introductory basics for starting the course
- Topic 1. Gradient descent and linear models
- Topic 2. Overview of basic machine learning methods and metrics
- Topic 3. Evolution of approaches to working with data
- Topic 4. Basics of programming in Scala
Technological basis of distributed data processing
- Topic 5. Distributed file systems
- Topic 6. Resource managers in distributed systems
- Topic 7. Evolution of massively parallel and distributed computing frameworks
- Topic 8. Apache Spark 1 Basics
- Topic 9. Apache Spark 2 Basics
Distributed ML Basics
- Topic 10. Porting ML algorithms to a distributed environment
- Topic 11. ML in Apache Spark
- Topic 12. Developing your own blocks for SparkML
- Topic 13. Hyperparameter optimization and AutoML
Stream processing
- Topic 14. Stream data processing
- Topic 15. Third-party libraries for use with Spark
- Topic 16. Spark Streaming
- Topic 17. Structured and continuous streaming in Spark
- Topic 18. Alternative streaming frameworks
Goal setting and results analysis
- Topic 19. Defining the goal of an ML project and preliminary analysis
- Topic 20. Long-term ML goals, using churn reduction as an example
- Topic 21. A/B testing
- Topic 22. Additional topics
Bringing ML results to production
- Topic 23. Approaches to bringing ML solutions into production
- Topic 24. Versioning, reproducibility and monitoring
- Topic 25. Online serving of models
- Topic 26. Patterns for asynchronous streaming ML and ETL
- Topic 27. If you need Python
ML in Python in production
- Topic 28. Production code in Python. Organizing and packaging code
- Topic 29. REST architecture: Flask API
- Topic 30. Docker: structure, application, deployment
- Topic 31. Kubernetes, container orchestration
- Topic 32. MLOps tools for Kubernetes: KubeFlow, Seldon Core. Specifics of operating heterogeneous systems in industry
- Topic 33. Amazon SageMaker
- Topic 34. AWS ML Service
Advanced topics
- Topic 35. Neural networks
- Topic 36. Distributed training and inference of neural networks
- Topic 37. Gradient boosting on trees
- Topic 38. Reinforcement learning
Project work
- Topic 39. Choosing a topic and organizing project work
- Topic 40. Consultation on projects and homework
- Topic 41. Defense of project work