Data Engineer from zero to Junior - free course from Skillbox, training, Date: November 29, 2023.
Miscellaneous / November 30, 2023
For beginners
Learn Python and SQL from scratch. Learn to collect, analyze and process data. Solve problems based on real cases and add them to your portfolio. You can start a career in Data Engineering while studying.
For programmers
Improve your knowledge of SQL to work with databases. You will go through the entire path of a data engineer from collecting raw data to deploying the model. Experience in programming will help you quickly understand a new profession and change your field.
For beginning analysts
Learn all stages of working with data. Learn to collect information from different sources, build an architecture for storing it, and visualize reports. You will be able to independently prepare data for subsequent analysis.
Author of the Machine Learning course. Senior Data Scientist, Team Lead at SberData, Sber. 5+ years in the profession
Course speaker, R&D Director, UBIC Tech. More than 15 years of experience in development
Data Scientist at Sberbank, mathematician at the Computing Center of the Russian Academy of Sciences. Block “Fundamentals of Mathematics for Data Science”. More than 4 years of experience in teaching higher mathematics
First level: basic training
Get acquainted with the main areas of Data Science, train and implement your first ML model. Gain basic knowledge of mathematics, statistics and probability theory. All this will help you understand the basic principles of working with data. The average completion time is 6 months.
Introduction to Data Science
You will go through all stages of working with data. Learn to identify problems and collect business requirements. You will download data from various sources, conduct exploratory analysis and prepare the dataset for further use. Train and implement a ready-made ML model, and try yourself as a product and marketing analyst. Learn how to formulate and test hypotheses. Master the basic tools for work: Python, SQL, Excel, Power BI, Airflow.
Basic Mathematics for Data Science
Gain basic knowledge of mathematics to work with machine learning. You will understand what approximation, interpolation, functions, regressions, matrices and vectors are. Learn to work with mathematical entities in the SymPy Python library.
Fundamentals of statistics and probability theory
You will understand the principles of working with random variables and events. Become familiar with some types of distributions and statistical tests that are useful in constructing models and testing hypotheses.
Internship opportunity
Basic knowledge and skills are enough to get an internship, so you can continue studying in the course while working at the company.
Second level: Data Engineer Junior
Learn to collect complex data sets, prepare data marts and build data pipelines, deploy DS projects from scratch and test code. You will be ready to work as a Junior Data Engineer. The average completion time is 6 months.
Introductory block
Find out what a Data Engineer does, what role they play in a Data Science project, and what career paths are open to them. You will understand how the course is structured and what topics you will study.
SQL
Learn to manipulate data in existing tables and perform insert, delete and update operations. You will be able to export data from the database in various formats. Learn about window functions and the basics of preparing data marts using SQL. Learn to insert data reliably inside transactions. You will be able to read and understand the transaction log. Learn what indexes are, how they are structured and where they are used. Learn techniques to speed up queries.
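To make two of these topics concrete, here is a minimal sketch of a transactional insert and a window function. It is not taken from the course materials: it assumes Python's built-in sqlite3 module with an in-memory database, and the orders table and its rows are invented for illustration.

```python
import sqlite3

# In-memory database; the table and its rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")

# Used as a context manager, the connection commits if the block
# succeeds and rolls back if the block raises an exception.
with conn:
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("ann", 10), ("ann", 30), ("bob", 20)],
    )

# A failed transaction: the inserted row is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO orders (customer, amount) VALUES ('bob', 99)")
        raise ValueError("simulated failure")
except ValueError:
    pass

# Window function: running total of amounts within each customer.
rows = conn.execute(
    """
    SELECT customer,
           amount,
           SUM(amount) OVER (PARTITION BY customer ORDER BY amount) AS running_total
    FROM orders
    ORDER BY customer, amount
    """
).fetchall()

for row in rows:
    print(row)
```

The row inserted in the failed block does not appear in the result, because the whole transaction was rolled back. Window functions require SQLite 3.25 or newer, which recent Python builds normally bundle.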
Python. LVL 2
Review data types and how they can be converted, and see how Python and SQL work together: getting data from a database, working with the data, and running queries. Learn the basic concepts of JSON and XML data schemas. You will be able to configure application debugging, write tests, and anonymize and encrypt data.
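As a rough illustration of working with JSON and anonymizing a field, the sketch below parses a record and replaces an email address with a keyed hash. Everything in it is an assumption made for the example: the record, the SECRET_KEY value and the pseudonymize helper are hypothetical, and keyed hashing is only one possible approach (it is pseudonymization, not encryption).

```python
import hashlib
import hmac
import json

# Hypothetical key for the example; a real key would come from a secret store.
SECRET_KEY = b"demo-key-change-me"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed SHA-256 digest of the value as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Invented sample record in JSON form.
raw = '{"user": {"email": "ann@example.com", "age": 31}}'

record = json.loads(raw)  # JSON text -> Python dict
record["user"]["email"] = pseudonymize(record["user"]["email"])
anonymized = json.dumps(record, sort_keys=True)  # Python dict -> JSON text

print(anonymized)
```

The same input always maps to the same digest, so anonymized records can still be joined on the hashed field without exposing the original value.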
Libraries for Python
You will learn which libraries are available for working with graphs, supervised learning and metric visualization, and where to find datasets. Learn to use Python and its libraries to work with data. You will continue learning Pandas.
Airflow
Review key concepts and practices for working with Airflow. Learn the fundamentals of its architecture and of interacting with it, from the UI to the CLI. Build your first data pipeline.
Spark Basics
Master Spark: learn what computing resources it runs on, how it stores data, and how it works with memory and disk. Set up your first local test environment. Learn the basics of RDDs: core concepts, working with sources, and actions. Learn to work with the DataFrame API. Study performance and optimization when using DataFrames, data sources and types, working with valid and invalid data, error handling, UDFs, and interaction with Python and SQL.
Basics of Machine Learning Algorithms
You will understand the main types of machine learning models, key terms and definitions. Learn regression algorithms and clustering algorithms.
Deployment
Learn the main stages of preparing a model for deployment, approaches to building an API, and ways to handle errors and debug applications. You will be able to troubleshoot deployment problems and master the basic Swagger tools. Get acquainted with the essentials of bash: writing scripts, working with variables, and using the text-processing tools sed and awk.
Final projects
After completing the first level, you will prepare an introductory project. At the end of the course you will present your final work.
Introduction to Data Science
Consolidate your new knowledge in an individual project: you will go from loading data to implementing a model. Solve the problems of a data engineer, ML engineer and data analyst to decide on your specialization.
Data Engineer
Final project at Junior level. Conduct a cohort analysis and load reference data via an API. Build dashboards based on the data received.
Bonus courses
Developer Career: Employment and Development
You will learn how to choose a suitable vacancy, prepare for an interview and negotiate with an employer. You will be able to quickly get a position that meets your expectations and skills.
Git version control system
Learn to version code changes, create and manage repositories and branches, and resolve merge conflicts. Learn useful rules for working with Git.
English for IT specialists
Gain language skills that will help you pass an interview with a foreign company and communicate comfortably in mixed teams.