Processing and Analysis of Big Data: a free course from Open Education. Duration: 2 weeks, about 36 hours per week. Date: November 29, 2023.
Miscellaneous / November 30, 2023
Ph.D. Position: Associate Professor, Faculty of Control Systems and Robotics, Associate Professor, Higher School of Digital Culture, ITMO University
Candidate of Physical and Mathematical Sciences Position: Associate Professor, Higher School of Digital Culture, ITMO University
Ph.D. Position: Associate Professor, Higher School of Digital Culture, ITMO University
Module 1
Topic 1.1. Introduction to data science. Covers types and sources of data, principles of separating and combining data, types of measurement scales, methods for cleaning data and filling in gaps, and control ranges.
Topic 1.2. Data processing tools. Covers primary data processing tools such as spreadsheets (Google Sheets and Excel), sorting and filtering data, and tools for aggregating and analyzing tabular data (pivot tables).
Topic 1.3. Data visualization. Covers the tasks and methods of data visualization in various tools (Google Sheets and Excel) and forms of presenting quantitative and qualitative data. Cognitive data visualization is also covered.
Topic 1.4. Analysis and transformation of data. Covers methods of smoothing and normalizing data and issues of data transformation. The types of descriptive statistics and the methods for calculating them are described in detail.
Topic 1.5. Working with time series. Covers the principles of working with time series and methods for analyzing them. Particular attention is paid to techniques for smoothing time series and for determining their trend and seasonal components.
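To make the Topic 1.4 and 1.5 material concrete, here is a minimal sketch of smoothing, normalization, and descriptive statistics. The course itself works in spreadsheets, so using Python is an assumption made here for illustration. The function names (moving_average, min_max_normalize) and the sample series are hypothetical, and the simple trailing moving average and min-max scaling are only two of several possible techniques.

```python
from statistics import mean, median, stdev


def moving_average(values, window):
    """Smooth a series with a simple trailing moving average.

    Returns one value per full window, so the result is shorter than
    the input by window - 1 elements.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical observations, invented purely for illustration.
series = [10, 12, 11, 15, 14, 18, 17, 21]

smoothed = moving_average(series, window=3)
normalized = min_max_normalize(series)

# A few common descriptive statistics for the raw series.
summary = {
    "mean": mean(series),
    "median": median(series),
    "stdev": stdev(series),  # sample standard deviation
}

print(smoothed)
print(normalized)
print(summary)
```

A wider window gives a smoother curve but reacts more slowly to changes, which is the usual trade-off when estimating a trend in a time series.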
Module 2. Big Data Storage
Topic 2.1. Database management systems. Covers the architecture of information systems and the main functions of database management systems.
Topic 2.2. Designing structured data. Covers the basic concepts of the relational (tabular) data model, designing data in the relational model, and the rules for creating tables and defining integrity constraints.
Topic 2.3. SQL: queries to data and database objects. Covers the principles of constructing data queries in SQL, including projection, sorting, selection conditions, joining multiple tables, set-theoretic operations, and nested queries. The lecture also discusses database objects: views, procedures/functions, and triggers. It introduces indexes, which can make a number of queries more efficient to execute.
Topic 2.4. NoSQL storage. Covers the basic concepts and characteristics of NoSQL systems and the various types and ratings of NoSQL systems: key-value, document, column, and graph. Also covers the principles of building data queries in NoSQL storage.
Topic 2.5. MongoDB: working with document storage. Covers organizing data and building queries in MongoDB. Examples of building queries in the MongoDB demo database are provided.
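The relational concepts from Topics 2.2 and 2.3 can be sketched in a few lines. The course listing does not name a specific DBMS, so the use of SQLite through Python's built-in sqlite3 module is an assumption. The department and employee tables, their columns, and the sample rows are all hypothetical. The sketch shows integrity constraints, an index, a view built on a join, selection with sorting, and a nested query.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: tables with integrity constraints
# (PRIMARY KEY, NOT NULL, UNIQUE, CHECK, foreign key reference),
# an index on the join column, and a view that hides the join.
conn.executescript("""
    CREATE TABLE department (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        salary        REAL CHECK (salary >= 0),
        department_id INTEGER NOT NULL REFERENCES department(id)
    );
    CREATE INDEX idx_employee_department ON employee(department_id);
    CREATE VIEW employee_with_department AS
        SELECT e.name AS employee, e.salary, d.name AS department
        FROM employee AS e
        JOIN department AS d ON d.id = e.department_id;
""")

conn.executemany(
    "INSERT INTO department (id, name) VALUES (?, ?)",
    [(1, "Analytics"), (2, "Storage")],
)
conn.executemany(
    "INSERT INTO employee (name, salary, department_id) VALUES (?, ?, ?)",
    [("Ann", 120.0, 1), ("Bob", 90.0, 1), ("Cid", 100.0, 2)],
)

# Projection (two columns), selection (WHERE), and sorting (ORDER BY),
# reading through the view so the join is already applied.
rows = conn.execute(
    "SELECT employee, department FROM employee_with_department "
    "WHERE salary >= ? ORDER BY salary DESC",
    (100.0,),
).fetchall()

# Nested query: employees paid more than the overall average salary.
above_avg = conn.execute(
    "SELECT name FROM employee "
    "WHERE salary > (SELECT AVG(salary) FROM employee) "
    "ORDER BY name"
).fetchall()

print(rows)
print(above_avg)
```

Passing values through ? placeholders rather than formatting them into the SQL string keeps the query text fixed and avoids injection problems.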