CS211
Basics of Databases

Faculty
Nikolay Golov
CPO of Tengri Data Platform
Course length
Duration
Total hours
Credits
Language
Course type
Fee for single course
Fee for degree students
Skills you’ll learn
Overview
The course examines the basics of modern database technologies, from the point of view of database developer and analyst, and its usage for data-intensive applications. A particular focus is given to choosing the right database technology for a task at hand and tradeoffs between performance, ease of use, data integrity and other considerations. Hands-on lessons are done using SQLite, PostgreSQL, MongoDB, Redis and Snowflake databases.
The course starts with a brief overview of a data storing task for an application. We proceed with a list of requirements, which proved to be essential for a data storing tool: ACID, transactions, availability of data access languages (SQL, etc.). Afterwards, we will illustrate why given requirements determined the market dominance of classical relational databases (Oracle, MS SQL, PostgreSQL, MySQL, etc.) at the end of the 20th century. Later we describe why technological advances of the 21st century gave birth to a set of non-classical databases, such as in-memory storage, document storage, columnar storage, etc. The bulk of the remaining course focuses on the tradeoffs to be considered during technology selection and the database design. We emphasize the difference between OLAP (analytical) and OLTP tasks and modern data warehouse designs (Data Vault, Anchor Modeling, etc.).
A significant amount of time is dedicated to the practical skills needed for developers and analysts to work with most of the databases listed above. Course will try to illustrate what type of skills (mostly SQL) can be useful, regardless of the database selected: SQLite, PostgreSQL, MySQL, Snowflake, CockroachDB or Spanner. Course will illustrate similarities between all the databases, and the important differences between them. Practical tasks will help to develop skills on how to master a new database, discover its capabilities and limitations.
Learning highlights
- As a result of this course, students shall be able to select appropriate database engines for a task.
- Students shall understand the pros and cons of different existing database engines.
- Students shall know and efficiently use SQL, both for OLTP and analytical tasks.
- Students shall be able to create a data model for practical tasks for a selected database.
Course outline
15 classes
Session 1
- Introduction
- Data storage in general
- CRUD
- Evolution of approaches, the birth of relational model
- Data Normalization
- SQLite
- Task 01 - SQLite
Session 2
- SQLite
- SQL - Create, Insert, Select, Group by
- ACID - Durability, Isolation, risks of improper Isolation
Session 3
- ACID - atomicity
- SQL - transactions, commit/rollback
- SQLite
- Rollback journal, Write-Ahead-log
- Basic indexes
Session 4
- Classical client-server databases: PostgreSQL, MSSQL, Oracle, MySQL
- Master/Slave replication
- ACID - Isolation
- Transaction isolation levels
- Replication techniques
Session 5
- PostgreSQL
- Different types of indexes: B-tree, hashtable, projection
- SQL - Join, View
- Analytics
- OLTP vs OLAP tasks for classical databases
- Kimball vs Inmon
- Task 02: PostgreSQL
Session 6
- PostgreSQL
- “Star” schema, “Snowflake” schema
- Analytical SQL: window functions
- Modern BI tools - Tableau, Looker
- Analytical SQL
- Make your own BI tool
- Slow Changing Dimensions - practice
Session 7
- Redis
- Limits of classical databases: single master, raw-storage, inefficiency
- Key-value storages
- Rethinking everything
- Memcached
- Using key-value as cash
- Sharding approach
- Task 03: Redis
Session 8
- MongoDB
- Document-oriented databases
- JSON and SQL, document store for classical DB
- Sharding
- Risks of inconsistency
Session 9
- Column storages
- OLTP vs OLAP tasks for columnar databases
- Vertica, Greenplum, ClickHouse, Snowflake
- Sharding approach
- SQL - window functions
Session 10
- Snowflake
- Modern analytics using columnar databases: Data Vault, Anchor Modeling
- Big data clickstream analytics
- Task 04
Session 11
- Data Modeling for real tasks: Normalization VS Denormalization
- Lambda architecture
Session 12
- Databuses
- Kafka, Pulsar
- Databuses for OLTP (event-based architecture) and OLAP (data streaming)
Session 13
- Polyglot persistence
- Modern databases on a Performance/Complexity/Delay graph
- Proper roles of classical, key-value, document-orients, columnar databases and data-buses
- Risks of polyglot persistence
- Eventual consistency
- SAGA pattern
Session 14
- CAP theorem - meaning and applications
- Clouds change everything
- Managed databases
- Serverless databases
- BigQuery, Snowflake, YDB, CockRoachDB
Session 15
Final Quiz
Course materials
Media
Methodology
Lectures
Discussions
Practice with some database, using SQL IDE or through the python applications.
Few projects to try new skills and tools for practical tasks.
Grading
Nikolay got his M.S. degree in applied mathematics and cybernetics from Moscow State University, Russia. Afterwards, he had 15 years of experience building data platforms for various startups and enterprises. From 2013 until 2019, he headed the Data Platform of Avito, Craigslist of Russia, which grew to a multi-billion-dollar company from a small startup. In Avito, he was responsible for analytical databases (Vertica, ClickHouse), OLTP engines (PostgreSQL, Redis, MongoDB), and data buses (Kafka) for analytics and microservices. Later he was Head of Data Platform at ManyChat (a California and Barcelona-based SaaS startup), responsible for the implementation and growth of its Data Platform (AWS+Redis+Snowflake+Tableau), which is being used for analytics and AI. Currently Nikolay is a CPO of a startup, creating a new analytical database, Tengri Data Platform.
See full profileApply for this course
Basics of Databases
by Nikolay Golov
Total hours
45 Hours
Dates
May 23 - Jun 10, 2022
Fee for single course
€1500
Fee for degree students
€750
How to secure your spot
Complete the form below to kickstart your application
Schedule your Harbour.Space interview
If successful, get ready to join us on campus
FAQ
Will I receive a certificate after completion?
Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belonged to.
Do I need a visa?
This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.
Can I get a discount?
Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.