Data management

Academic year: 2023/2024
Maximum credits: 6
Total hours: 56
SSD: INF/01
Language: English
Learning objectives
The course provides the basic concepts of data management, with a focus on structured databases and on unstructured (big) data. It introduces relational database systems and the SQL query language. To build a deeper understanding of relational data management in real contexts, examples of relational schemas and of SQL queries for selective data extraction are presented and discussed. Recent NoSQL solutions for unstructured data management are also illustrated, with a special focus on the MongoDB document-oriented system and on the basics of the Python scripting language, which is used to interact with a MongoDB database for data extraction and manipulation.
The contents of this course give students the data management background that they will apply first in subsequent classes (in particular in the laboratories) and later in their professional careers, when monitoring, analysing, and addressing natural resource management issues.
Expected learning outcomes
· Knowledge and understanding.
Students are expected to understand relational database schemas and languages.
Students are expected to understand the principles of data organization in NoSQL systems, together with basic notions of scripting in the Python language.

· Applying knowledge and understanding.
Students will be able to describe the meaning, properties, relationships, and constraints that characterize the data stored in a database.
Students will be able to apply the concepts, models, and languages introduced in the course to formulate SQL queries over a database schema, with appropriate conditions to filter and retrieve the data that satisfy specific user needs, including on real databases from environmental contexts.
Students will be able to apply NoSQL concepts and Python programming principles illustrated in the course for data extraction, aggregation, and manipulation over a MongoDB database.
Single course

This course can be attended as a single course.

Syllabus and teaching organization

Single edition

Course coordinator
Period
First semester

Syllabus
Part I (3 CFU)
Introduction to relational databases. Databases and database management systems (DBMS). Data definition languages and data manipulation languages for databases.
The relational model. Queries in the SQL language. Simple queries and group queries with aggregate operators. Queries with set operators. Nested queries.
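
The four query types listed above can be sketched as follows. This is only an illustrative example, not course material: the `site` and `measurement` tables, their columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` module is assumed as the database engine so that the example runs without a server.

```python
import sqlite3

# In-memory database with a small, hypothetical environmental schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site (
    site_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    region  TEXT NOT NULL
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    site_id INTEGER NOT NULL REFERENCES site(site_id),
    year    INTEGER NOT NULL,
    ph      REAL NOT NULL
);
INSERT INTO site VALUES
    (1, 'Lake A', 'North'), (2, 'River B', 'North'), (3, 'Pond C', 'South');
INSERT INTO measurement VALUES
    (1, 1, 2021, 7.0), (2, 1, 2022, 8.0), (3, 2, 2022, 6.0);
""")

# Simple query: filter rows with a WHERE condition.
simple = conn.execute(
    "SELECT name FROM site WHERE region = 'North' ORDER BY name"
).fetchall()

# Group query: one row per site, with an aggregate operator (AVG).
grouped = conn.execute("""
    SELECT s.name, AVG(m.ph)
    FROM site s JOIN measurement m ON m.site_id = s.site_id
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

# Set operator: sites that have no measurement at all.
without_measurements = conn.execute("""
    SELECT site_id FROM site
    EXCEPT
    SELECT site_id FROM measurement
""").fetchall()

# Nested query: sites with at least one pH reading above 7.5.
nested = conn.execute("""
    SELECT name FROM site
    WHERE site_id IN (SELECT site_id FROM measurement WHERE ph > 7.5)
""").fetchall()

print(simple, grouped, without_measurements, nested)
```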

Part II (3 CFU)
Introduction to NoSQL databases. Data models for NoSQL. Types of NoSQL systems. Comparison with the relational model.
The "document-oriented" data model. The MongoDB system. Collections in MongoDB. Collection queries in MongoDB. The aggregation pipeline in MongoDB.
Python language. Principles of programming and introduction to the language. Data structures and data types.
The Pandas library for data manipulation.
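
As a rough illustration of how the Part II topics fit together, the sketch below writes a MongoDB aggregation pipeline as plain Python data and then computes the same result with Pandas on a handful of sample documents. Everything specific here is an assumption made for the example: the `envdata` database, the `measurements` collection, the field names, and the connection URI are hypothetical. The `run_on_mongodb` function is never called in the example, since it needs the `pymongo` package and a running MongoDB server.

```python
import pandas as pd

# Hypothetical documents, shaped as they might be stored in a collection.
documents = [
    {"site": "Lake A", "year": 2019, "ph": 6.5},
    {"site": "Lake A", "year": 2021, "ph": 7.0},
    {"site": "Lake A", "year": 2022, "ph": 8.0},
    {"site": "River B", "year": 2022, "ph": 6.0},
]

# Aggregation pipeline: keep readings from 2020 onwards, then compute
# the average pH per site, sorted by site name.
pipeline = [
    {"$match": {"year": {"$gte": 2020}}},
    {"$group": {"_id": "$site", "avg_ph": {"$avg": "$ph"}}},
    {"$sort": {"_id": 1}},
]


def run_on_mongodb(uri="mongodb://localhost:27017"):
    """Run the pipeline on a MongoDB server (hypothetical names and URI)."""
    from pymongo import MongoClient  # imported lazily: optional dependency

    client = MongoClient(uri)
    collection = client["envdata"]["measurements"]
    return list(collection.aggregate(pipeline))


# The same computation with Pandas on the in-memory documents.
df = pd.DataFrame(documents)
recent = df[df["year"] >= 2020]                               # like $match
avg_ph = recent.groupby("site")["ph"].mean().sort_index()     # like $group + $sort
result = avg_ph.to_dict()
print(result)
```

Keeping the pipeline as a list of dictionaries makes each stage easy to inspect or test on its own before it is sent to a server.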
Prerequisites
None.
Teaching methods
For attending and non-attending students: slides and handouts are published progressively on the Ariel website https://scastanodm.ariel.ctu.unimi.it/
Reference material
Relational databases and SQL:
- P. Atzeni, S. Ceri, S. Paraboschi, R. Torlone, Database Systems - Concepts, Languages and Architectures - McGraw-Hill, available online at http://dbbook.dia.uniroma3.it/ (Chapters 1, 2, 4).

Python / NumPy / Pandas:
- Allen B. Downey, Think Python 2nd Edition - O'Reilly Media, available online at https://greenteapress.com/wp/think-python-2e/

- J. VanderPlas, Python Data Science Handbook - O'Reilly Media, available online at https://jakevdp.github.io/PythonDataScienceHandbook/ (Chapters 2, 3, 4).

MongoDB:
https://www.mongodb.com/docs/manual/tutorial/getting-started/

Online resources and handouts provided throughout the lectures are available on the Ariel website https://scastanodm.ariel.ctu.unimi.it/
Assessment methods and evaluation criteria
The exam is written (90 minutes) and consists of quizzes, questions, and exercises covering the course syllabus. The result is expressed in thirtieths. The exam verifies that the course objectives have been achieved, namely that students have learned the basic concepts of relational and NoSQL data organization; that they can interpret requests and write correct queries to extract and organize the appropriate data from NoSQL and relational databases for a given target; and that they have learned the fundamentals of the Python language for data management.
For attending students only: a first intermediate exam covering relational databases and SQL is planned, followed by a second intermediate exam (at the end of the course) covering NoSQL databases and Python.
INF/01 - COMPUTER SCIENCE - CFU: 6
Computer lab exercises: 16 hours
Lectures: 40 hours
Instructor(s)
Office hours:
By appointment via email