Databases and exposure scenarios

Academic year 2019/2020
Maximum credits: 6
Total hours: 48
SSD (scientific-disciplinary sector): INF/01 SECS-S/01
Language: English
Learning objectives
The course is organized into two parts, namely "Informatics and databases" and "Statistics applied to epidemiology".
The "Informatics and databases" part of the course aims at providing the basic concepts of databases and database management systems, with a focus on relational data modeling and the SQL query language. To develop a deeper understanding of how relational data are organized in real contexts, example relational schemas of biological databases, and SQL queries that extract data from them, are presented and discussed.
As regards statistics, the course aims at providing the fundamental concepts of descriptive and inferential statistics and of epidemiological study design. The course also provides concrete tools for applying the main statistical techniques to real cases. At the end of the course, students should demonstrate knowledge and understanding of the main statistical techniques for describing and analysing the phenomena under study, and of the basic principles for setting up an epidemiological study. They should be able to apply the knowledge acquired and to interpret the results of statistical analyses, and should have developed the skills needed to continue studying statistical analysis and epidemiology independently.
Expected learning outcomes
Regarding informatics and databases, students are expected to be able to understand relational database schemas and languages and to describe the meaning, properties, relationships, and constraints that characterize the data stored in a database. Students will be able to apply the concepts, models, and languages introduced in the course to formulate SQL queries over a database schema, with appropriate conditions to filter and retrieve the target data that satisfy specific user needs, including on real biological databases.
As regards statistics, students are expected to have assimilated the concepts presented in the course and to be able to critically compare the use of different statistical tests and study designs. In addition, students will develop the basic skills necessary to design an epidemiological study and carry out its statistical analysis.
Course syllabus and organization

Single edition

Course coordinator
Period: First semester
Syllabus
Informatics and Databases.
Introduction to databases. Information systems, information and data. Database and Database Management System (DBMS). Data models. Schemas and instances. Abstraction levels in DBMSs. Database languages and users.
Relational databases. The relational model. Relations and tables. Relations with attributes. Relations and databases. Incomplete information and null values. Integrity constraints. Definitions and properties of keys. Primary key and foreign key constraints.
Query languages for relational databases: SQL. Basic SQL query format. Selection and projection queries. Join queries (inner join, natural join, outer joins). Aggregate queries. Group by queries. Set (union, intersection, difference) queries. Nested queries. Correlated nested queries.
Conceptual data modeling with the Entity-Relationship model.
Introduction to biological databases. Direct access to relational biological databases. The Ensembl database and its schema; sample queries over the Ensembl database.
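As an illustration of the relational concepts and SQL query types listed above, here is a minimal sketch. It is not taken from the course material: the tables `gene` and `transcript`, their columns, and the sample rows are hypothetical and far simpler than the actual Ensembl schema, and SQLite (via Python's standard `sqlite3` module) is assumed only because it needs no installation.

```python
import sqlite3

# In-memory database; SQLite enforces foreign keys only when asked to.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical two-table schema with primary and foreign key constraints.
conn.executescript("""
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    biotype TEXT                      -- may be NULL (incomplete information)
);
CREATE TABLE transcript (
    transcript_id INTEGER PRIMARY KEY,
    gene_id       INTEGER NOT NULL REFERENCES gene(gene_id),
    length        INTEGER
);
INSERT INTO gene VALUES
    (1, 'geneA', 'protein_coding'),
    (2, 'geneB', 'protein_coding'),
    (3, 'geneC', NULL);
INSERT INTO transcript VALUES (10, 1, 1200), (11, 1, 800), (12, 2, 500);
""")

# Selection (WHERE) and projection (column list).
coding_genes = conn.execute(
    "SELECT name FROM gene WHERE biotype = 'protein_coding' ORDER BY name"
).fetchall()

# Inner join: one row per (gene, transcript) pair.
joined = conn.execute(
    "SELECT g.name, t.transcript_id "
    "FROM gene g JOIN transcript t ON t.gene_id = g.gene_id "
    "ORDER BY t.transcript_id"
).fetchall()

# Left outer join with GROUP BY and an aggregate:
# genes without transcripts are kept, with a count of 0.
counts = conn.execute(
    "SELECT g.name, COUNT(t.transcript_id) "
    "FROM gene g LEFT JOIN transcript t ON t.gene_id = g.gene_id "
    "GROUP BY g.gene_id, g.name ORDER BY g.name"
).fetchall()

# Nested query: genes that have no transcript at all.
no_transcripts = conn.execute(
    "SELECT name FROM gene "
    "WHERE gene_id NOT IN (SELECT gene_id FROM transcript)"
).fetchall()

# Foreign key constraint: a transcript of a non-existent gene is rejected.
try:
    conn.execute("INSERT INTO transcript VALUES (13, 99, 300)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(coding_genes, joined, counts, no_transcripts, fk_rejected)
```

The left outer join is used for the counts so that a gene with no transcripts still appears in the result; an inner join would silently drop it.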
Prerequisites
Students must have the basic mathematics knowledge acquired during the three-year (bachelor's) degree course.
Teaching methods
"Informatics and Databases." The teaching consists of lectures supported by slides. The slides, which follow the contents of the lectures, are made available on the ARIEL web site https://scastanodes.ariel.ctu.unimi.it/. During the biological database lectures, students work in small groups on laptops equipped with an SQL system for interactively accessing and querying biological databases.

"Statistics applied to epidemiology." The teaching consists of lectures supported by slides and blackboard exercises. The slides, which follow the contents of the lectures, are made available on ARIEL. During the course, printed statistical tables (also available on the ARIEL web site https://rpizzisae.ariel.ctu.unimi.it/) are distributed so that students can directly follow the analyses presented during the lessons.
Reference material
"Informatics and Databases."
- P. Atzeni, S. Ceri, S. Paraboschi, R. Torlone, Database Systems - Concepts, Languages and Architectures - McGraw-Hill, available on-line at http://dbbook.dia.uniroma3.it/
Chapters: 1 (whole), 2 (whole), 3 (up to and including §3.1.6), 4 (only §4.2 and its subsections), 5 (only §5.2 and its subsections)
- Lecture slides downloadable from the course web site (https://scastanodes.ariel.ctu.unimi.it/).

"Statistics applied to epidemiology."
The teaching material consists of the slides uploaded on ARIEL and of the following books:
- Barbara Illowsky, Susan Dean (2013), Introductory Statistics by OpenStax. 1st Edition, XanEdu Publishing Inc.
https://openstax.org/details/books/introductory-statistics
- Beaglehole, Robert, Bonita, Ruth, Kjellström, Tord & World Health Organization (1993). Basic epidemiology. Updated reprint, World Health Organization. https://apps.who.int/iris/bitstream/handle/10665/36838/9241544465.pdf?s…
- Darrell Huff (1991), How to Lie with Statistics. Penguin. https://archive.org/details/HowToLieWithStatistics
Assessment methods and evaluation criteria
The course exam consists of two separate exams, one for the "Informatics and databases" part of the course and one for the "Statistics applied to epidemiology" part. Each partial exam is graded in thirtieths (on a 30-point scale). The final grade, also expressed in thirtieths, is the average of the two partial grades.

"Informatics and Databases."
The exam consists of a single test. No intermediate tests are foreseen. The exam is written (approximately 1 hour and 30 minutes), covers all the topics presented during the lectures, and consists of multiple-choice questions and exercises. The exam aims to verify that the course objectives have been achieved, namely that students have learned the basic concepts of the relational data model and are able to solve query exercises on relational databases, including biological ones.
The same assessment methods and criteria apply to attending and non-attending students.

"Statistics applied to epidemiology."
The exam consists of a single test. No intermediate tests are foreseen. The test is a written exam (2 hours). Students are assigned a paper from an internationally indexed journal that reports a study analysed with statistical methods presented in class, and must answer open questions on their understanding of the statistical methods used in the paper. To pass the exam, students must demonstrate that they:
- understand the concepts of epidemiological study and basic statistics;
- know how to apply the knowledge acquired to real situations;
- know how to interpret the results obtained from the analyses carried out.
The same assessment methods and criteria apply to attending and non-attending students.
Modules or teaching units
Informatics and Databases
INF/01 - COMPUTER SCIENCE - Credits (CFU): 0
SECS-S/01 - STATISTICS - Credits (CFU): 0
Lectures: 24 hours
Lecturer: Castano Silvana

Statistics applied to Epidemiology
INF/01 - COMPUTER SCIENCE - Credits (CFU): 0
SECS-S/01 - STATISTICS - Credits (CFU): 0
Lectures: 24 hours

Lecturer(s)
Office hours:
By appointment, arranged via email
Office hours:
Check via email
Office at the Department of Computer Science (Dipartimento di Informatica), Crema or Milan site