Health informatics

Academic year 2022/2023
3
Maximum credits
36
Total hours
SSD
INF/01
Language
English
Learning objectives
The Health Informatics course is divided into two modules dealing with the use of software for basic data analysis. In the first module, Excel functions will be used to illustrate basic data analysis functionalities, while in the second module the Python programming language will be used.
Expected learning outcomes
At the end of the first module, students will be able to use the Excel Analysis ToolPak and pivot tables for statistical calculations. At the end of the second module, students will be able to use a full programming language for basic data manipulation and statistical analysis.
Single course

This course cannot be taken as a single course. You can find the available courses by consulting the single courses catalogue.

Syllabus and course organization

Single edition

Course coordinator

Syllabus
The course will address the following topics:

MODULE 1: DATA EXPLORATION AND DESCRIPTIVE ANALYSIS
The first part is closely connected with the biostatistics course and involves using Excel to present patient data and make statistical comparisons.
· Research problems categorization
· Structured and semi-structured data sources
Tables: Records and features
Data Types (nominal, ordinal, ranked, discrete, continuous)
· Exploratory Data Analysis
Statistics, query, summarization
Graphical representation
References for this part are (numbers refer to BIBLIOGRAPHY section):
· Spreadsheets: a few tips (Ariel platform)
· [4] Chapters 2, 3
· [1] Chapter 1
· [3] Chapters 2, 3
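The exploratory step listed above (frequency tables for nominal features, location and spread for continuous ones) can be sketched in a few lines of Python, the language named in the learning objectives. This is only an illustrative sketch: the patient records, the feature names and the `describe` helper are made up for the example and are not part of the course material.

```python
from collections import Counter
import statistics

# Hypothetical patient records (made-up values, for illustration only).
patients = [
    {"sex": "F", "age": 34, "systolic_bp": 118.0},
    {"sex": "M", "age": 51, "systolic_bp": 135.5},
    {"sex": "F", "age": 47, "systolic_bp": 127.0},
    {"sex": "M", "age": 62, "systolic_bp": 142.0},
    {"sex": "F", "age": 29, "systolic_bp": 110.5},
]


def describe(values):
    """Return basic descriptors for a discrete or continuous feature."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,
    }


# Nominal feature: summarize with a frequency table.
sex_counts = Counter(p["sex"] for p in patients)

# Continuous feature: summarize with location and spread.
bp_summary = describe([p["systolic_bp"] for p in patients])

print(sex_counts)
print(bp_summary)
```

The same summaries correspond to what a pivot table (counts per category) and the descriptive statistics tool produce in a spreadsheet.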
MODULE 2: DATA PREPROCESSING FOR PREDICTIVE MODELS
Graphical Programming Tool: Orange
· Installation of the tool and interface exploration
· Simple use cases
Data Preprocessing
· Outliers, Missing Values, Data Representation, Standardization, Discretization, Feature Engineering, (Unbalanced data)
References for this module are:
· [4] Chapters 2, 4
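Three of the preprocessing steps named above (missing values, outliers, standardization) can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the procedure used in the course, where these steps are carried out in Orange: median imputation and the 1.5 × IQR rule are common choices picked here for the example, and the lab values and function names are hypothetical.

```python
import statistics

# Hypothetical lab values with one missing entry (None) and one extreme value.
raw = [4.8, 5.1, None, 5.0, 4.9, 12.0, 5.2]


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def standardize(values):
    """Rescale to zero mean and unit (sample) standard deviation."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sd for v in values]


filled = impute_median(raw)
outliers = iqr_outliers(filled)
z_scores = standardize(filled)

print(filled)
print(outliers)
print(z_scores)
```

Whether a flagged value is a measurement error or a genuine extreme observation is a judgment about the data, which the rule itself cannot make.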

MODULE 3: INTRODUCTION TO STATISTICAL LEARNING FOR PREDICTIVE MODELS
Supervised/Unsupervised Learning
· Classification, Regression, Clustering
Model Workflow
· Bias/Variance and overfitting, Holdout method
Supervised Learning
· Feature selection
· Classification: Classification Tree, Logistic Regression
· (Regression: Linear Regression)
(Unsupervised Learning)
· Dimensionality Reduction: PCA, Data embedding
· Clustering
References for this module are:
· [4] Chapters 19, 20
· [5] Chapters 2, 3.1-3.3, 4.1-4.3, 5, 8, (12)
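The holdout method from the model workflow above can be sketched as follows: part of the data is set aside, the model is fitted on the rest, and accuracy is compared on the two parts. The sketch is illustrative only. The classifier is a one-threshold rule (the simplest possible classification tree), chosen to keep the example short, and the data, the 30% test fraction and all function names are assumptions made for the example.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into a training and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(rows, threshold):
    """Share of rows for which the rule 'x > threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)


def fit_stump(train):
    """Pick the threshold on x that maximizes accuracy on the training set."""
    xs = sorted({x for x, _ in train})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: accuracy(train, t))


# Hypothetical data: (biomarker value, disease label), made up for illustration.
data = [(1.0, 0), (1.4, 0), (2.1, 0), (2.5, 0), (2.9, 1),
        (3.3, 0), (3.8, 1), (4.2, 1), (4.9, 1), (5.5, 1)]

train, test = holdout_split(data)
threshold = fit_stump(train)

print("threshold:", threshold)
print("train accuracy:", accuracy(train, threshold))
print("test accuracy:", accuracy(test, threshold))
```

A training accuracy clearly higher than the test accuracy is the usual symptom of overfitting; with so few rows, the test estimate is itself very noisy.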

EXPECTED OUTCOMES

1. Recognize real-world scenarios where statistical and automatic learning tools can provide advantages for the analysis.
2. Recognize the difference between descriptive and predictive analysis.
3. Identify proper methods suited for specific research questions.
4. Compute statistical descriptors on a dataset.
5. Implement simple prediction models.
6. Interpret the results of the statistical analysis/models with respect to the input data and the modelling assumptions.
7. Identify and effectively communicate results of the statistical analysis.
Prerequisites
To take the Health Informatics exam, students must have already passed all the exams of the first year (Fundamentals of Basic Sciences, Cells Molecules and Genes 1 and 2, Human Body) and the exam of Functions.

· Good knowledge of Excel.
Suggested material: https://support.office.com/en-us/article/introduction-to-excel-starter601794a9-b73d-4d04-b2d4-eed4c40f98be
· Basic knowledge of analysis and statistics
Teaching methods
Synchronous learning: Lectures by the teachers will be the main teaching method throughout the course.
Asynchronous learning: Data from the literature will be provided for practising data analysis with software.

ATTENDANCE:
Attendance is required in order to take the exam. Unexcused absences are tolerated for up to 34% of the course activities. The university policy on excused absences due to illness applies.
Reference material
Bibliography
1. Leslie E. Daly and Geoffrey J. Bourke, "Interpretation and Uses of Medical Statistics", 5th edition. (Available online in Unimi library)
2. J. Mark Elwood, "Critical Appraisal of Epidemiological Studies and Clinical Trials", 3rd Edition, Oxford University Press
3. Douglas G. Altman, "Practical statistics for medical research". Chapman and Hall
4. Marcello Pagano, Kimberlee Gauvreau, "Principles of Biostatistics", 2000, Duxbury Press. (Available online in Unimi library)
5. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani. "An Introduction to Statistical Learning: with Applications in R". New York: Springer, 2013. (Available online)

Software
· Microsoft Excel
· Orange Data Mining https://orangedatamining.com/
Assessment methods and evaluation criteria
The exam consists of a written test containing both theoretical and practical questions.
The written test is delivered on the Moodle platform and consists mainly of multiple-choice questions and short-answer numerical questions. Online statistical calculators will be needed to answer some of the numerical questions. Grades are on a scale of 30, and a minimum of 18/30 is required to pass the written test.

Registration for the exam through SIFA is mandatory.
INF/01 - COMPUTER SCIENCE - University credits (CFU): 3
Lectures: 24 hours
: 12 hours
Course websites
Professor(s)
Office hours:
By appointment, arranged in advance by e-mail
Laboratorio di Statistica Medica, Biometria ed Epidemiologia "G.A. Maccacaro", Via Celoria 22, Milano