Biostatistics
A.Y. 2025/2026
Learning objectives
The course aims to introduce students to biostatistics, i.e. the application of statistical principles to questions and problems in genomics, biology or medicine.
Expected learning outcomes
At the end of this class, students are expected to:
- know basic techniques and tools for the summary and graphical analysis of the information provided by clinical data sets
- apply the methods and techniques of biostatistics to real data sets using appropriate statistical software
- know the basic models for representing and analysing random phenomena, with particular focus on genomics problems, and their application
- be able to apply the methods and tools of biostatistics and survival analysis
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded on a scale of 30
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Lesson period
Second semester
Course syllabus
The course aims to introduce students to biostatistics, i.e. the application of statistical principles to questions and problems in medicine, public health or biology. The main topics of the course will be: the design of experiments, categorical data analysis, time-to-event data, survival analysis and meta-analysis.
Specifically:
- Clinical Data and Health Analytics. Evidence-based medicine. The design of experiments: comparing treatments, random allocation, intention to treat, response bias, clustered clinical trials, observational studies, cross-sectional studies, cohort studies, case-control studies. Bias.
- Categorical data: the chi-squared test for association, tests for 2 by 2 tables (chi-squared test, Fisher's exact test), Yates' continuity correction for the 2 by 2 table, odds and odds ratios, nonparametric methods, the Mann-Whitney U test, the Wilcoxon matched-pairs test, Spearman's and Kendall's rank correlation coefficients, multiple testing correction.
- Time-to-event data: the Kaplan-Meier product-limit estimate of survival, the log-rank test, survival function, hazard function, cumulative hazard function, parametric lifetime models (Exponential, Weibull, Gamma, Log-normal), treatment of missing values, loss to follow-up and intention-to-treat analysis, censoring, Cox proportional hazards model, frailty models.
- Generalized linear models (logistic regression). Mixed effects models (continuous outcome variables, dichotomous outcome variables), funnel plots, meta-analysis (heterogeneity between studies and how to assess heterogeneity).
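As an illustration of two of the techniques listed above (the odds ratio for a 2 by 2 table and the Kaplan-Meier product-limit estimate), a minimal sketch follows. It is written in Python using only the standard library, whereas the course labs use R; the function names, the table layout and the small data set are hypothetical and chosen only for this example.

```python
from collections import Counter


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2 by 2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = exposed / unexposed,
    columns = event / no event. Raises ZeroDivisionError if b * c == 0.
    """
    return (a * d) / (b * c)


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time of each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # subjects leaving the risk set at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at t still count as at risk at t.
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data, for illustration only.
print(odds_ratio(20, 80, 10, 90))
print(kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]))
```

In the toy survival data, the subject censored at time 3 still counts in the risk set at time 3 and is removed afterwards, which is the usual convention for ties between events and censoring.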
Prerequisites for admission
The Statistics course and basic notions of calculus.
Teaching methods
The theoretical part of the course will be delivered through lectures using blackboard and slides. The practical sessions will be delivered through labs using the R software.
Teaching Resources
· Bland, M., An Introduction to Medical Statistics, 4th Edition, Publisher: Oxford University Press, Year: 2015
· David Hosmer; Stanley Lemeshow, Applied Survival Analysis - Regression Modeling Of Time To Event Data, Publisher: John Wiley & Sons, Year: 1999
· Alan Agresti, Categorical Data Analysis, Publisher: John Wiley & Sons, Year: 2002
· Helen Brown, Robin Prescott, Applied Mixed Models in Medicine, Third Edition, Publisher: John Wiley & Sons, Year: 2015, ISBN: 9781118778258
· Ieva F., Masci C., Paganoni A.M., Laboratorio di Statistica con R, Publisher: Pearson, Year: 2016
Assessment methods and Criteria
The course assessment will consist of two parts, namely an individual evaluation and a team project. Both parts are mandatory.
The individual evaluation can be taken on any of the dates scheduled by the School within the academic year. It will consist of an oral exam on any topic presented during the course, scored on a scale from 0 to 30, the maximum evaluation being 30L. The individual evaluation is passed with a score greater than or equal to 18/30. The evaluation will take into account the clarity of the exposition and the correctness of computations.
The team project will consist of an analysis of a real dataset, to be conducted in teams of 2 to 4 students, using the models and methods introduced in the course. The team projects will be presented at the end of the course, in a seminar during an open workshop that will take place after the end of the semester. Each team will receive an evaluation on a scale from 0 to 30, the maximum being 30L.
The final evaluation of the course will consist of a weighted average of the scores obtained by the student in the two parts of the assessment, with weights 0.6 (individual assessment) and 0.4 (team project).
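As a worked example of the weighted average described above, a minimal sketch in Python follows. The function name and the sample scores are hypothetical. The rounding of the result and the numeric treatment of 30L are not specified in this syllabus, so the sketch leaves the result unrounded and accepts only scores from 0 to 30.

```python
def final_score(individual, project):
    """Weighted average of the two parts, with the weights stated above.

    individual : score of the individual evaluation (0 to 30)
    project    : score of the team project (0 to 30)
    Note: the pass threshold of 18/30 on the individual evaluation is not
    checked here; how it interacts with the final grade is not specified.
    """
    for score in (individual, project):
        if not 0 <= score <= 30:
            raise ValueError("scores must lie between 0 and 30")
    return 0.6 * individual + 0.4 * project


# Hypothetical scores: 27 in the individual evaluation, 30 in the team project.
# 0.6 * 27 + 0.4 * 30 = 16.2 + 12.0, i.e. about 28.2 before any rounding.
print(final_score(27, 30))
```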
During the exam, students will have to demonstrate their knowledge and comprehension of the key aspects of the course, presenting the methodologies used in a clear and exhaustive way, and their ability to apply the notions learned to solve exercises and real problems on any of the topics covered in the course.
Results will be communicated through the official webpage of the course.
MAT/06 - PROBABILITY AND STATISTICS - University credits: 1
SECS-S/01 - STATISTICS - University credits: 5
Practicals: 24 hours
Lectures: 36 hours
Professor:
Ieva Francesca