Advanced Mathematical Statistics
A.Y. 2025/2026
Learning objectives
The main aim of the course is to introduce the modern concepts of multivariate and computational Mathematical Statistics, from both a theoretical and an applied point of view, with particular reference to techniques for Big Data analysis. During the lab activities, students will be trained to perform data analyses with advanced software tools (R and R Spark).
Expected learning outcomes
Basic notions and theorems of Multivariate Mathematical and Computational Statistics.
The student will then be able to apply and broaden his/her knowledge of these subjects in different areas of interest, in both theoretical and applied contexts, and to perform statistical data analyses in both the multivariate and the big data cases.
Lesson period: First semester
Assessment methods: Exam
Assessment result: grade recorded on a thirty-point scale
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Course syllabus
The chapters expected to be covered are listed below. The teachers may make a selection if time is short.
Part A. Statistical methods for small samples of high dimension (dimensionality reduction)
1. Ridge regression
2. Shrinkage methods to estimate the covariance matrix
3. Penalized regression methods (LASSO)
4. Principal Components Analysis (PCA)
Part B. Statistical Methods for the analysis of Big Data
5. Locality Sensitive Hashing (LSH)
6. Finding Similar Items
7. Frequent Itemsets
8. Cluster analysis
9. Techniques for dimensionality reduction
10. Analysis of data streams
11. Analysis of social networks
12. Computer Lab
Data analysis with statistical software (Python and Spark)
Prerequisites for admission
Students should have taken an introductory course in Mathematical Statistics, with particular reference to statistical hypothesis testing and linear regression.
Teaching methods
Classroom lectures and computer labs
Teaching Resources
Wessel N. van Wieringen, Lecture notes on ridge regression, https://arxiv.org/pdf/1509.09169.pdf
I. T. Jolliffe, Principal Component Analysis, 2nd Edition, Springer, 2002
Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, 2014. Online version: http://www.mmds.org/
Lecture notes of the teachers
Assessment methods and Criteria
The exam consists of a set of homework assignments given by the teachers during the course, covering both multivariate and high-dimensional data analysis and the guided development of methodologies for big data analysis.
The homework assignments are intended for students who follow the course in real time, so attendance is highly recommended.
Non-attending students, and students who reject the grade resulting from the homework assignments, must pass an oral exam on the entire course program.
The exams aim to verify that students have achieved the objectives in terms of knowledge and comprehension, and that they can solve problems of multivariate and big data analysis with suitable mathematical-statistical tools.
MAT/06 - PROBABILITY AND STATISTICS - University credits: 9
Laboratories: 36 hours
Lessons: 42 hours
Professors:
Aletti Giacomo, Micheletti Alessandra
Professor(s)
Reception:
by appointment
office 2099