Advanced Mathematical Statistics

A.Y. 2021/2022
Max ECTS: 9
Overall hours: 78
SSD: MAT/06
Language: Italian
Learning objectives
Basic notions and theorems of multivariate and computational Mathematical Statistics, which the student will then be able to explore further in both theoretical and applied settings. The student will also be able to apply these skills to the statistical analysis of multivariate or high-dimensional data.
Expected learning outcomes
Basic notions and theorems of multivariate and computational Mathematical Statistics.
The student will then be able to apply and broaden their knowledge of these subjects in different areas of interest, in both theoretical and applied contexts, and to perform statistical data analyses in both the multivariate and the big data settings.
Course syllabus and organization

Single session

Lesson period
First semester
More specific information on how teaching activities will be delivered in academic year 2021/22 will be provided over the coming months, depending on how the public health situation evolves.
Course syllabus
The topics planned for the course are listed below. The teachers may cover only a selection of them if time is short.

Part A. Statistical methods for small samples of high dimension (dimensionality reduction)

1. Ridge regression
2. Shrinkage methods to estimate the covariance matrix
3. Penalized regression methods: LASSO
4. Principal Component Analysis (PCA)

Part B. Statistical Methods for the analysis of Big Data
5. Locality Sensitive Hashing (LSH)
6. Finding Similar Items
7. Frequent Itemsets
8. Cluster analysis
9. Techniques for dimensionality reduction
10. Analysis of data streams
11. Analysis of social networks

12. Computer Lab
Data analysis with statistical software (R and R Spark)
Prerequisites for admission
Students should have taken an introductory course in Mathematical Statistics, with particular attention to statistical hypothesis testing and linear regression.
Teaching methods
Lectures and computer labs
Teaching Resources
Wessel N. van Wieringen, Lecture notes on ridge regression, https://arxiv.org/pdf/1509.09169.pdf

I. T. Jolliffe, Principal Component Analysis, 2nd Edition, Springer, 2002

Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, 2014. Online version: http://www.mmds.org/

Teachers' lecture notes
Assessment methods and Criteria
The exam consists of a set of homework assignments given by the teachers during the course, covering both the analysis of multivariate and high-dimensional data and the guided development of methodologies for big data analysis.
The homework is intended for students who follow the course as it is taught, so attendance is strongly recommended.

Non-attending students, and students who reject the grade resulting from the homework, must pass an oral exam on the entire course program.
The exams aim to verify that the learning objectives have been achieved, in terms of knowledge and comprehension and of the students' ability to solve multivariate and big data analysis problems with suitable mathematical and statistical tools.
MAT/06 - PROBABILITY AND STATISTICS - University credits: 9
Laboratories: 36 hours
Lessons: 42 hours
Professor(s)
Reception:
By appointment
Office 2099
Reception:
Appointment by email
Office or online (by videocall)