Advanced Mathematical Statistics
A.Y. 2020/2021
Learning objectives
Basic notions and theorems of multivariate and computational Mathematical Statistics, which the student will then be able to explore further in both theoretical and applied settings. The student will also be able to apply these skills to the statistical analysis of multivariate or large-scale data.
Expected learning outcomes
Basic notions and theorems of Multivariate Mathematical and Computational Statistics.
The student will be able to apply and extend this knowledge in different areas of interest, in both theoretical and applied contexts, and to carry out statistical analyses of both multivariate data and big data.
Lesson period: First semester
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Teaching methods
The lessons will be held on the Microsoft Teams platform and can be followed either synchronously, according to the first-semester timetable, or asynchronously, since they will be recorded and uploaded to the same platform. Other reference materials and course news will also be posted on the course website on Ariel. Students will have to install R, RStudio and Spark on their personal computers, following the teachers' instructions. Some review lectures will be recorded and offered asynchronously.
Course syllabus, Teaching Resources and exams
The contents, the reference material and the exam format will not change.
Course syllabus
The chapters expected to be covered are listed below. The teachers may cover only a selection of them if time is short.
1. Random vectors
2. The Multivariate Normal Distribution
2.1. Definition and properties of the multivariate normal distribution
2.2. Test for the normality of a random vector
2.3. Detection of outliers
3. Main multivariate distributions originating from the Normal
3.1. Wishart distribution
3.2. Hotelling's T² distribution
3.3. Wilks' Lambda distribution
4. Multivariate Hypothesis Tests
4.1. Test on one or two mean vectors
4.2. Multivariate Analysis of Variance (MANOVA)
4.3. Test on covariance matrices
Statistical Methods for the analysis of Big Data
5. Locality Sensitive Hashing (LSH)
6. Finding Similar Items
7. Frequent Itemsets
8. Cluster analysis
9. Techniques for dimensionality reduction
10. Analysis of data streams
11. Analysis of social networks
12. Computer Lab
Multivariate and big data analysis with statistical software (R and R Spark)
Prerequisites for admission
Students should have taken an introductory course in Mathematical Statistics, with particular reference to statistical hypothesis testing and linear regression.
Teaching methods
Lectures and computer labs
Teaching Resources
A.C. Rencher, Multivariate Statistical Inference and Applications, Wiley, 1998
K.V. Mardia, J.T. Kent, J.M. Bibby, Multivariate Analysis, Academic Press, 1979
Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, 2014. Online version: http://www.mmds.org/
Lecture notes of the teachers
Assessment methods and Criteria
The exam consists of a set of homework assignments given by the teachers during the course, covering both multivariate data analysis and the guided development of methodologies for big data analysis.
The homework assignments are intended for students who follow the course in real time, so attendance is highly recommended.
Non-attending students, and students who reject the grade resulting from the homework, will have to pass an oral exam on the entire course program.
The aim of the exams is to ascertain that the objectives have been achieved in terms of knowledge and comprehension, and to assess the students' ability to solve problems of multivariate and big data analysis with suitable mathematical and statistical tools.
MAT/06 - PROBABILITY AND STATISTICS - University credits: 9
Laboratories: 36 hours
Lessons: 42 hours
Professors:
Aletti Giacomo, Micheletti Alessandra
Professor(s)
Reception:
by appointment
office 2099