Mathematical Statistics

A.Y. 2019/2020
Max ECTS: 9
Overall hours: 78
SSD: MAT/06
Language: Italian
Learning objectives
The main aim of the course is to introduce the modern concepts of multivariate and computational Mathematical Statistics, from both a theoretical and an applied point of view, with particular reference to techniques for Big Data analysis. During the lab activities, students will be trained to carry out data analyses with advanced software tools (R and R Spark).
Expected learning outcomes
Students will acquire the basic notions and theorems of multivariate and computational Mathematical Statistics.
They will then be able to apply and broaden their knowledge of these subjects in different areas of interest, in both theoretical and applied contexts, and to perform statistical data analyses in both the multivariate and the big data setting.
Single course

This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.

Course syllabus and organization

Single session

Responsible
Lesson period: Second semester
Course syllabus
The chapters planned for the course are listed below. The teachers may cover only a selection of them if time is short.

1. Random vectors
2. The Multivariate Normal Distribution
2.1. Definition and properties of the multivariate normal distribution
2.2. Test for the normality of a random vector
2.3. Detection of outliers
3. Main multivariate distributions originating from the Normal
3.1. Wishart distribution
3.2. Hotelling T² distribution
3.3. Wilks' Lambda distribution
4. Multivariate Hypothesis Tests
4.1. Test on one or two mean vectors
4.2. Multivariate Analysis of Variance (MANOVA)
4.3. Test on covariance matrices

Statistical Methods for the analysis of Big Data
5. Locality Sensitive Hashing (LSH)
6. Finding Similar Items
7. Frequent Itemsets
8. Cluster analysis
9. Techniques for dimensionality reduction
10. Analysis of data streams
11. Analysis of social networks

12. Computer Lab
Multivariate and big data analysis with statistical software (R and R Spark)
Prerequisites for admission
Students should have taken an introductory course in Mathematical Statistics, with particular reference to statistical hypothesis testing and linear regression.
Teaching methods
Lectures, exercise sessions and computer lab
Teaching Resources
A.C. Rencher, Multivariate Statistical Inference and Applications, Wiley, 1998

K.V. Mardia, J.T. Kent, J.M. Bibby, Multivariate Analysis, Academic Press, 1979

Jure Leskovec, Anand Rajaraman, Jeff Ullman, Mining of Massive Datasets, Cambridge University Press, 2014. Online version: http://www.mmds.org/

Lecture notes of the teachers
Assessment methods and Criteria
The final examination consists of two parts: a written exam and a lab exam.

- In the written exam, the student must solve exercises posed as open-ended and/or short-answer questions, which assess the student's ability to solve problems in Multivariate Statistics. The duration of the written exam is proportional to the number of exercises assigned and takes into account their nature and complexity; in any case, it will not exceed three hours. The outcomes of the written exam will be available in the SIFA service through the UNIMIA portal and on the ARIEL website of the course.

- The lab exam consists of short reports and programs addressing problems or exercises assigned by the professors during the lectures. The reports are evaluated during the course, so regular attendance at the lectures is required. The lab part of the examination assesses the student's ability to put a multivariate and/or big data problem into context, find a solution, and report on the results obtained.

The final examination is passed only if both parts (written and lab) are passed. The final mark is expressed on a 0-30 scale and is computed as the weighted mean of the grades of the two parts (6 CFU for the written part, 3 CFU for the lab part). It is communicated immediately after the written examination has been marked.
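As an illustration of the weighted mean described above, here is a minimal Python sketch. The function name `final_mark` and the sample grades are hypothetical, and the page does not state a rounding rule, so none is applied here.

```python
# Credit weights stated on the course page: 6 CFU written, 3 CFU lab.
WRITTEN_CFU = 6
LAB_CFU = 3


def final_mark(written: float, lab: float) -> float:
    """Return the credit-weighted mean of the two part grades (0-30 scale).

    Assumption: both grades are already passing grades; no rounding is
    applied because the course page does not specify a rounding rule.
    """
    total_cfu = WRITTEN_CFU + LAB_CFU
    return (WRITTEN_CFU * written + LAB_CFU * lab) / total_cfu


# Hypothetical example: 27 in the written part, 30 in the lab part.
print(final_mark(27, 30))  # (6*27 + 3*30) / 9 = 28.0
```

Since the written part carries twice the credits of the lab part, it counts for two thirds of the final mark.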
MAT/06 - PROBABILITY AND STATISTICS - University credits: 9
Laboratories: 36 hours
Lessons: 42 hours
Shifts:
Professor(s)
Reception:
By appointment
Office 2099
Reception:
By appointment via email
Office or online (by video call)