Statistics and data analysis

A.Y. 2020/2021
Max ECTS: 6
Overall hours: 60
SSD: INF/01
Language: Italian
Learning objectives
The course aims to introduce the fundamentals of descriptive statistics, probability, and parametric inferential statistics.
Expected learning outcomes
Students will be able to carry out basic exploratory analyses and inferences on datasets. They will know the main probability distributions and will be able to understand statistical analyses conducted by others. They will also know simple methods for binary classification and will be able to evaluate their performance. Finally, students will acquire the fundamental skills needed to study more sophisticated techniques for data analysis and data modeling.
Course syllabus and organization

Single session

Responsible
Lesson period: Second semester
Teaching methods: lectures will be delivered via a videoconference system; students will be able to attend them either by live streaming, following the course schedule, or by downloading the recordings later from the course Web page.

Program and reference material: There will be no change.

Assessment methods and criteria: depending on the regulations in force at the time of the exam, examination procedures may be carried out remotely. The evaluation criteria will not change.
Course syllabus
Introduction to Python.
Descriptive statistics:
- Frequencies and cumulative frequencies. Joint and marginal frequencies.
- Indices of centrality, dispersion, correlation, heterogeneity, and concentration.
- Graphical methods: frequency and cumulative frequency plots, scatter plots, and QQ plots.
- Classifiers and ROC curves.
Probability:
- Combinatorics. Basics of set theory.
- Probability axioms.
- Conditional probability and related theorems.
- Discrete and continuous random variables. Centrality and dispersion indices for random variables and their properties.
- Multivariate random variables. Covariance and correlation indices for random variables.
- Independent events and independent random variables.
- Markov and Chebyshev inequalities.
- Bernoulli, binomial, geometric, Poisson, discrete uniform and hypergeometric models.
- Continuous uniform, exponential and Gaussian models.
- Poisson process.
Parametric inferential statistics:
- Population, random sample and point estimates.
- Sample mean. Central limit theorem.
- Sample variance.
- Unbiasedness and consistency in mean square.
- Law of large numbers.
- Computation of the sample size.
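As a rough illustration of the last topics in this list (sample mean, sample variance, and computation of the sample size), a minimal Python sketch follows. It is not taken from the course material: the exponential population, the fixed seed, the helper name required_sample_size, and the assumption of a known standard deviation are all choices made only for this example.

```python
import math
import random
import statistics


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which the normal-approximation (CLT) interval for the
    mean has half-width at most `margin`, assuming sigma is known."""
    z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


rng = random.Random(42)  # fixed seed so that the run is reproducible

# Hypothetical population: exponential waiting times with mean 2.0
# (for an exponential model the standard deviation equals the mean).
true_mean = 2.0
sample = [rng.expovariate(1 / true_mean) for _ in range(1000)]

sample_mean = statistics.fmean(sample)
sample_variance = statistics.variance(sample)  # divides by n - 1 (unbiased)

# Sample size for a 95% interval of half-width 0.1 when sigma = 2.0.
n_needed = required_sample_size(sigma=2.0, margin=0.1)

print(f"sample mean:     {sample_mean:.3f}")
print(f"sample variance: {sample_variance:.3f}")
print(f"required sample size: {n_needed}")
```

The sample size comes from solving z * sigma / sqrt(n) <= margin for n, which relies on the central limit theorem; when sigma is unknown it would have to be replaced by an estimate, which this sketch does not cover.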
Prerequisites for admission
Students must have passed the exam of "Matematica del continuo". The course also requires knowledge of the main topics of computer programming, and having passed the exam of "Matematica del discreto" is strongly recommended.
Teaching methods
Lectures and exercise sessions
Teaching Resources
Suggested textbooks:
- S. Ross, Introduzione alla statistica, Apogeo education, 2014, ISBN 9788838786020
- S. Ross, Probabilità e statistica per l'ingegneria e le scienze, terza edizione, Apogeo education, 2015, ISBN 8891609946

Lecture notes (for topics not covered in the suggested textbooks) and sample code available at the course Web pages:
- https://dmalchiodisad.ariel.ctu.unimi.it/
- http://malchiodi.di.unimi.it/teaching/data-analytics/
Assessment methods and Criteria
The exam consists of a written and an oral test, both relating to the topics covered in the course. The written test takes place in a computer lab and lasts two and a half hours. It is based on open-ended questions and on the analysis of a dataset through the appropriate application of the statistical techniques described in class. It is graded pass/fail, taking into account the level of mastery of the topics and the correct use of mathematical formalism.

The oral test, which is open only to students who have passed the written test, is based on a discussion of the written test answers and on questions concerning topics covered in the course. It is graded on a scale from 0 to 30, taking into account the level of mastery of the topics, clarity, language skills, and the correct use of technical terminology.
INF/01 - INFORMATICS - University credits: 6
Practicals: 36 hours
Lessons: 24 hours
Professor: Malchiodi Dario
Professor(s)
Reception: By appointment (via e-mail)