Methods and Languages for Data Management
A.Y. 2020/2021
Learning objectives
By participating fully in the F8x-160 course, students should be able to explain what research data is, perform canonical statistical analyses (using the R language), plan and carry out a data acquisition project based on remote data access (Perl programming language), and make inferences on the acquired data using machine learning techniques (Python programming language).
Expected learning outcomes
- Ability to perform statistical analyses using the R language for statistical computing
- Ability to write custom scripts and programs in R
- Ability to write custom scripts and programs in Perl
- Ability to write Perl programs for remote database access and manipulation
- Ability to write custom scripts and programs in Python
- Ability to perform machine learning experiments using Python
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded on a 30-point scale
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Responsible
Lesson period
Second semester
Course syllabus
Part 1 - Descriptive analysis methods: frequency tables, charts, position indices, dispersion indices, heterogeneity indices. Some case studies using SPSS.
Part 2 - Statistical inference methods: estimators of the mean and of the variance, confidence intervals, hypothesis tests. Some case studies using SPSS.
Part 3 - Introduction to machine learning: supervised and unsupervised methods; classification methods (clustering, decision trees, dendrograms, logistic regression); prediction methods (linear regression, support vector machines, neural networks); important issues in machine learning (overfitting, nonlinearity, dimensionality reduction, unbalanced data); measures of performance (accuracy, confusion table, specificity, sensitivity).
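As an illustration of the performance measures listed in Part 3, the following minimal Python sketch computes accuracy, sensitivity, and specificity from the four cells of a binary confusion table. The function name `performance_measures` and the example counts are hypothetical, chosen only for illustration; they are not course material.

```python
def performance_measures(tp, fp, tn, fn):
    """Compute basic measures from a binary confusion table.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (illustrative inputs).
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all cases classified correctly
        "accuracy": (tp + tn) / total,
        # Fraction of actual positives that were detected
        "sensitivity": tp / (tp + fn),
        # Fraction of actual negatives that were correctly rejected
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts, used only to show the call
measures = performance_measures(tp=40, fp=10, tn=45, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With unbalanced data, accuracy alone can be misleading, which is why sensitivity and specificity are usually reported alongside it.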
Prerequisites for admission
none
Teaching methods
General computer science, probabilistic and statistical methods
Assessment methods and Criteria
The final exam consists of a practical part (the analysis of a simple case study using data analysis software) followed by an oral part on topics covered in class.
Professor(s)
Reception:
Wednesday 10:30-12:30 -- by appointment
via Celoria 18, 5th floor