Data Mining and Computational Statistics

A.Y. 2015/2016
Lesson for
Max ECTS: 9
Overall hours: 80
Language: English
Learning objectives
Course objectives are:
· To introduce students to the expanding world of big data analysis.
· To introduce students to basic concepts, techniques and applications of computational statistics & data mining to be used in finance and economics.
· To develop skills in using the R software to solve practical problems.
· To build the skills needed for independent study and research.

Course structure and Syllabus

Active edition: Yes
Responsible
Practicals: 40 hours
Lessons: 40 hours
Professors: Andreis Federico, Manzi Giancarlo
Syllabus
Main topics:
(i) Introduction to data mining and statistical learning.
(ii) Exploratory data analysis and visualization.
(iii) Supervised vs. unsupervised methods: introduction.
(iv) Quick review of maximum likelihood methods.
(v) Multiple linear regression.
(vi) Classification methods: logistic regression, linear discriminant analysis and the K-nearest neighbors method.
(vii) Resampling methods: cross-validation and the bootstrap.
(viii) Shrinkage methods: ridge regression and the lasso. Principal component regression.
(ix) Regression splines and local regression.
(x) Tree-based methods: random forest, bagging and boosting.
(xi) Support vector machines.
(xii) Unsupervised learning: PCA and clustering methods.
(xiii) Introduction to Bayesian methods in data mining.
Further topics:
(i) Computer-intensive statistical methods: overview.
(ii) Pseudo-random number and variable generation.
(iii) Monte Carlo methods for numerical integration.
(iv) Simulation-based inference.
(v) MCMC methods: overview.
(vi) MCMC methods: Metropolis-Hastings and Gibbs sampling.
Lesson period
Second trimester
Assessment methods
Exam
Assessment result
Mark recorded on a 30-point scale
Professor(s)
Reception:
Wed 4.30PM-7.30PM.
Room 37, 3rd Floor.