Course objectives:
· To introduce students to the expanding world of big data analysis.
· To introduce students to the basic concepts, techniques and applications of computational statistics and data mining for use in finance and economics.
· To develop skills in using the R software to solve practical problems.
· To develop the skills needed for independent study and research.
Main topics:
(i) Introduction to data mining and statistical learning.
(ii) Exploratory data analysis and visualization.
(iii) Supervised vs. unsupervised methods: introduction.
(iv) Quick review of maximum likelihood methods.
(v) Multiple linear regression.
(vi) Classification methods: logistic regression, linear discriminant analysis and the K-nearest neighbors method.
(vii) Resampling methods: cross-validation and the bootstrap.
(viii) Shrinkage methods: ridge regression and the lasso. Principal component regression.
(ix) Regression splines and local regression.
(x) Tree-based methods: random forests, bagging and boosting.
(xi) Support vector machines.
(xii) Unsupervised learning: PCA and clustering methods.
(xiii) Introduction to Bayesian methods in data mining.
Further topics:
(i) Computer-intensive statistical methods: overview.
(ii) Pseudo-random number and variable generation.
(iii) Monte Carlo methods for numerical integration.
(iv) Simulation-based inference.
(v) MCMC methods: overview.
(vi) MCMC methods: Metropolis-Hastings and Gibbs sampling.