Statistics for Big Data for Economics and Business

A.Y. 2023/2024
Max ECTS: 6
Overall hours: 40
SSD: SECS-S/03
Language: Italian
Learning objectives
This course introduces and illustrates statistical, IT and machine learning methodologies for the analysis of Big Data in economic, business and financial applications. The course focuses mainly on the Python programming language, which is among the most widely used in Big Data applications, with some parts devoted to the R language and to more traditional languages such as Java. On the statistical side, it covers supervised and unsupervised statistical learning, with some reference to Bayesian statistics.
Expected learning outcomes
At the end of the course, students will have acquired the statistical and programming skills needed to master the tools for analysing Big Data and extracting information of interest in the economic, business and financial fields.
Single course

This course can be attended as a single course.

Course syllabus and organization

Single session

Responsible
Lesson period: Third trimester
Course syllabus
FIRST PART: Statistical models
1.1 Advanced cluster analysis
1.2 Principal component analysis and introduction to other dimensionality reduction methods
1.3 Decision trees
1.4 The bootstrap
1.5 Random forest
SECOND PART: Introduction to programming and data management
2.1 Introduction to programming in R and Python for statistical and economic applications
2.2 Introduction to cloud computing
2.3 Introduction to web scraping
2.4 Introduction to relational and non-relational databases
2.5 Introduction to the SQL language
Prerequisites for admission
Knowledge of basic statistical and mathematical techniques is required. Some programming experience is useful but not essential.
Teaching methods
Classes involve the active participation of students, especially in the programming part. Students will often be invited to follow along on their own laptops as the teacher works through computer programs in the classroom, using a "what-if" approach. They will also work in groups to share and increase the effectiveness of their active learning.
Teaching Resources
James, Witten, Hastie, Tibshirani (2013). An Introduction to Statistical Learning. Springer.
Wiktorski (2019). Data-intensive Systems. Springer.
Crawley (2012). The R Book. Wiley.
Sosinsky (2010). Cloud Computing Bible. Wiley.
Raschka, Mirjalili (2013). Python Machine Learning.
Atzeni et al. (2018). Basi di dati. McGraw-Hill.
Assessment methods and Criteria
The exam consists of a test with multiple-choice questions. During the course, some assignments will be set (some to be completed in the classroom, others to be handed in shortly afterwards), and these will contribute to the final score.
SECS-S/03 - ECONOMIC STATISTICS - University credits: 6
Lessons: 40 hours
Professor: Cappozzo Andrea
Professor(s)
Reception: Friday, 10 AM-12 PM
Office no. 29, via Conservatorio, third floor