Data Analysis

A.Y. 2020/2021
Max ECTS: 9
Overall hours: 60
SSD: SPS/07
Language: English
Learning objectives
The objective of the course is to give students a solid foundation in applied statistical methodology for the social sciences. By the end of the course, students will master the basic toolkit of quantitative research from both a theoretical and a practical standpoint.
Expected learning outcomes
Reach proficiency in various types of univariate and bivariate analyses. Understand what it means to make inference in the social sciences and how to do it in different circumstances. Become competent in hypothesis testing with different types of variables. Be able to produce basic statistical analyses of quantitative data independently using Stata. Achieve basic competences in the understanding and production of time series analyses.
Course syllabus and organization

Single session

Lesson period
First trimester
During the health emergency, the course undergoes the following changes:

Teaching methods:
About two thirds of the course will take place in the classroom (booking a seat through the app is compulsory). Students who do not book a seat will be able to follow the class remotely. The remaining third of the course will be held remotely through asynchronous classes, in which students will develop the same skills they would have acquired in person.

The course calendar and all details of the activities will be published in the online course before the beginning of the classes. All updates will be published on the website of the course. Students are required to check their institutional e-mail account often (@studenti.unimi.it).

The procedures and criteria for attending classes in person, which must be booked using the app, will be published in advance on the online course.
Students are considered attending even if they do not take part in the in-person classes, provided they follow the classes via streaming and submit the exercises and the take-home lab sessions.

Course material:
Students who attend the course must refer to the mandatory textbook and to the materials, such as exercises and lab sessions, published in the online course.
Course syllabus
The Data Analysis Module (40 hours) aims at providing students with a solid foundation in applied statistical methodology. Students who attend and successfully complete the course will master the basic toolkit of quantitative research (i.e. cases, types of variables, datasets, hypothesis testing). They will understand why sampling is used, how the different sampling methods work and how to make predictions (inference) in the social sciences, and they will be proficient with the main tools for univariate and bivariate analyses. Students will also receive basic training in the use of the statistical software Stata and, by the end of the course, will be able to produce basic statistical analyses of quantitative data independently.
The topics covered are: Introduction, variables and samples; Descriptive statistics; Introduction to Stata, setting up the workspace, descriptive statistics; Probabilities and distributions; Generating and modifying variables in Stata; Inference and estimation; Significance tests; Point and interval estimation with Stata; Comparing two groups and associations between categorical variables; Cross-tabulation in Stata; Linear regression and correlation; ANOVA; Linear regression and ANOVA in Stata; Introduction to logistic regression and to multivariate relationships; Setting up and executing a quantitative research analysis in Stata.

The introductory module on Time Series Analysis (20 hours) will provide the methodological basis for time series analysis and technical tools for the descriptive analysis, decomposition and forecasting of time series using Excel and Stata.
Prerequisites for admission
No previous background in statistics is required to take this course.
Teaching methods
The Data Analysis Module (40 hours) includes both lectures and lab sessions. Students are given in-class and take-home assignments and are asked to work in groups and/or individually. Lab sessions include individual exercises with the software Stata.

The Time Series Analysis module (20 hours) includes both lectures and lab sessions. Students are given in-class and take-home assignments and are asked to work individually and/or in groups. Lab sessions include individual exercises with Excel and the software Stata.
Teaching Resources
For Data Analysis (40 hours):
Alan Agresti and Barbara Finlay (2014), Statistical Methods for the Social Sciences. Pearson, 4th Edition
Chapters: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15.

For Stata: syntax will be provided by the professor on Ariel.
Useful (not mandatory) textbooks for learning how to use Stata on your own:
Ulrich Kohler & Frauke Kreuter (2012). Data Analysis Using Stata. Stata Press, 3rd Edition
Alan Acock (2014). A Gentle Introduction to Stata. Stata Press. 4th Edition

For the introductory module on Time Series Analysis (20 hours):
Michael Barrow (2017). Statistics for Economics, Accounting and Business Studies. Pearson, 7th Edition
Chapters: 1, 10 and 11.
Further materials will be provided by the professor on Ariel.
Assessment methods and Criteria
For the Data Analysis Module (40 hours), attendance is mandatory and will be checked using attendance sheets. Students are expected to participate in at least 80% of the classes. They will be evaluated on their participation in class and on completing and uploading homework on the course website (Ariel). The final exam for attending students includes multiple-choice questions and written exercises (similar to those assigned as homework). Attending students will also take a test on the use of the software Stata. Non-attending students will take a comprehensive final exam on all the material assigned in the textbook. The exam will be held remotely or in person, according to the indications received from the University.

For the introductory module on Time Series Analysis (20 hours), attending students will have to answer two open questions. They will also have a practical session with exercises to complete in Excel and Stata. Non-attending students will take a comprehensive final exam on all the material assigned in the textbook and provided on Ariel. They will have to complete an exercise using Excel and answer two open questions. The exam will be held remotely or in person, according to the indications received from the University.
SPS/07 - GENERAL SOCIOLOGY - University credits: 9
Lessons: 60 hours