Multivariate Analysis for Social Scientists
A.Y. 2022/2023
Learning objectives
The objective of the course is to introduce students to the logic of quantitative statistical reasoning in social and political sciences. Students will be presented with the main statistical techniques for data analysis used in social sciences, so that they will be able to independently assess and conduct quantitative research.
By the end of the course, students will be equipped with the necessary skills to:
- understand the assumptions underlying the main statistical techniques used in the analysis of social and political phenomena;
- perform descriptive and inferential multivariate analyses in R;
- develop a research design for answering a research question with empirical data, applying the statistical techniques learned in this course.
The program is designed to be flexible and can be adjusted if needed, based on the starting level of the students.
In the first part of the course students will be introduced to key concepts and processes involved in designing a research project in social sciences, including formulating a research question and developing hypotheses, and selecting an appropriate data set. This will be followed by a review of univariate and bivariate analyses, OLS regression and its assumptions, and how to deal with deviations from such assumptions. The students will then be introduced to other analytical tools for quantitative analysis, such as non-linear regression functions, logit and probit regressions, and limited dependent variable models. More advanced topics such as time-series and panel-data analyses can be covered upon collective agreement. Students will also learn how to use the statistical software R to organize and analyze data. Lectures are coordinated with computer lab instruction in data analysis.
In the second part of the course, students will be required to develop a research design using a real political or social science dataset, and to apply the concepts they have learnt to address a research question. Great emphasis will be placed on formulating hypotheses and on using data to test them. Lectures will be based on hands-on material and will provide interactive learning experiences.
Expected learning outcomes
The course will prepare students to:
- perform descriptive and inferential multivariate analyses using R, including multiple regression models, non-linear regression functions, limited dependent variables, and time-series analysis;
- understand and discuss the underlying assumptions of common statistical techniques in social and political science;
- develop a research design to study social and political phenomena, including defining an original research question, creating a research design, formulating hypotheses, selecting a suitable dataset, conducting statistical analysis, interpreting results, and discussing limitations;
- present their research findings through a public presentation.
Lesson period: Third trimester
Assessment methods: Exam
Assessment result: grade recorded on a 30-point scale
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Single session
Course syllabus
The first Module of the course (30 hrs), taught by Prof. Pagano, starts by introducing the students to the core elements of creating a robust research design in the field of social and political sciences. Students will then receive a comprehensive refresher of their knowledge of univariate and bivariate analyses, as well as of the Ordinary Least Squares (OLS) regression model and its underlying assumptions. This will serve as a foundation for exploring more complex multivariate analysis techniques in the subsequent lessons. The Module will then proceed to cover non-linear regression functions, logit and probit regressions, and limited dependent variable models, with more advanced techniques for quantitative analysis being covered upon collective agreement (such as time-series and panel-data analyses). Lectures are coordinated with computer lab instruction in data analysis, in which students will learn how to use the statistical software R to organize and analyze data. At the end of the Module, an intermediate test will assess students' understanding of the material covered in the first part of the course.
The second Module (10 hrs), taught by Prof. Decadri, is entirely dedicated to helping students prepare their final Group Research Project. Prof. Decadri will assign each group a general research topic, in the form of an already formulated research question, and a suitable dataset to address it. Based on these preliminary instructions, each group will have to:
1. formulate testable hypotheses;
2. select, among the statistical models covered in the course, the most suitable one to test these hypotheses (i.e., descriptive statistics, tabulations, bivariate and multivariate OLS regressions, and model specifications for dummy, categorical or ordinal dependent variables);
3. run the descriptive and inferential statistical analyses;
4. write a short research note (4000 words maximum, including title, abstract, keywords, tables, figures and references).
Course Outline
Module 1 - Prof. Pagano (30 hours)
- Introduction to creating a robust research design in social and political sciences
- Refresher on univariate and bivariate analyses, and on the Ordinary Least Squares (OLS) regression model and its assumptions
- Non-linear regression functions: quadratic and interaction models
- Logit and probit regressions, and limited dependent variable models
- Advanced techniques for quantitative analysis (such as time-series and panel-data analyses), covered upon collective agreement
Lectures will be coordinated with computer lab instruction in data analysis using the statistical software R.
Module 2 - Prof. Decadri (10 hours)
- Research design sessions dedicated to helping students prepare their final Group Research Project
- Project Colloquia
Prerequisites for admission
Familiarity with the concepts and topics covered in the Data Analysis course is required, as well as a basic knowledge of the R programming language. The mathematical requirements for the class are minimal: only a reasonable knowledge of algebra is assumed.
Teaching methods
Lectures, hands-on sessions in R, and teamwork.
Teaching Resources
The program is the same for attending and non-attending students.
Readings and textbooks:
1. Paul M. Kellstedt and Guy D. Whitten (2009-2013). The Fundamentals of Political Science Research (2nd edition). Cambridge University Press, Chapters 1, 2, 3, 4, 5, 6, 12; Chapters 7-11 (read only).
2. Thomas Brambor, William Roberts Clark, and Matt Golder (2006). Understanding Interaction Models: Improving Empirical Analyses. Political Analysis, 14(1): 63-82.
3. John Fox and Sanford Weisberg (2019). An R Companion to Applied Regression (3rd edition). SAGE Publications, Chapters 4, 5, 6, 8, 9.
Assessment methods and Criteria
Attendance is not compulsory, but it is strongly recommended.
Students attending the classes will be evaluated based on: (1) an intermediate individual exam at the conclusion of Module 1, and (2) a final Group Research Project.
- The intermediate exam (1) will consist of a combination of multiple-choice questions and hands-on exercises. Students will be required to perform descriptive and inferential analyses using R on a provided dataset. The completed questions and the corresponding R script must be submitted to Prof. Pagano via email. Prof. Pagano will acknowledge receipt of each student's submission. These exam results will contribute 50% towards the final grade and will be communicated to each student individually via email.
- For the final Group Research Project (2), at the end of Module 2, students will present their research design and preliminary results to their peers and Prof. Decadri during the Project Colloquia. The final research note in .pdf format, together with the R dataset and script needed to replicate its content, must be sent via email to Prof. Decadri by midnight on July 7. This component accounts for 50% of the final grade; grades will be communicated to each student individually via email.
Students who are unable to attend the classes will be evaluated through a written exam. The exam will consist of a combination of multiple-choice and open-ended questions, as well as descriptive and inferential analyses using R on a dataset provided by the instructors. This exam will determine their final grade.
SPS/04 - POLITICAL SCIENCE - University credits: 6
Lessons: 40 hours
Professors:
Decadri Silvia, Pagano Giovanni