# Labour statistics

A.Y. 2020/2021

Learning objectives

Undefined

Expected learning outcomes

Undefined

**Lesson period:** Second semester
(In case of multiple editions, please check the period, as it may vary)

**Assessment methods:** Exam

**Assessment result:** grade recorded on a 30-point scale

Course syllabus and organization

### Single session

Lessons will usually be held synchronously on Microsoft Teams, following the trimester's timetable, and will be recorded and made available to students on the same platform. If a synchronous lesson is not possible, the lesson will be recorded and uploaded for asynchronous viewing.

The program and the reference material will not change.

The exam will still be written, but it will be taken remotely via the Exam.net and Microsoft Teams platforms, with the same structure as the in-person exam.

**Course syllabus**

Descriptive statistics

1) Classification of statistical phenomena (types of variables and scales of measurement) and frequency distributions (absolute, relative and cumulative frequencies).

2) Graphical representations: bar graph, stick graph, histogram.

3) Calculation of the mode, the median and the sample mean when the data are classified in a frequency table. Theorems and properties of the mean.

4) Some indices of variability and dispersion: range, interquartile range, variance and standard deviation. The coefficient of variation.

5) Contingency tables and bivariate analysis: definition of joint (absolute and relative), marginal and conditional frequency distributions; the Pearson index for independence; dependence in mean; covariance and the linear correlation coefficient.
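
As an illustration of points 1, 3 and 4, the following minimal sketch computes the main summary indices from a frequency table. It is written in Python purely for illustration (the exam itself only requires a calculator); the data are invented, and the variance divides by n, which is one common convention (the sample variance used in inference divides by n - 1).

```python
# Hypothetical frequency table: observed value -> absolute frequency.
# The numbers are invented for illustration only.
freq = {0: 4, 1: 7, 2: 5, 3: 3, 4: 1}

n = sum(freq.values())                       # total number of observations
rel = {x: f / n for x, f in freq.items()}    # relative frequencies

# Cumulative absolute frequencies, in increasing order of the values.
cum, running = {}, 0
for x in sorted(freq):
    running += freq[x]
    cum[x] = running

mode = max(freq, key=freq.get)               # value with the highest frequency
mean = sum(x * f for x, f in freq.items()) / n

# Median: smallest value whose cumulative frequency reaches n/2.
# (If the cumulative frequency equals n/2 exactly, many textbooks
# average this value with the next one; that case is not handled here.)
median = next(x for x in sorted(freq) if cum[x] >= n / 2)

# Variance with divisor n, standard deviation, coefficient of variation.
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n
std_dev = variance ** 0.5
cv = std_dev / mean

print(mode, median, mean, variance, round(cv, 3))
```

Working from the frequencies rather than the raw data mirrors the hand calculation: each value is weighted by how often it occurs.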

Probability and random variables

1) Introduction to probability theory: classical, frequentist, subjective and axiomatic definitions of probability; elementary, compound and disjoint events; stochastic independence; Bayes' theorem; law of total probability; types of sampling (draws with and without replacement).

2) Definition of discrete and continuous random variables: probability distribution, probability density, distribution function; expected value (or mean), mode, median, variance of a random variable. Definition of independence between random variables.

3) Central limit theorem and law of large numbers.

4) Bernoulli random variable, Normal random variable and Binomial random variable; Normal approximation to Binomial distribution.
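
The Normal approximation to the Binomial distribution in point 4 can be checked numerically. The sketch below, in Python for illustration only, compares the exact Binomial probability P(X <= k) with its Normal approximation; the values of n, p and k are hypothetical, and the continuity correction (k + 0.5) is a common refinement that the course may or may not use.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical example: 100 trials, success probability 0.5.
n, p, k = 100, 0.5, 55

exact = binom_cdf(k, n, p)

# Normal with the same mean n*p and variance n*p*(1-p),
# evaluated at k + 0.5 (continuity correction).
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = normal.cdf(k + 0.5)

print(f"exact = {exact:.4f}, Normal approximation = {approx:.4f}")
```

The two values agree closely here because n is large and p is not near 0 or 1; the approximation degrades for small n or extreme p.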

Inferential statistics

1) Point estimation: definition of unbiased estimator; the standard error as a measure of an estimator's accuracy. The sample mean and variance; the sample proportion.

2) Confidence intervals for a mean (with Normal observations and known or unknown variance). Confidence intervals for a proportion.

3) General definition of statistical hypothesis testing: null and alternative hypotheses; type 1 and type 2 errors; rejection region; p-value. Hypothesis testing for a mean, with Normal observations and known or unknown variance; the t-test for the comparison between 2 means; the ANOVA test for comparison among multiple means.

4) Hypothesis testing for a proportion. Chi-square test for comparison among multiple proportions and to verify the independence between two variables.
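
As a worked example of the confidence interval and hypothesis test for a proportion, the following sketch uses the large-sample Normal approximation. It is written in Python for illustration only; the sample counts and the null value `p0` are hypothetical, and the formulas are the standard textbook ones, which may differ in detail from those in the course notes.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()          # standard Normal distribution

# Hypothetical sample: 120 successes out of 400 observations.
successes, n = 120, 400
p_hat = successes / n              # sample proportion (point estimate)

# 95% confidence interval: p_hat +/- z * standard error.
z = std_normal.inv_cdf(0.975)
se = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z * se, p_hat + z * se)

# Two-sided test of H0: p = p0 against H1: p != p0.
# Under H0 the standard error is computed with p0, not p_hat.
p0 = 0.25
z_stat = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"z = {z_stat:.4f}, p-value = {p_value:.4f}")
```

With these invented numbers the p-value falls below 0.05, so H0 would be rejected at the 5% level; this is consistent with `p0` lying outside the 95% interval.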

Simple linear regression

1) Presentation of the statistical package R: how to install it; basic commands.

2) Definition of the linear regression model; estimation of the parameters (slope and intercept coefficients) with the least squares method; goodness of fit and coefficient of determination; confidence intervals for the coefficients of the linear regression model; hypothesis testing on the intercept and slope coefficients.

3) The use of R for the statistical analyses described in point 2. Interpretation of the output.
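
For orientation, the quantities named in point 2 have the following standard textbook form. The notation is an assumption introduced here and may differ from that of the course material: $x_i$ and $y_i$ are the observed pairs, $\bar{x}$ and $\bar{y}$ their sample means, and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ the fitted values.

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n
$$

$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
$$

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

The slope estimate is the sample covariance of $x$ and $y$ divided by the sample variance of $x$, which links the regression topics to the covariance and linear correlation coefficient covered under descriptive statistics.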

**Prerequisites for admission**

To successfully attend the Statistics course, it is necessary to have acquired the basic notions of mathematics.

**Teaching methods**

For the theoretical part, the teacher explains at the blackboard, largely without slides; this makes the lessons more interactive and allows them to be adapted to the needs of the class. Students who cannot attend can find everything in the reference material (textbook and lecture notes on ARIEL).

After each new concept is introduced, several numerical examples are presented so that students can fully understand its meaning and practise the calculations.

In addition to the theoretical lessons, classroom exercises are also carried out. The exercises done in class are available on the course web page (ARIEL) to help non-attending students.

For the use of the R software in linear regression, the teacher presents slides with the instructions to be typed for each kind of analysis. During the lecture, some examples of regression analysis are carried out, and students are invited to bring a laptop (if they have one) so that they can practise along with the teacher. In any case, all the instructions presented in the lecture are made available on ARIEL so that students can reproduce them at home on their own PC.

Comments and requests for clarification from students during the lessons and exercises are always welcome, because they make the lessons livelier and more useful for everyone.

**Teaching Resources**

I) Descriptive statistics: two lecture notes available on the ARIEL page of the course: http://ctommasis.ariel.ctu.unimi.it/v5 (under the headings: lezioni-statistica descrittiva)

II) Probability and random variables: Introduction to the statistical inference. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTERS: 1-2.

III) Inferential statistics: Introduction to the statistical inference. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTERS: 3-4-5

and the following supplementary notes:

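1) "la stima puntuale" (point estimation)

2) "confronto tra due o più medie (ANOVA)" (comparison between two or more means, ANOVA)

3) "Il test del chi-quadrato per l'indipendenza e per il confronto tra più proporzioni. Il test Z per il confronto tra due proporzioni" (the chi-square test for independence and for the comparison among multiple proportions; the Z test for the comparison between two proportions)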

which are available on the ARIEL page of the course: http://ctommasis.ariel.ctu.unimi.it/v5 (under the headings: contenuti - lezioni).

IV) Simple linear regression: Introduction to the statistical inference. Authors: Ferrari, Nicolini and Tommasi, Giappichelli Editore - Turin (2009) - CHAPTER 6.

The material on the use of the R software is available on the course web page on ARIEL: https://ctommasis.ariel.ctu.unimi.it/v5

**Assessment methods and Criteria**

The exam consists of a written test lasting approximately one hour and a half.

It consists of 3 exercises and 6 multiple choice questions (rated from 0 to 30 points), plus an additional, more theoretical exercise (rated from 0 to 2 points). Questions and exercises cover all the topics listed in the program.

The exam is passed with a score of at least 18.

Students need to bring a calculator to the written test.

The structure of the exam (numerical exercises plus multiple choice questions) allows the teacher to check whether the student is able to carry out simple statistical analyses and interpret the results.

SECS-S/05 - SOCIAL STATISTICS - University credits: 12

Lessons: 84 hours

Professor:
Barbiero Alessandro

Professor(s)

Reception:

Office hours are on Mondays, 10.30-12.30 and 2.30-3.30. In-person office hours are suspended; they are held via Teams instead, and can be requested by sending a chat message.

Room 33, 3rd floor DEMM