Statistics and Data Analysis
A.Y. 2018/2019
Learning objectives
The course introduces the fundamentals of descriptive statistics, probability theory and inferential statistics.
Expected learning outcomes
Students will acquire basic skills allowing them to summarize a data sample through numerical indices and graphical representations, to reason in terms of the main probability distributions, to perform simple statistical analyses, to understand statistical analyses performed by others, and to study more complex data analysis techniques.
Lesson period: First semester
Assessment methods: Exam
Assessment result: mark recorded on a thirty-point scale
Single course
This course cannot be attended as a single course.
Course syllabus and organization
Single session
ATTENDING AND NON-ATTENDING STUDENTS
Course syllabus
Probability:
- axiomatic definition of probability
- unidimensional random variables: probability function, indices of position and dispersion, examples of random variables: Bernoulli, binomial, hypergeometric, Poisson, uniform, exponential, normal
- Chebyshev's inequality and the law of large numbers
- the weak law of large numbers
- multidimensional random variables: joint and marginal probability functions, covariance
- sample mean and sample variance
- the central limit theorem
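As an informal illustration of the last topics in this list, the sketch below simulates sample means of uniform random numbers. It is not part of the official course material, and it is written in Python only for convenience (the course itself works in R). The function name `sample_means`, the sample sizes, and the seed are assumptions chosen for the example.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Return `trials` sample means, each computed over n draws from Uniform(0, 1)."""
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable

# Law of large numbers: the sample mean concentrates around the true mean 0.5,
# so its spread across repeated samples shrinks as n grows.
spread_small = statistics.pstdev(sample_means(10, 500, rng))
spread_large = statistics.pstdev(sample_means(400, 500, rng))

# Central limit theorem: for large n the sample mean is approximately normal
# with standard deviation sigma / sqrt(n); for Uniform(0, 1), sigma^2 = 1/12.
theoretical_small = (1 / 12) ** 0.5 / 10 ** 0.5

print(f"spread of the mean, n=10:  {spread_small:.4f}")
print(f"spread of the mean, n=400: {spread_large:.4f}")
print(f"sigma / sqrt(10):          {theoretical_small:.4f}")
```

The observed spread for n=10 should be close to the theoretical value, and the spread for n=400 should be visibly smaller.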
Statistical analysis:
- exploratory analysis of univariate data: frequency distributions, mean, mode, quartiles, percentiles, empirical cumulative distribution function, interquartile range, variance, indices of heterogeneity, Gini's index of concentration, sensitivity and specificity
- exploratory analysis of bivariate data: contingency tables, covariance matrix, correlation matrix
- statistical inference: sampling, estimation of position and dispersion indices, sample size, statistical significance
- graphical representations: histogram, scatter plot, frequency diagrams, box-and-whisker plot, quantile-quantile plots, ROC curve
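A few of the univariate indices listed above can be sketched in code. This is an illustrative example under stated assumptions, not official course material: it uses Python rather than the R environment taught in the course, the data values and confusion-matrix counts are invented, and `gini_concentration` is a hypothetical helper name. Quartile conventions differ between tools, so other software may report slightly different quartiles for the same data.

```python
import statistics


def gini_concentration(values):
    """Gini concentration index for non-negative values.

    Returns 0 when every unit holds the same share and (n - 1) / n when
    a single unit holds the whole total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum, equivalent to the mean absolute difference
    # divided by twice the mean.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Invented sample, used only for illustration.
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Quartiles with the default ("exclusive") method of statistics.quantiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range

gini_equal = gini_concentration([1, 1, 1, 1])   # perfectly equal shares
gini_single = gini_concentration([0, 0, 0, 4])  # one unit holds everything

# Sensitivity and specificity from invented confusion-matrix counts.
tp, fn, tn, fp = 40, 10, 45, 5
sensitivity = tp / (tp + fn)  # share of actual positives that test positive
specificity = tn / (tn + fp)  # share of actual negatives that test negative

print(q1, q2, q3, iqr)
print(gini_equal, gini_single)
print(sensitivity, specificity)
```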
The R environment for the statistical analysis of data:
- commands for starting R, browsing the file system, and exiting R
- functions for workspace management
- syntax of the traditional programming language constructs
- vectors and matrices
- the data frame
- functions for the exploratory analysis of univariate and bivariate data
- functions for the graphical representation of statistical data
INF/01 - INFORMATICS - University credits: 6
Practicals: 36 hours
Lessons: 24 hours
Professor: Zanaboni Anna Maria
Reception: Wednesday 10:30-12:30, by appointment
via Celoria 18, 5th floor