Data access and regulation

A.Y. 2020/2021
Overall hours
INF/01 IUS/09
Learning objectives
The objective of the course is to give students a multidisciplinary approach to data processing. In line with this objective, each module focuses on a specific aspect. The aim of the first module is to provide students with the essential elements of data protection law, making them familiar with the principles, rights and duties set by the General Data Protection Regulation (GDPR).
Expected learning outcomes
At the end of the course students should have acquired: knowledge and understanding of the fundamental legal concepts of data protection; the ability to read the new European regulatory standards independently.
Course syllabus and organization

Single session

Lesson period
Second trimester
The lessons will be held synchronously through the Teams platform.
Course syllabus
The programme of Prof. Orofino's modules is detailed below:
1. Introduction
2. The fundamental right to personal data protection (part I)
3. The fundamental right to personal data protection (part II)
4. The European concept of data protection between the EU and the ECHR
5. Data protection terminology
6. Territorial and material scope
7. General principles of European data protection law
8. The legal conditions for the processing of personal data
9. The rights of the data subject (part I)
10. The rights of the data subject (part II)
11. The obligations of the controller and of the processor (part I)
12. The obligations of the controller and of the processor (part II)
13. The Data Protection Officer (DPO)
14. The Member States' independent supervisory authorities
15. The European Data Protection Board: competence, tasks and powers
16. Transfers of personal data to third countries (non-EU countries)
17. Specific types of data (part I)
18. Specific types of data (part II)
19. Remedies and penalties
20. AI and data protection

The programme of the third module (R. Fahley)
The course will begin with a discussion of the challenges that Big Data and increasingly rigorous demands for transparency and open sharing of research data pose to social science researchers, and will outline the considerations researchers need to keep in mind when designing a data management plan for their project. Next, we will introduce a variety of data management tools, including flat files, local databases, and large-scale cloud storage systems. For each of these we will discuss strengths and weaknesses and the kinds of data or analysis for which it might be suitable. Students will have an opportunity to try working with them (adding, manipulating and exporting data) in lab sessions, where they will learn the basics of the data specification and query syntaxes (SQL and NoSQL) commonly used for these systems. Finally, we will discuss best practices for creating a data management plan that addresses considerations such as collaborating with other researchers, keeping valuable data securely backed up, and making it easy to share the data publicly when the research is published. The course will conclude with a project in which students design a data management plan for a hypothetical research project and write a short report justifying their choices.
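As a rough illustration of the flat-file-to-database workflow described for the lab sessions, the following minimal Python sketch loads a small CSV into an in-memory SQLite database, runs one SQL query, and exports the table back to CSV. The survey data, the table name respondents, and all function names are hypothetical and chosen only for this example; they are not taken from the course material.

```python
import csv
import io
import sqlite3

# Hypothetical survey responses, standing in for a flat file on disk.
FLAT_FILE = """respondent_id,country,age
1,IT,34
2,DE,51
3,IT,27
"""


def load_flat_file(text):
    """Parse CSV text into a list of (respondent_id, country, age) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(r["respondent_id"]), r["country"], int(r["age"])) for r in reader]


def build_database(rows):
    """Create an in-memory SQLite database and insert the rows (adding data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE respondents ("
        "respondent_id INTEGER PRIMARY KEY, country TEXT NOT NULL, age INTEGER)"
    )
    conn.executemany("INSERT INTO respondents VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def mean_age_by_country(conn):
    """Aggregate with SQL: one (country, mean age) pair per country."""
    cur = conn.execute(
        "SELECT country, AVG(age) FROM respondents GROUP BY country ORDER BY country"
    )
    return cur.fetchall()


def export_csv(conn):
    """Export the table back to CSV text, e.g. for sharing or archiving."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["respondent_id", "country", "age"])
    writer.writerows(
        conn.execute(
            "SELECT respondent_id, country, age FROM respondents ORDER BY respondent_id"
        )
    )
    return out.getvalue()


if __name__ == "__main__":
    conn = build_database(load_flat_file(FLAT_FILE))
    print(mean_age_by_country(conn))
    print(export_csv(conn))
```

A local database such as SQLite adds typed columns and a query language on top of what a flat file offers, while the export step shows that the data can still be returned to a plain, shareable format.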

Fourth module (Andrea de Angelis)
The course is structured in three blocks: 1) an introductory block covering the essential knowledge for working with web data (basics of R programming, developing reproducible code, reporting in automated notebooks, and version control with Git/GitHub) and the main available secondary data sources; 2) a core block focusing on web data technologies (regular expressions, HTML, CSS, XPath, XML, JSON); 3) an advanced block introducing the basics of API interaction (the HTTP protocol, GET and POST methods, and the workflow of RESTful web APIs).
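The module itself works in R, but the technologies it lists are language-neutral. The following Python sketch, using only the standard library, touches several of them: an XPath-style query on XML, JSON parsing, a regular expression, and the query string of a GET request. The XML and JSON payloads, the function names, and the URL https://api.example.org/datasets are all hypothetical and serve only as an illustration. No network request is made.

```python
import json
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical payloads, standing in for responses from a web API.
XML_DOC = (
    "<catalog>"
    "<dataset id='d1'><title>Election results 2018</title></dataset>"
    "<dataset id='d2'><title>Census 2011</title></dataset>"
    "</catalog>"
)
JSON_DOC = '{"results": [{"id": "d1", "downloads": 120}, {"id": "d2", "downloads": 45}]}'


def titles_from_xml(text):
    """Map dataset id to title using ElementTree's limited XPath subset."""
    root = ET.fromstring(text)
    return {d.get("id"): d.findtext("title") for d in root.findall("./dataset")}


def downloads_from_json(text):
    """Map dataset id to download count from a JSON document."""
    return {r["id"]: r["downloads"] for r in json.loads(text)["results"]}


def extract_years(titles):
    """Find four-digit years (19xx or 20xx) in the titles with a regular expression."""
    return [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", " ".join(titles))]


def build_get_url(base, params):
    """Encode query parameters the way a GET request to a web API would carry them."""
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    titles = titles_from_xml(XML_DOC)
    print(titles)
    print(downloads_from_json(JSON_DOC))
    print(extract_years(titles.values()))
    print(build_get_url("https://api.example.org/datasets", {"q": "census", "page": 2}))
```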
Prerequisites for admission
First and second module: None
Third module (R. Fahley): No specific knowledge is assumed, but some understanding of a programming language (either R or Python) will help you to understand the code examples used in the course more easily. You may find it beneficial to take a short online tutorial to help you understand the basics of your chosen language.
Fourth module (Andrea de Angelis)
The course assumes a basic understanding of R programming, including the main tidyverse packages (e.g. dplyr, ggplot2, tidyr, readr), basic computer skills (dealing with file paths, organizing folders, moving, copying and pasting files, using URLs and browsing the web), and enough basic descriptive and inferential statistics to complete the class exercises.
Teaching methods
Lectures, case studies and hands-on lab sessions.
The module uses community-learning techniques to support students through the learning process. Class activities alternate the instructor's introductions with formative and summative assessments, and are made participatory and hands-on through tools such as pair programming, shared notes, and live coding. Home study is supported by digital tools that let students and the instructor interact and support each other while working through the reading material; the two class exercises are assigned to the class as a whole and solved in small, mildly competitive and peer-supportive groups using GitHub. The capstone project is intended as continued support for the student's first steps in data gathering.
Teaching Resources
First and second module (Prof. Orofino)
Handbook on European data protection law, 2018 edition, freely available online…
Third module (Robert Fahley)
Foster I, Ghani R, Jarmin RS, Kreuter F, Lane J (eds.) (2016), "Big Data and Social Science", CRC Press.
Teorey T, Lightstone S, Nadeau T, Jagadish HV (2011), "Database Modeling & Design (5th Edition)", Elsevier.
Van den Eynden V, Corti L, Woollard M, Bishop L, Horton L (2011), "Managing and Sharing Data: Best Practice for Researchers", UK Data Archive.
Jones S (2011), "How to Develop a Data Management and Sharing Plan", Digital Curation Centre,…
Fourth module (Andrea de Angelis)
- Wickham and Grolemund (2017). R for Data Science. O'Reilly.
- Bryan (2018). Happy Git and GitHub for the useR. Unpublished manuscript.
- Munzert et al. (2015). Automated Data Collection with R. Wiley.
- Nolan and Temple Lang. (2014). XML and Web Technologies for Data Sciences with R. Springer.
Assessment methods and Criteria
First and second module (Prof. Orofino)
The exam is oral and consists of an interview on the topics of the programme. It is aimed at assessing the student's preparation and ability to argue a position.
Third module (Robert Fahley)
Assessment for this portion of the course will be a series of short assignments (mostly technical / lab-based in nature) and one final slightly longer assignment (designing and justifying a data management scheme for a hypothetical research project).
Non-attending students will be expected to submit a series of short essay reports demonstrating their understanding of the topics discussed in the lectures, in addition to the regular assessment requirements for the course.
Fourth module (Andrea de Angelis)
The assessment is based on 1. the quality of the home study of the assigned texts, 2. participation in the two class exercises, and 3. the completion of a capstone project. Non-attending students have to fulfill the same requirements as students attending the course.
INF/01 - INFORMATICS - University credits: 6
IUS/09 - PUBLIC LAW - University credits: 6
Lessons: 80 hours