Algorithms for massive data, cloud and distributed computing

Academic year 2019/2020
Maximum credits: 12
Total hours: 80
SSD: INF/01
Language: English
Learning objectives
The main objective of the course is to analyze the technologies, computing paradigms, models, and algorithms at the basis of massive data management and analysis. On the one hand, students will learn the main approaches for processing massive amounts of data; on the other, they will analyze modern distributed computing systems, including cloud computing and microservice-based architectures. At the end of the course, students will be able to design and execute computations on massive datasets deployed on modern distributed systems and cloud computing platforms; moreover, they will learn the fundamentals of managing non-functional properties (e.g., privacy) in the cloud. To achieve these objectives, the course consists of two modules: i) "Algorithms for Massive Data" (40 hours - 6 CFU), and ii) "Cloud and Distributed Computing" (40 hours - 6 CFU).

Course structure and syllabus

Active edition
Modules or teaching units
Module Algorithms for Massive Data
INF/01 - COMPUTER SCIENCE - CFU: 6
Lectures: 40 hours
Lecturer: Malchiodi Dario

Module Cloud and Distributed Computing
INF/01 - COMPUTER SCIENCE - CFU: 6
Lectures: 40 hours

ATTENDING STUDENTS
Syllabus information
The course will analyze the models, tools and techniques at the basis of massive data management and analysis, from algorithms for data analytics to cloud and distributed computing. To this aim, the course will be organized in two modules as follows.

Module 1 "Algorithms for Massive Data" will consider the main techniques for processing data at massive scale, and their implementation on distributed computational frameworks. More precisely, lectures will review the principal application contexts characterized by amounts of data that cannot be handled using standard computing facilities and procedures. Each context will be analyzed in terms of the algorithms tailored to it. In addition, some general big data processing techniques, such as those falling under the umbrella of machine learning, will be considered.

Module 2 will discuss the technologies and solutions at the basis of cloud computing and modern distributed systems, including microservice architectures. Module 2 is composed of three main parts. The first part will provide an overview of the cloud computing paradigm and its technologies, as well as its service and deployment models. It will also investigate risks and opportunities of cloud migration, focusing on governance and non-functional properties of the cloud. The second part will provide an overview of the microservice architecture and its technologies, focusing on the migration from a monolithic approach to microservices and on microservice orchestration. Finally, the third part will focus on privacy and data protection in the cloud.
Preparatory courses
Computer Networks.
Prerequisites and examination procedures
Prerequisites: computer programming, probability and statistics, basic calculus, fundamentals of computer networks, fundamentals of virtualization.

The course exam consists of two mandatory exams, one for each module. The exam for the module "Algorithms for Massive Data" consists of an experimental or theoretical project summarized in a written report, followed by an oral discussion of the report that includes questions on a selection of topics covered in the course. The exam for the module "Cloud and Distributed Computing" has the same format: an experimental or theoretical project summarized in a written report, followed by an oral discussion of the report that includes questions on a selection of topics covered in the course. The course exam aims to verify the student's knowledge of all topics discussed in the course. It is passed when both module exams have been graded 18/30 or higher. The final grade is the average of the two module grades, rounded down.
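
As a worked example of the grading rule, the following minimal Python sketch computes the final grade from the two module grades. The function and parameter names are hypothetical, and grades are assumed to be integers out of 30; only the 18/30 threshold and the rounded-down average come from the rules stated here.

```python
from typing import Optional

PASS_MARK = 18  # minimum passing grade for each module, out of 30


def final_grade(algorithms_grade: int, cloud_grade: int) -> Optional[int]:
    """Return the course grade, or None if either module exam is not passed.

    Names are illustrative; grades are assumed to be integers out of 30.
    """
    if algorithms_grade < PASS_MARK or cloud_grade < PASS_MARK:
        return None  # the course exam is not yet completed
    # Integer floor division gives the average rounded down.
    return (algorithms_grade + cloud_grade) // 2


print(final_grade(27, 30))  # 28 (the average 28.5 is rounded down)
print(final_grade(17, 30))  # None (one module is below 18/30)
```

For instance, module grades of 27 and 30 average to 28.5, which yields a final grade of 28.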

Additional information can be found on the personal websites of the lecturers:
- http://www.di.unimi.it/ardagna/
- http://sesar.di.unimi.it/
- http://malchiodi.di.unimi.it/teaching/algorithms-massive-datasets
and on the official web page of the course https://ariel.unimi.it/
Teaching methods
Lectures
Module Cloud and Distributed Computing
Syllabus
The module will discuss the technologies and solutions at the basis of cloud computing and modern distributed systems, including microservice architectures. It is composed of three main parts as follows.

After a brief review of the fundamentals of IT networks and virtualization, the first part of the module will provide an overview of the cloud computing paradigm and its technologies, as well as its service and deployment models. It will also investigate risks and opportunities of cloud migration, focusing on governance and non-functional properties of the cloud.

1. Cloud Computing. Service models. Deployment models. Migration to the cloud. Cloudonomics. Challenges and issues.
2. IaaS, PaaS, SaaS: Definitions. Technologies. Case studies.
3. Non-functional aspects of the cloud.
4. PaaS Big Data. Multicloud orchestration. Big Data analytics examples.

The second part of the module will provide an overview of the microservice architecture and its technologies, focusing on the migration from a monolithic approach to microservices and on microservice orchestration.

1. Microservice architecture. Overview and basic concepts. Microservices and containers. Docker.
2. Microservice migration and orchestration. Cloud for microservices. How to migrate monolithic software to microservices. Examples.
3. New cloud services: AWS, Azure, SoftLayer/Bluemix, and GCP.
4. Microservices and Big Data. Model-Based Big Data Analytics-as-a-Service.

Finally, the third part of the module will focus on privacy and data protection in the cloud.

1. Data and access confidentiality and integrity in outsourcing and cloud scenarios.
Teaching methods
Lectures
Teaching materials and bibliography
Papers and slide decks available on the web page of the course (https://ariel.unimi.it)
NON-ATTENDING STUDENTS
The prerequisites, examination procedures, syllabus of the module Cloud and Distributed Computing, and teaching materials are the same as for attending students.
Period
Second semester
Assessment method
Exam
Assessment result
Grade recorded on a 30-point scale
Lecturer(s)
Office hours:
By appointment
Lecturer's office (7004) at the Department of Computer Science, Via Celoria 18, Milan (MI)
Office hours:
By appointment only: contact Dr. Fulvio Frati (fulvio.frati@unimi.it)
Office at the Department of Computer Science
Office hours:
By appointment
Via Celoria 18, Milan (MI)
Office hours:
By appointment
Room 5015, Department of Computer Science