Scientific Programming

A.Y. 2021/2022
Overall hours
Learning objectives
The objective of the course is to make students proficient in writing programs and scripts in the programming languages most widely used in modern genomic research: R and Python.
Expected learning outcomes
At the end of the course, students are expected to be able to design and write advanced programs in the Python and R programming languages, applying them to case studies derived from the analysis of genomic data.
Course syllabus and organization

Single session

Lesson period
Second semester
Course syllabus
Seminar-style lectures and practicals in a computer room on the following topics:

Python programming language:
- Quick revision of:
  - Variables, Expressions and Statements
  - Strings, Lists, Tuples, Dictionaries and indexes
  - Functions and Classes
- Higher-order functions for manipulating lists and data collections
- Principles of object-oriented programming
- Simple Abstract Data Types: Stacks, Queues, Trees and Graphs
- File management
- Libraries for the management of matrices and tables (e.g. Biopython, Pandas and NumPy). Common statistical libraries (e.g. SciPy).
- Techniques of data visualization through graphs and curves.
- Biopython tools and functions.
- Building Python libraries.
- Integration of software systems: introduction to REST Web services and invoking Web services from Python

Practicals on Python implementations of dynamic programming, statistical analyses of Next Generation Sequencing data, and/or others.
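As an illustration of the kind of dynamic programming exercise mentioned above, the following is a minimal sketch of a global alignment score in the style of Needleman-Wunsch. It is not taken from the course materials: the function name and the scoring values (match, mismatch, gap) are assumptions chosen only for this example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of sequences a and b.

    The scoring values are illustrative assumptions, not course-defined.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against an empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the table row by row, reusing previously computed subproblems.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))  # prints 7
```

The table is filled in time and memory proportional to the product of the two sequence lengths; recovering the alignment itself would additionally require a traceback through the table, which is omitted here.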

R programming language:
- Main data structures in R: vectors, factors, matrices, arrays, lists and environments
- Control of execution flow: blocks, conditional statements, loops
- Functions and scripts
- Input/Output functions and operators; R data import/export
- Graphical representation of data: heatmaps, boxplots and Venn diagrams
- Vector operations
- Packages and R "extensions"
- Building packages in R and Bioconductor
- Analyzing Next Generation Sequencing data with R and Bioconductor packages

Practicals on R implementation of statistical analyses of gene expression data (differentially expressed genes, clustering, principal component analysis, and/or others).
Prerequisites for admission
No prerequisites beyond those required for admission to the Master's degree program.
Teaching methods
Classroom lectures and practicals in a computer room or on the students' own laptop computers.
Teaching Resources
The slides presented during the course and the tentative detailed schedule of lectures and practicals are available on "Be e-Poli" (BeeP), the portal for the network activities of students and professors at the Politecnico di Milano, which is accessible from the Politecnico di Milano Web site. Students registered for the course in the current academic year can access it.
Assessment methods and Criteria
The assessment is based on a written exam at the end of the course, with exercises and open questions on all the topics presented during the course lectures or practicals.
Practicals: 24 hours
Lectures: 36 hours
Professor: Piro Rosario Michael