Natural Language Processing | Università degli Studi di Milano Statale

A.Y. 2025/2026

Max ECTS

6

Overall hours

48

SSD

INF/01

Language

English

Included in the following degree programmes

Computer Science (Class LM-18) - Enrolled in 2025/26

Learning objectives

The course provides an extensive and in-depth introduction to the state of the art and the main research trends in Natural Language Processing (NLP). In particular, the course focuses on deep learning methods for NLP, with particular attention to large language models. Students will deal with fundamental tasks such as syntactic, semantic, and discourse analysis, as well as methods to solve these tasks. A specific focus will be on transfer learning methods and model architectures for concrete tasks such as text classification, question answering, automatic translation, and text generation. These goals will be pursued through a combination of theory, seminars on recent papers and methods, and practical examples. The program is intended for graduate students in computer science and data science who are familiar with machine learning basics. An introduction to deep learning and neural networks will be provided, together with a practical introduction to PyTorch. Coding in Python will also play an important role in the classes.

Expected learning outcomes

By reading recent research papers, completing programming assignments, and developing a final project, students will acquire the following skills: 1) knowing and understanding the main topics, research issues, and future trends in the field of Natural Language Processing (NLP); 2) applying NLP methods to a corpus of texts for a specific need; 3) judging the quality of different design and implementation choices in an NLP project; 4) designing, implementing, and evaluating a specific project focused on NLP tasks; 5) understanding the notion of a language model and detecting language specificities and topics in a corpus of text documents; 6) using the Python stack of libraries and tools required to develop an NLP project.

Lesson period: First four month period

Lessons timetable

Assessment methods: Exam
Assessment result: mark recorded on a thirty-point scale

Exams calendar

Single course

This course can be attended as a single course.

Course syllabus and organization

Single session

Responsible

Ferrara Alfio

Lesson period

First four month period

Syllabus

Course syllabus

The lectures provide an in-depth introduction to the main research topics in the field of Deep Learning applied to Natural Language Processing. In addition to the lectures, there is a final project through which students will acquire the skills needed to design, implement, and understand the main neural network models for natural language, using Python and PyTorch.

Introduction to Natural Language Processing
- Vector Space
- Text tokenization and normalization
- TF-IDF
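
As an illustration of the vector space and term weighting topics listed above, the following is a minimal sketch in plain Python. It is not taken from the course materials: the function names `tokenize` and `tf_idf`, the whitespace tokenizer, and the unsmoothed `log(N / df)` weighting are assumptions chosen for brevity. Libraries commonly use smoothed variants.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"

def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace,
    # strip surrounding punctuation, drop empty tokens.
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def tf_idf(docs):
    # Returns one sparse vector (dict: term -> weight) per document.
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency times unsmoothed inverse document frequency.
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

corpus = ["The cat sat.", "The dog barked."]
for vector in tf_idf(corpus):
    print(vector)
```

With this weighting, a term that occurs in every document (here "the") receives weight zero, while terms specific to one document receive positive weight.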

Introduction to neural networks
- Classification problems
- Introduction to Neural Networks
- Linear classifiers and neural networks
- Tutorial on PyTorch (basics)
- Tutorial on PyTorch (deep learning)

Language models
- Introduction to the notion of Language Modeling
- Markov Language Model
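
A Markov language model can be sketched in a few lines of plain Python. The example below is an illustrative first-order (bigram) model, not the course's own implementation: the names `train_bigram_model` and `generate`, the `<s>` and `</s>` boundary markers, and the maximum-likelihood estimate without smoothing are all assumptions.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # assumed sentence boundary markers

def train_bigram_model(sentences):
    # Count how often each word follows each previous word.
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    # Normalize counts into conditional probabilities P(next | prev).
    model = {}
    for prev, next_counts in counts.items():
        total = sum(next_counts.values())
        model[prev] = {word: c / total for word, c in next_counts.items()}
    return model

def generate(model, rng, max_len=20):
    # Sample one word at a time, conditioning only on the previous word.
    word, output = START, []
    for _ in range(max_len):
        candidates = model[word]
        word = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        if word == END:
            break
        output.append(word)
    return output

model = train_bigram_model(["the cat sat", "the dog sat"])
print(model["the"])
print(" ".join(generate(model, random.Random(0))))
```

The model only looks one word back, which is the Markov assumption: the next word is conditionally independent of earlier words given the previous one.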

Neural networks as language models
- Using Neural Networks as language models
- Word2Vec and word embedding

Text encoding and sequence learning
- Encoding text and sequences
- Simple RNN example
- Sequence classification
- Sequence generation

Transformers
- Transformer architecture
- BERT
- GPT
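
The core operation of the Transformer architecture, scaled dot-product attention, can be sketched without any deep learning library. The code below is a simplified illustration in plain Python and not the course's implementation: the function names `softmax` and `attention` are assumptions, vectors are plain lists, and masking, multiple heads, and learned projections are omitted. In practice this is written with PyTorch tensors.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # queries: list of vectors of size d_k
    # keys:    list of vectors of size d_k
    # values:  list of vectors, one per key
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of the query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(d_v)
        ])
    return outputs

queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
print(attention(queries, keys, values))
```

Each output vector is a mixture of the value vectors, with more weight on the values whose keys are most similar to the query.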

Image Processing
- Foundations of image processing

Proprietary models and prompt engineering
- Prompt Engineering

Explainable AI, bias and ethics
- Explainability
- SHAP methods
- Saliency
- Concept Activation

Bias and Stereotypes
- Simple example
- Masking
- Text completion

Frameworks for LLMs
- llama.cpp
- vLLM
- MLX LM

Prerequisites for admission

Intermediate knowledge of Python. Basic knowledge of derivatives and an understanding of matrix/vector notation and operations. Basics of probability and Gaussian distributions.

Teaching methods

The course consists of lectures that make extensive use of examples and support materials such as Python notebooks. Slides and handouts are used throughout the lectures and are published progressively on the course website on the Ariel platform and in the GitHub repository (https://github.com/afflint/nlp).
Lecture attendance is not mandatory, but it is strongly recommended.

Teaching Resources

- Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval (Vol. 1, p. 496). Cambridge: Cambridge University Press. (http://nlp.stanford.edu/IR-book/)
- Alfio Ferrara. Le macchine del linguaggio. L'uomo allo specchio dell'intelligenza artificiale. Einaudi, 2025.
- Notes, notebooks and materials provided by the lecturer and published on the website of the course (https://aferrarair.ariel.ctu.unimi.it)

Assessment methods and Criteria

Development of a project. The project topic must be discussed with the lecturer in advance. The project should demonstrate comprehension of the lecture topics and the ability to propose and motivate innovative solutions to specific research problems.
The project will be evaluated through a discussion with the lecturer about the project outcomes and the related topics. The evaluation will take into account both the project and the interview.
Registration through the SIFA service is mandatory for taking the exam. After registering for an exam session on SIFA, students should contact the lecturer to schedule the discussion.

Course structure

INF/01 - INFORMATICS - University credits: 6

Lessons: 48 hours

Professor: Ferrara Alfio

Professor(s)

Ferrara Alfio

Reception:

By appointment. Meetings are held online and must be arranged by first contacting the professor by email.

Online. For in-person meetings: Department of Computer Science, via Celoria 18, Milano, Room 7012 (7th floor).