GPU Computing
A.Y. 2025/2026
Learning objectives
The course aims to provide students with advanced training in the use of GPUs as high-performance computational platforms, with a dual focus: on the one hand, the acquisition of skills in the CUDA parallel programming model for general-purpose computing; on the other, the application of those skills to developing and optimizing deep learning models using libraries such as PyTorch. Through a balance of theoretical lessons and practical activities, the course promotes the integration of GPU architectural foundations, key parallelism patterns, and acceleration strategies for neural networks, generative models, and models operating on non-Euclidean geometric structures. Developing design and experimental skills within concrete application scenarios enables students to gain a solid understanding of both theoretical concepts and their operational implications, including their connection with current research trends in HPC and artificial intelligence.
Expected learning outcomes
The expected outcomes are as follows:
- Understand modern GPU architectures and the CUDA parallel computing model for HPC and AI.
- Gain knowledge of key deep learning paradigms (e.g., GANs, VAEs, Transformers, GNNs) and acceleration techniques for GPU-based training.
- Write and optimize CUDA C code for general-purpose parallel kernels.
- Implement and train advanced deep learning models in PyTorch, leveraging GPUs for computational acceleration.
- Apply profiling techniques and performance analysis to improve the efficiency of training and inference processes.
- Critically evaluate GPU-based computational solutions for AI and scientific computing problems.
- Identify optimal design choices in implementing models and algorithms on parallel architectures.
- Clearly and rigorously present and discuss architectures, acceleration techniques, and experimental results, in both written reports and oral presentations.
- Continue to independently explore emerging models and software libraries in the field of GPU computing and applied artificial intelligence.
Lesson period: Second semester
Assessment methods: Exam
Assessment result: grade recorded out of thirty
Single course
This course can be attended as a single course.
Course syllabus and organization
Single session
Course syllabus
The syllabus is shared with the following courses:
- [FBA-40](https://www.unimi.it/en/ugov/of/af20260000fba-40)
Professor(s)