Natural language processing

Academic year 2023/2024
6
Maximum credits
48
Total hours
Scientific disciplinary sector (SSD)
INF/01
Language
English
Learning objectives
The course introduces the fundamental concepts of Natural Language Processing (NLP) and gives an overview of the main tools used in the field. It also presents selected NLP applications, such as information retrieval, machine translation, and automatic misogyny identification.
Expected learning outcomes
After successfully completing the course, students will be able to:

- know the basic concepts of the field of Natural Language Processing.

- explain the common computational vector space models for words used in language technology.

- describe the challenges related to word vector models.

- know how to approach some Natural Language Processing applications.
Single course

This course can be taken as a single course.

Syllabus and teaching organization

Single edition

Period
First semester

Syllabus
Introduction to data pre-processing and to some NLP tasks, such as part-of-speech tagging and named entity recognition
Text representation (e.g., tf-idf)
Statistical language models (e.g., n-gram models)
Dense vector representations (e.g., Word2Vec, FastText)
Dense contextualized word vectors (e.g., neural language models)
Sequence-to-sequence models for NLP (e.g., encoder-decoder)

Applications of NLP:
Information Retrieval
Automatic Misogyny Identification
Machine Translation
Prerequisites
Basic knowledge of statistics and programming languages.
Teaching methods
The course will be taught in English, and it will consist of both lectures introducing the main topics and tutorial sessions where open-source tools will be explained. Seminars held by experts at national and international levels may be part of the course.
Reference material
Daniel Jurafsky and James Martin, "Speech and Language Processing, 2nd Edition", Prentice Hall, 2008.

Emily M. Bender, "Linguistic Fundamentals for Natural Language Processing", Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, 2013.

Yoav Goldberg, "Neural Network Methods for Natural Language Processing", Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, 2017.

Mohammad Taher Pilehvar and Jose Camacho-Collados, "Embeddings in Natural Language Processing", Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, 2021.
Assessment methods and evaluation criteria
Individual written examination, with an optional oral examination.

The written examination assesses the student's understanding of the basic topics taught during the course; it consists of a set of open questions.
INF/01 - COMPUTER SCIENCE - Credits (CFU): 6
Lectures: 48 hours
Instructors: Fersini Elisabetta, Pasi Gabriella, Raganato Alessandro