Corpus Linguistics

A.Y. 2025/2026
Max ECTS: 9
Overall hours: 60
SSD: L-LIN/01
Language: English
Learning objectives
The course aims to provide students with a solid introduction to the theoretical, methodological, and applied principles of corpus linguistics, adopting a cross-linguistic approach suited to different languages of study. Through three progressive modules, the course seeks to:
- Introduce the fundamental concepts of corpus linguistics;
- Provide skills for collecting, querying, and analysing corpus-based data;
- Develop critical thinking skills in linguistic research contexts, supported by practical examples applicable across different languages.
Expected learning outcomes
By the end of the course, students will have acquired a solid understanding of the theoretical and methodological foundations of corpus linguistics, including its core principles and the limitations associated with the use of data extracted from corpora. They will be familiar with the main operational procedures for extracting and organizing linguistic data and will be able to apply quantitative analysis methods, having developed the skills needed to interpret fundamental statistical measures in linguistics.
Throughout the course activities, students will learn how to use specific software for querying and analysing corpora, how to design simple empirical investigations, and how to apply corpus-based tools to linguistic data from different languages. Particular attention will be given to developing the critical thinking skills necessary to evaluate corpus data, reflect on methodological choices, and understand the potential and limitations of these tools in language analysis.
Drawing on the knowledge and critical thinking skills developed during the course, students will be able to pursue further study of corpus-based methodologies independently, applying them to their chosen language of specialisation.
Single course

This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.

Course syllabus and organization

Single session

Lesson period: First semester
Course syllabus
The course is structured into three main modules:
· Module A - Introduction to Corpus Linguistics
Module A provides a general introduction to corpus linguistics, presenting its main theoretical and methodological foundations. The criteria that define a linguistic corpus will be discussed, along with an overview of the main types of corpora that can be created. Particular attention will be given to the relationship between the type of corpus and the research question, highlighting the methodological potential and limitations associated with different sampling and data construction strategies. The module will also offer a brief historical overview of the development of corpus linguistics as a discipline and its role in contemporary language studies.
· Module B - Methods for Data Analysis
Module B introduces students to the main conceptual and operational tools for analysing data derived from linguistic corpora. Basic statistical notions will be presented as well as key measures used to describe associations and co-occurrences between linguistic items, in order to equip students with tools for critically interpreting analysis results in relation to specific research questions.
· Module C - Practical Applications
Module C is dedicated to applying the principles and methods of corpus linguistics. Through the analysis of case studies and research examples, it aims to show how corpus-based techniques can be used to describe linguistic phenomena in real-world contexts. The goal is to consolidate the analytical skills developed in the previous modules and to foster the ability to use corpus linguistics tools critically in applied settings.
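As a rough illustration of the kind of association measure Module B refers to, the sketch below computes pointwise mutual information (PMI) for adjacent word pairs. PMI is chosen here only as one common example; the syllabus does not state which measures or which software the course actually uses. The function name `pmi_scores`, its `min_count` parameter, and the toy sentence are assumptions made for this example.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=1):
    """Return base-2 PMI for each adjacent word pair in a list of tokens.

    Hypothetical helper for illustration: PMI(w1, w2) is the log of the
    observed pair probability divided by the probability expected if the
    two words occurred independently of each other.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # very rare pairs give unreliable PMI values
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy data invented for this example; a real study would use a sampled corpus.
toy_tokens = (
    "strong tea and strong coffee but powerful computers and strong tea"
).split()
scores = pmi_scores(toy_tokens)

# List the pairs from the most to the least strongly associated.
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 2))
```

A positive score means a pair co-occurs more often than chance would predict. On a corpus this small the values are not meaningful, which is precisely the kind of limitation that critical interpretation of such measures has to take into account.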
Prerequisites for admission
No specific prior knowledge is required, apart from a basic understanding of general linguistics.
Teaching methods
The course consists of lectures and practical activities.
Teaching resources
Freddi, M. (2019). Linguistica dei corpora. Roma: Carocci.
Barbera, M. (2013). Linguistica dei corpora e linguistica dei corpora italiana. Una introduzione. Milano: Quasar.
McEnery, T. and Wilson, A. (2001). Corpus Linguistics: An Introduction. 2nd edition. Edinburgh: Edinburgh University Press.
Brezina, V. (2018). Statistics in Corpus Linguistics: A Practical Guide. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781316410899

Course materials and slides will be available on Ariel.
The syllabus is the same for attending and non-attending students.
Assessment methods and criteria
The exam consists of a written test including both theoretical questions and practical exercises, aimed at assessing students' understanding of the course content and their ability to apply the methodologies covered in the three course modules.

During the semester, students will have the opportunity to take three interim tests, one at the end of each module. Students who pass all three interim tests will be exempt from the final exam. Students who miss or fail one or two interim tests may sit them during the final exam session; passing results from the other interim tests will be retained.
Parts A and B
L-LIN/01 - HISTORICAL AND GENERAL LINGUISTICS - University credits: 6
Lessons: 40 hours
Part C
L-LIN/01 - HISTORICAL AND GENERAL LINGUISTICS - University credits: 3
Lessons: 20 hours
Professor(s)