Web Algorithmics
A.Y. 2018/2019
Learning objectives
Il corso si propone di mettere lo studente a contatto con tecniche avanzate relative alla raccolta di grandi collezioni documentali e alla costruzione di motori di ricerca, con particolare attenzione al web e ai suoi meccanismi di ranking.
Expected learning outcomes
Undefined
Lesson period: First semester
Assessment methods: Esame
Assessment result: voto verbalizzato in trentesimi
Single course
This course cannot be attended as a single course. Please check our list of single courses to find the ones available for enrolment.
Course syllabus and organization
Milan
Responsible
Lesson period
First semester
ATTENDING STUDENTS
Course syllabus
NON-ATTENDING STUDENTS
- Anatomy of a search engine and indexing techniques.
- Collection of information: algorithmic, technical and social problems in creating crawlers.
- Comparison strategies for comparison.
- Global web structure and algorithmic tools for its analysis.
- Representation of the web graph: compression problems.
- Parallel and distributed crawlers. Coherent hashing techniques.
- Graph rankings: proximity, PageRank, HITS, etc.
- Link spamming and adversarial ranking mechanisms.
- Techniques for building inverted indices.
- Methods for the construction of large distributed indices.
- Methods for the construction of large dictionaries: perfect minimal ordered hash tables, prefix dictionaries, lexicographic compression.
- Instant codes for index compression: Elias, Golomb, arithmetic compression.
- Representation of Elias-Fano and quasi-succinct indices.
- Textual ranking techniques (LSI, cosine, Clarke-Cormack, etc.).
- Substring indexes: trie, suffix trees and suffix vectors.
- Collection of information: algorithmic, technical and social problems in creating crawlers.
- Comparison strategies for comparison.
- Global web structure and algorithmic tools for its analysis.
- Representation of the web graph: compression problems.
- Parallel and distributed crawlers. Coherent hashing techniques.
- Graph rankings: proximity, PageRank, HITS, etc.
- Link spamming and adversarial ranking mechanisms.
- Techniques for building inverted indices.
- Methods for the construction of large distributed indices.
- Methods for the construction of large dictionaries: perfect minimal ordered hash tables, prefix dictionaries, lexicographic compression.
- Instant codes for index compression: Elias, Golomb, arithmetic compression.
- Representation of Elias-Fano and quasi-succinct indices.
- Textual ranking techniques (LSI, cosine, Clarke-Cormack, etc.).
- Substring indexes: trie, suffix trees and suffix vectors.
Course syllabus
- Anatomy of a search engine and indexing techniques.
- Collection of information: algorithmic, technical and social problems in creating crawlers.
- Comparison strategies for comparison.
- Global web structure and algorithmic tools for its analysis.
- Representation of the web graph: compression problems.
- Parallel and distributed crawlers. Coherent hashing techniques.
- Graph rankings: proximity, PageRank, HITS, etc.
- Link spamming and adversarial ranking mechanisms.
- Techniques for building inverted indices.
- Methods for the construction of large distributed indices.
- Methods for the construction of large dictionaries: perfect minimal ordered hash tables, prefix dictionaries, lexicographic compression.
- Instant codes for index compression: Elias, Golomb, arithmetic compression.
- Representation of Elias-Fano and quasi-succinct indices.
- Textual ranking techniques (LSI, cosine, Clarke-Cormack, etc.).
- Substring indexes: trie, suffix trees and suffix vectors.
- Collection of information: algorithmic, technical and social problems in creating crawlers.
- Comparison strategies for comparison.
- Global web structure and algorithmic tools for its analysis.
- Representation of the web graph: compression problems.
- Parallel and distributed crawlers. Coherent hashing techniques.
- Graph rankings: proximity, PageRank, HITS, etc.
- Link spamming and adversarial ranking mechanisms.
- Techniques for building inverted indices.
- Methods for the construction of large distributed indices.
- Methods for the construction of large dictionaries: perfect minimal ordered hash tables, prefix dictionaries, lexicographic compression.
- Instant codes for index compression: Elias, Golomb, arithmetic compression.
- Representation of Elias-Fano and quasi-succinct indices.
- Textual ranking techniques (LSI, cosine, Clarke-Cormack, etc.).
- Substring indexes: trie, suffix trees and suffix vectors.
Professor(s)