**Course: Data Mining, MSc in Mathematics**

Data Mining (6 CFU)

Semester: Spring

Overview

The course provides a modern introduction to data mining, which spans techniques, algorithms and methodologies for discovering structure, patterns and relationships in data sets (typically large ones) and for making predictions. Applications of data mining are all around us and, when done well, sometimes even go unnoticed. For instance, how does Google web search work? How does Shazam recognize a song? How does Netflix recommend movies to its users? The principles of data mining provide answers to these and other questions. Data mining overlaps the fields of computer science, statistical machine learning and databases. The course aims to give students the knowledge required to explore, analyze and leverage available data, turning it into valuable and actionable information, for instance to support a company's decision-making process.

Learning outcomes

After the course the student should be able to:

• describe and use the main data mining techniques;

• understand the differences among several algorithms solving the same problem and recognize which one is better under different conditions;

• tackle new data mining problems by selecting the appropriate methods and justifying his/her choices;

• tackle new data mining problems by designing suitable algorithms and evaluating the results;

• explain experimental results to people outside of statistical machine learning or computer science.

Course Content

• Introduction. Map-Reduce (2 hours)

• Mining data streams. Frequent items (6 hours)

• Frequent itemsets and association rules (4 hours)

• Mining similar items and Locality-Sensitive Hashing (2 hours)

• Graph analysis. Link analysis and PageRank (2 hours)

• Clustering (4 hours)

• Recommendation systems (4 hours)

• Mining social-network graphs (4 hours)

• Dimensionality reduction (2 hours)

• Classification (6 hours)

• Drills (6 hours)

Prerequisites

Calculus. Probability and Statistics. Linear Algebra. Programming skills.

Examination

Oral exam. The student is asked to present selected theoretical topics, so that the examiner can verify the student's knowledge and understanding of them.

Office Hours

By appointment; contact the instructor by email or at the end of class meetings.

References

**Data Mining and Analysis**

M. J. Zaki and W. Meira

Freely available online: http://dataminingbook.info

**Mining of Massive Datasets**

J. Leskovec, A. Rajaraman and J. Ullman

Freely available online: http://www.mmds.org