MSE Master of Science in Engineering

The Swiss engineering master's degree


Each module is worth 3 ECTS credits. A total of 10 modules/30 ECTS can be chosen from the following categories:

  • 12-15 ECTS credits in technical-scientific modules (TSM)
    TSM modules convey profile-specific technical skills and complement the decentralised specialisation modules.
  • 9-12 ECTS credits in fundamental theoretical principles (FTP)
    FTP modules deal mainly with theoretical foundations such as mathematics, physics, information theory, chemistry, etc. They broaden the student's scientific expertise and help create an important synergy between abstract concepts and application, which is fundamental to innovation.
  • 6-9 ECTS credits in context modules (CM)
    CM modules convey additional skills in areas such as technology management, business administration, communication, project management, patent law, contract law, etc.

The module description (download the PDF) gives the language information for each module, divided into the following categories:

  • Instruction
  • Documentation
  • Examination
Information Retrieval and Data Mining (TSM_InfData)

Prerequisites

  • Knowledge of relational databases
  • Basic knowledge of statistics
  • Good basics of object-oriented programming (Java)

Learning objectives

  • The course provides an introduction to the field of information retrieval and the multidisciplinary field of data mining.
  • Students are familiar with the architecture of an information retrieval system.
  • They are familiar with IR models (Boolean and Vector) and the use of these models to determine the weight of indexing terms and to calculate the correspondence between documents and queries.
  • They understand the different evaluation measures for an information retrieval system and are able to apply the comparison algorithms and interpret their results.
  • They are familiar with the use of the Apache Lucene library for indexing and information retrieval according to the Boolean and vector model.
  • They are familiar with techniques for detecting similar documents using "Locality-Sensitive Hashing" algorithms.
  • Students understand the use of modern database technologies for the processing and management of large data collections.
  • Students receive an introduction to multidimensional databases, data warehousing models and OLAP techniques. They are familiar with new data structures (data types) offered as alternatives to relational database management systems (RDBMS), including non-relational systems, and are able to determine which data types and database system are appropriate for the context and the type of data available.
  • They are familiar with data pre-processing techniques (the concept of data quality and methods for data cleaning, data integration, data reduction, data transformation and data discretization).
  • They are familiar with the main data mining tasks and the main associated methods: descriptive data analysis, market basket analysis (association rules), classification (decision trees), clustering (hierarchical and non-hierarchical), estimation, detection of outliers, etc.
  • They are able to reuse the knowledge acquired during this course in their own work environment and apply it to solve their specific problems.
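
As an illustration of the vector model mentioned in the objectives above, the following is a minimal sketch of how term weights and document-query correspondence can be computed. It assumes one common TF-IDF variant (raw term frequency times log(N/df)) and cosine similarity; the function names are hypothetical, whitespace tokenization is a simplification, and this is not how Apache Lucene scores documents internally.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def build_idf(docs):
    # Inverse document frequency: log(N / df) for each indexing term.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(text, idf):
    # Weight of a term = raw term frequency * idf; unknown terms are dropped.
    tf = Counter(tokenize(text))
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine(u, v):
    # Cosine of the angle between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Returns (score, document index) pairs, best match first.
    idf = build_idf(docs)
    q = tfidf_vector(query, idf)
    scores = [(cosine(q, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    docs = [
        "apache lucene indexes documents",
        "decision trees classify records",
        "lucene ranks documents with the vector model",
    ]
    for score, i in rank("lucene vector model", docs):
        print(f"{score:.3f}  {docs[i]}")
```

A document sharing no terms with the query receives a score of 0, whereas the Boolean model would simply exclude it; the vector model additionally orders the remaining documents by degree of match.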

Module contents

The module is divided into two parts. The first is dedicated to information retrieval and the second to data mining:

  • Basic concepts of IR
  • Boolean retrieval model
  • Vector space model and efficient ranking
  • Query refinement
  • Evaluation of IR systems
  • The Lucene API for Information Retrieval and evaluation
  • Near-duplicate detection
  • Introduction to Data Warehousing and OLAP
  • Data pre-processing
  • Introduction to Data Mining
  • Classification
  • Market basket analysis
  • Clustering
  • Estimation

Information Retrieval: 7 weeks
Data mining: 7 weeks
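
To give a flavour of the near-duplicate detection topic listed above, here is a minimal sketch of MinHash, a common building block of Locality-Sensitive Hashing for Jaccard similarity. The source does not say which algorithms the course covers, so the choice of MinHash, the character shingle size, the number of hash functions, and all function names are assumptions made for illustration. A full LSH scheme would additionally split the signatures into bands to find candidate pairs without comparing every pair of documents.

```python
import hashlib


def shingles(text, k=5):
    # Set of overlapping character k-grams after whitespace normalization.
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def _hash(seed, shingle):
    # Seeded 64-bit hash built on SHA-1 so results are reproducible
    # across runs (Python's built-in hash() of str is salted per process).
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def minhash_signature(shingle_set, num_hashes=128):
    # One minimum per hash function; similar sets share many minima.
    return [min(_hash(seed, s) for s in shingle_set)
            for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    # Fraction of matching positions estimates the Jaccard similarity.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    # Exact Jaccard similarity, for comparison with the estimate.
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    a = shingles("the quick brown fox jumps over the lazy dog")
    b = shingles("the quick brown fox jumped over the lazy dog")
    sig_a, sig_b = minhash_signature(a), minhash_signature(b)
    print("exact:   ", round(exact_jaccard(a, b), 3))
    print("estimate:", round(estimate_jaccard(sig_a, sig_b), 3))
```

The signatures have a fixed length regardless of document size, which is what makes comparing large collections practical; the estimate becomes more accurate as the number of hash functions grows.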

Teaching and learning methods

Lectures, exercises, labs.

Bibliography

Suggested optional reading (books):

  • DB: L. Wiese. Advanced Data Management for SQL, NoSQL, Cloud and Distributed Databases. De Gruyter Textbook, 2015. ISBN 978-3-11-044140-6.
  • IR: R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. New York, 2011. ISBN 9780321416919.
  • IR: C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge UP, 2008. Classical and web information retrieval systems: algorithms, mathematical foundations and practical issues.
  • IR: B. Croft, D. Metzler, T. Strohman. Search Engines: Information Retrieval in Practice. Pearson Education, 2009.
