79301 - LAB 1

Academic Year 2018/2019

  • Teacher: Piotr Cwiakowski
  • Credits: 3
  • Language of instruction: English
  • Teaching mode: Traditional, in-person lectures
  • Campus: Bologna
  • Degree programme: First cycle degree in Statistical Sciences (code 8873)

Learning outcomes

By the end of the course the student will have developed advanced expertise in analyzing real-world phenomena using statistical methods. In particular, students will be able to:

  • implement appropriate advanced statistical analyses using statistical software (SAS, R or SPSS);

  • interpret the output of the procedures;

  • critically collate results and conclusions;

  • present the main results and conclusions in the form of concise summaries;

  • work independently on practical data analysis problems.

Course contents

  • Review of popular clustering and classification methods (Naive Bayes, k-means, decision trees, SVM, kNN).

  • Searching for relationships and patterns between words.

  • Visualization techniques for text mining analysis.

  • Case studies and examples of text mining from, among other sources, social media (Facebook, Twitter).

  • Use of the R software for the analysis.

Readings/Bibliography

Handouts provided by the teacher.

Suggested readings:

  • Kumar, Ashish, and Avinash Paul. Mastering Text Mining with R. Packt Publishing, 2016.

  • Feldman, Ronen, and James Sanger. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press, 2007.

  • Friedl, Jeffrey E. F. Mastering Regular Expressions. O'Reilly Media, 2006.

  • Manning, Christopher D., and Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999.

  • Weiss, Sholom M., et al. Text Mining: Predictive Methods for Analyzing Unstructured Information. Springer Science & Business Media, 2010.

Teaching methods

Computer lab sessions.

Assessment methods

Students are required to write a report (of approximately 15 pages) on an applied project in text mining. The report should contain a statement of the applied problem chosen by the student, a description of the appropriate methodology, and comments on the results obtained.

Teaching tools

Lab tutorials and teaching notes.

Office hours

See the website of Piotr Cwiakowski