Stefano Lodi

Associate Professor

Department of Computer Science and Engineering

Academic discipline: ING-INF/05 Information Processing Systems


Data clustering in centralized, distributed, streaming, peer-to-peer environments and in sensor networks. Classification by support vector machines in distributed, streaming environments. Outlier detection in distributed environments. Semantic peer-to-peer systems. Refinement techniques for pattern-based conceptual clustering. Privacy issues and inferential attacks in distributed data mining with kernel density estimates. Visual data mining. Biomedical data mining. Knowledge representation.

Stefano Lodi's current research interests are outlier detection and data clustering methods for centralized, distributed, and streaming environments, and visual data mining systems.
In distributed and streaming environments, the clustering task must be carried out with sufficient accuracy even when global knowledge of the data is not available. In distributed environments this is due to sensitive data or prohibitive communication costs; in streaming environments, to the infeasibility of accessing past data in direct access mode. In both cases, it is often necessary to exploit the additivity of small data synopses that are sufficient to compute clusters accurately. Such synopses can be transmitted at small cost, and can therefore support distributed clustering. Moreover, they can be updated efficiently, and can therefore support the incremental updating needed in streaming environments. The proposed synopsis is the sampled kernel density estimate.
The outlier detection problem involves similar challenges. An approach based on evaluating distances to the k nearest neighbors makes it possible to use neighborhoods as synopses.
Privacy issues and inferential attacks on data subject to mining have been addressed, in particular for data of which a kernel estimate with publicly known parameters is available. Clustering methods based on kernel estimates that are not vulnerable to such attacks are currently under investigation.
The clustering problem in peer-to-peer environments and in sensor networks is also being investigated. In such environments, clustering based on kernel estimates exploits mainly local information and is therefore not affected by the highly dynamic nature of the networks. Approximate solutions generated by appropriate algorithms can therefore be very accurate.
User interfaces for the presentation of clustering input and output have also been studied. Finally, a technique has been studied to refine a conceptual clustering generated by pattern-based algorithms from relational databases.
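The additivity of sampled kernel density estimates mentioned above can be illustrated with a minimal sketch. It is not the author's algorithm: the Gaussian kernel, the fixed sampling grid, the bandwidth `h`, and all function names (`local_synopsis`, `merge`, `density`) are assumptions chosen for illustration. Each site samples its unnormalized kernel sums on a shared grid; summing these samples across sites gives the same estimate as pooling the raw data, without transmitting the data themselves.

```python
import math

def gaussian_kernel(u):
    # Standard Gaussian kernel (an assumed choice of kernel).
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def local_synopsis(data, grid, h):
    # Synopsis of one site: unnormalized kernel sums sampled at the
    # grid points, together with the number of local data points.
    sums = [sum(gaussian_kernel((x - p) / h) for p in data) for x in grid]
    return sums, len(data)

def merge(synopses):
    # Additivity: sampled sums and counts are simply added across sites.
    total_n = sum(n for _, n in synopses)
    merged = [sum(vals) for vals in zip(*(s for s, _ in synopses))]
    return merged, total_n

def density(synopsis, h):
    # Normalize the sampled sums into a density estimate on the grid.
    sums, n = synopsis
    return [s / (n * h) for s in sums]

# Hypothetical one-dimensional data held at two sites.
site_a = [0.9, 1.0, 1.2, 1.1]
site_b = [4.8, 5.0, 5.3]
grid = [i * 0.5 for i in range(13)]  # shared sampling grid, 0.0 to 6.0
h = 0.5                              # shared bandwidth (assumed)

merged = merge([local_synopsis(site_a, grid, h),
                local_synopsis(site_b, grid, h)])
distributed = density(merged, h)
centralized = density(local_synopsis(site_a + site_b, grid, h), h)

# The two estimates agree up to floating-point rounding.
max_diff = max(abs(d - c) for d, c in zip(distributed, centralized))
```

The same `merge` step also covers the streaming case in this sketch: the synopsis of a newly arrived batch can be added to the running synopsis without revisiting past data.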