Abstract
We use Correspondence Analysis as a platform for unsupervised data mining and pattern recognition, applied to Twitter content from May to December 2015. The approach handles two data characteristics well: exponentially distributed data properties and major imbalance between categories. Contextualization is supported. The granularity of point clouds helps both to focus on the informative resolution scale in one's data and to handle large data sets.
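To illustrate the kind of computation Correspondence Analysis involves, here is a minimal sketch using the textbook SVD formulation on a small contingency table. It is not taken from the paper: the function name `correspondence_analysis` and the toy `counts` table are hypothetical, and the authors' actual pipeline and data are not reproduced here.

```python
import numpy as np

def correspondence_analysis(counts):
    """Textbook Correspondence Analysis of a contingency table (illustrative sketch).

    Returns row principal coordinates, column principal coordinates,
    and the principal inertias (squared singular values).
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                 # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    expected = np.outer(r, c)       # frequencies expected under independence
    # Standardized residuals: scaling by the masses is what lets
    # rare and frequent categories be compared on a common footing.
    S = (P - expected) / np.sqrt(expected)
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sv) / np.sqrt(r)[:, None]      # row principal coordinates
    cols = (Vt.T * sv) / np.sqrt(c)[:, None]   # column principal coordinates
    return rows, cols, sv ** 2

# Hypothetical toy table with strongly imbalanced categories
# (rows could be term groups, columns could be time periods).
counts = np.array([[120, 30, 5],
                   [ 10, 40, 8],
                   [  2,  3, 9]])
rows, cols, inertia = correspondence_analysis(counts)
```

In this formulation the principal inertias sum to the chi-squared statistic of the table divided by its grand total, so the leading axes capture the largest departures from independence between rows and columns.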
Original language | English |
---|---|
Pages (from-to) | 1207-1225 |
Number of pages | 19 |
Journal | CEUR Workshop Proceedings |
Volume | 1609 |
Publication status | Published - 25 Jul 2016 |
Externally published | Yes |
Event | 7th Conference and Labs of the Evaluation Forum: Information Access Evaluation meets Multilinguality, Multimodality and Visualization, University of Évora, Évora, Portugal. Duration: 5 Sep 2016 → 8 Sep 2016. Conference number: 7. http://clef2016.clef-initiative.eu/ |