Abstract
Clinical documents stored as unstructured text are a valuable source of information that can be extracted using Information Retrieval techniques. Classification algorithms, and their composition through Ensemble Methods, can be used to organize this huge amount of data, but they are usually tested on standardized corpora, which differ significantly from the clinical documents actually found in a modern hospital. In this paper we present the results of a large experimental analysis conducted on 36,000 clinical documents generated by three different medical departments. For this investigation we propose a new classifier, based on the idea of entropy, and test four single algorithms and four ensemble methods. The experimental results show the performance of the selected approaches in a real-world environment and highlight the impact of obsolescence on classification.
Original language | English |
---|---|
Title of host publication | Proceedings of the International Conference on Health Informatics |
Place of Publication | Angers, France |
Pages | 447-452 |
Number of pages | 6 |
Volume | 1 |
Publication status | Published - 2014 |
Event | 7th International Conference on Health Informatics - ESEO, Angers, France. Duration: 3 Mar 2014 → 6 Mar 2014. Conference website: http://www.healthinf.biostec.org/?y=2014 |
Conference
Conference | 7th International Conference on Health Informatics |
---|---|
Abbreviated title | HEALTHINF 2014 |
Country/Territory | France |
City | Angers |
Period | 3/03/14 → 6/03/14 |