Abstract
In recent years, large volumes of short text data have become easy to collect from platforms such as microblogs and product review sites. Such data often spans several domains, which makes effective multi-domain text processing difficult because the domains are hard to distinguish from one another. The concept of a multiple domain taxonomy (MDT) has shown promising performance in processing multi-domain text data. However, an MDT has to be constructed manually, which is time consuming and requires considerable expert knowledge of the relevant domains. To address these issues, we introduce a semi-automatic method for constructing an MDT that requires only a small amount of manual input, in combination with an unsupervised method for ranking multi-domain concepts based on semantic relationships learned from unlabeled data. We show that an MDT constructed iteratively with our semi-automatic method achieves higher accuracy in domain classification than existing methods, with an improvement of up to 11%.
Original language | English |
---|---|
Title of host publication | 29th International Conference on Database and Expert Systems Applications (DEXA 2018) |
Place of Publication | Cham |
Publisher | Springer Verlag |
Pages | 501-509 |
Number of pages | 9 |
Edition | 1st |
ISBN (Electronic) | 9783319988122 |
ISBN (Print) | 9783319988115 |
Publication status | Published - 9 Aug 2018 |
Event | 29th International Conference on Database and Expert Systems Applications, Regensburg, Germany; Duration: 3 Sep 2018 → 6 Sep 2018; Conference number: 29; Conference website: http://www.dexa.org/dexa2018 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Conference
Conference | 29th International Conference on Database and Expert Systems Applications |
---|---|
Abbreviated title | DEXA 2018 |
Country/Territory | Germany |
City | Regensburg |
Period | 3/09/18 → 6/09/18 |