Constructing Multiple Domain Taxonomy for Text Processing Tasks

Yihong Zhang, Yongrui Qin, Longkun Guo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


In recent years, large volumes of short text data have become easy to collect from platforms such as microblogs and product review sites. The collected data often spans several domains, which makes effective multi-domain text processing difficult because the domains are hard to tell apart. The concept of multiple domain taxonomy (MDT) has shown promising performance in processing multi-domain text data. However, an MDT has to be constructed manually, which is time-consuming and requires considerable expert knowledge of the relevant domains. To address these issues, in this paper we introduce a semi-automatic method for constructing an MDT that requires only a small amount of manual input, in combination with an unsupervised method for ranking multi-domain concepts based on semantic relationships learned from unlabeled data. We show that an MDT constructed iteratively with our semi-automatic method achieves higher accuracy than existing methods in domain classification, with accuracy improvements of up to 11%.
Original language: English
Title of host publication: 29th International Conference on Database and Expert Systems Applications (DEXA 2018)
Place of Publication: Cham
Publisher: Springer Verlag
Number of pages: 9
ISBN (Electronic): 9783319988122
ISBN (Print): 9783319988115
Publication status: Published - 9 Aug 2018
Event: 29th International Conference on Database and Expert Systems Applications - Regensburg, Germany
Duration: 3 Sep 2018 - 6 Sep 2018
Conference number: 29

Publication series

Name: Lecture Notes in Computer Science


Conference: 29th International Conference on Database and Expert Systems Applications
Abbreviated title: DEXA 2018