Schema independent XML compressor

Baydaa Al-Hamadani, Zhongyu Lu, Raad F. Alwan

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

XML has become the standard way of representing and transforming data over the World Wide Web. The problem with XML documents is that they contain a very high degree of redundancy, so they demand large storage capacity and large network bandwidth for transmission. This study designs a system for compressing and querying XML documents (XMLCQ) that compresses an XML document without requiring its schema or DTD, which reduces the number of technologies associated with these documents. XMLCQ first separates the document's data into containers according to the path of each data item from the root to the leaf, then compresses these containers using a back-end compression technique. The compressed file could then be queried with any kind of query; only the required information is decompressed and returned to the user. Based on several experiments, the query processor part of the system showed the ability to answer different kinds of queries, ranging from simple exact-match queries to complex ones. Furthermore, this paper introduced the idea of retrieving information from more than one compressed XML document.
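The container idea described in the abstract can be illustrated with a minimal sketch. This is not the XMLCQ implementation: the function names (`split_into_containers`, `compress_containers`, `query_exact`) are hypothetical, `zlib` stands in for the unspecified back-end compression technique, and the sketch ignores attributes and does not store the structure needed to rebuild the original document.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_into_containers(xml_text):
    """Group text values by their root-to-leaf element path (no schema or DTD needed)."""
    root = ET.fromstring(xml_text)
    containers = defaultdict(list)

    def walk(node, path):
        current = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            containers[current].append(text)
        for child in node:
            walk(child, current)

    walk(root, "")
    return dict(containers)


def compress_containers(containers):
    """Compress each container separately.

    Assumption: zlib is a stand-in for the back-end compressor, and values are
    joined with newlines, so values containing newlines are not supported here.
    """
    return {
        path: zlib.compress("\n".join(values).encode("utf-8"))
        for path, values in containers.items()
    }


def query_exact(compressed, path, value):
    """Answer an exact-match query by decompressing only the container for `path`."""
    blob = compressed.get(path)
    if blob is None:
        return False
    values = zlib.decompress(blob).decode("utf-8").split("\n")
    return value in values


doc = (
    "<library>"
    "<book><title>Dune</title><year>1965</year></book>"
    "<book><title>Emma</title><year>1815</year></book>"
    "</library>"
)
containers = split_into_containers(doc)
compressed = compress_containers(containers)
found = query_exact(compressed, "/library/book/title", "Emma")
```

Because values that share a path tend to resemble each other, grouping them before compression gives the back-end compressor more redundancy to exploit, and a query that names a path only has to touch one container.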

Original language: English
Title of host publication: Information Retrieval Methods for Multidisciplinary Applications
Publisher: IGI Global
Chapter: 7
Pages: 95-115
Number of pages: 21
ISBN (Electronic): 9781466638990
ISBN (Print): 1466638982, 9781466638983
Publication status: Published - 30 Apr 2013
