HFilter: Hybrid Finite Automaton Based Stream Filtering for Deep and Recursive XML Data

Weiwei Sun, Yongrui Qin, Ping Yu, Zhuoyao Zhang, Zhenying He

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

5 Citations (Scopus)

Abstract

XML filtering applications are gaining increasing popularity recently. Automata are generally adopted to construct query indexes for evaluating large numbers of XPath queries over XML streams. Usually only shallow data are observed in existing approaches. How to process deep and recursive XML data with low memory limitation efficiently is still a challenging issue. In this paper, we propose HFilter, a Hybrid Finite Automaton (HFA) based stream filtering approach, to solve this problem. We introduce the basic two-tier HFA (lazy DFA tier and NFA tier) first, which realizes data prefix sharing and memory overflow control to improve the filtering throughput. Then an optimized three-tier HFA with an extra pre-expanded DFA tier is put forward, which significantly reduces the restarting cost of HFA after memory overflow. Experiments show that our approaches work more efficiently than existing ones.

Original languageEnglish
Title of host publicationDatabase and Expert Systems Applications
Subtitle of host publication19th International Conference, DEXA 2008, Turin, Italy, September 1-5, 2008. Proceedings
PublisherSpringer-Verlag Berlin Heidelberg
Pages566-580
Number of pages15
VolumeLNCS 5181
ISBN (Electronic)9783540856542
ISBN (Print)3540856536, 9783540856535
DOIs
Publication statusPublished - 2008
Externally publishedYes
Event19th International Conference on Database and Expert Systems Applications - Turin, Italy
Duration: 1 Sep 20085 Sep 2008
Conference number: 19
http://www.dexa.org/previous/dexa2008/index.html (Link to Conference Website)

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PublisherSpringer-Verlag Berlin Heidelberg
Volume5181 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference19th International Conference on Database and Expert Systems Applications
Abbreviated titleDEXA 2008
Country/TerritoryItaly
CityTurin
Period1/09/085/09/08
Internet address

Fingerprint

Dive into the research topics of 'HFilter: Hybrid Finite Automaton Based Stream Filtering for Deep and Recursive XML Data'. Together they form a unique fingerprint.

Cite this