A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases

Wei Emma Zhang, Quan Z. Sheng, Lina Yao, Kerry Taylor, Ali Shemshadi, Yongrui Qin

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Knowledge Bases (KBs) are widely used as fundamental components of Semantic Web applications because they provide facts and relationships that machines can interpret automatically. Curated knowledge bases usually use the Resource Description Framework (RDF) as their data representation model. To query the RDF-represented knowledge in curated KBs, Web interfaces are built via SPARQL Endpoints. Currently, querying SPARQL Endpoints suffers from problems such as network instability and latency, which reduce query efficiency. To address these issues, we propose a client-side caching framework, the SPARQL Endpoint Caching Framework (SECF), which aims to accelerate overall querying speed over SPARQL Endpoints. SECF identifies queries that are likely to be issued by leveraging the querying patterns learned from clients’ historical queries, and prefetches/caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method to transform SPARQL queries into vector representations that are fed into machine learning algorithms. A time-aware, smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments on real-world queries demonstrate the effectiveness of our approach, which outperforms state-of-the-art work in terms of overall querying speed.
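As a rough illustration of smoothing-based cache replacement, the sketch below uses standard simple exponential smoothing (SES), not the MSES method itself, whose exact formula is not given in this abstract. The class name `SmoothedCache`, the smoothing factor `alpha = 0.5`, and the rule of evicting the lowest-scored entry are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SmoothedCache:
    """Toy query cache that evicts the entry with the lowest smoothed hit score.

    Hypothetical sketch: plain SES stands in for the paper's MSES.
    """
    capacity: int
    alpha: float = 0.5  # smoothing factor in (0, 1]; assumed value
    store: dict = field(default_factory=dict)
    score: dict = field(default_factory=dict)

    def _update_scores(self, hit_key=None):
        # Standard SES update: s_t = alpha * x_t + (1 - alpha) * s_{t-1},
        # where x_t is 1 for the key accessed at this step and 0 otherwise.
        for key in self.score:
            x = 1.0 if key == hit_key else 0.0
            self.score[key] = self.alpha * x + (1 - self.alpha) * self.score[key]

    def get(self, query):
        # A hit raises the score of the accessed key; every access decays the rest.
        if query in self.store:
            self._update_scores(hit_key=query)
            return self.store[query]
        self._update_scores()
        return None

    def put(self, query, result):
        # When full, evict the entry whose smoothed score is currently lowest.
        if query not in self.store and len(self.store) >= self.capacity:
            victim = min(self.score, key=self.score.get)
            del self.store[victim]
            del self.score[victim]
        self.store[query] = result
        self.score.setdefault(query, self.alpha)


cache = SmoothedCache(capacity=2)
cache.put("q1", "r1")
cache.put("q2", "r2")
cache.get("q1")        # q1 becomes the more recently used entry
cache.put("q3", "r3")  # cache is full, so the lowest-scored entry (q2) is evicted
```

Because each access decays every score by a factor of `1 - alpha`, entries that have not been requested recently drift toward zero and become eviction candidates, which is the time-aware behaviour the abstract attributes to a smoothing-based policy.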
Original language: English
Article number: 35
Pages (from-to): 1-20
Number of pages: 20
Journal: ACM Transactions on Internet Technology
Volume: 18
Issue number: 3
Early online date: 1 Feb 2018
DOIs: 10.1145/3155806
Publication status: Published - Feb 2018

Fingerprint

Semantic Web
Learning algorithms
Learning systems
Experiments

Cite this

Zhang, Wei Emma ; Sheng, Quan Z. ; Yao, Lina ; Taylor, Kerry ; Shemshadi, Ali ; Qin, Yongrui. / A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases. In: ACM Transactions on Internet Technology. 2018 ; Vol. 18, No. 3. pp. 1-20.
@article{06ef1092e64d48ad96af848c65a4342a,
title = "A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases",
abstract = "Knowledge Bases (KBs) are widely used as fundamental components of Semantic Web applications because they provide facts and relationships that machines can interpret automatically. Curated knowledge bases usually use the Resource Description Framework (RDF) as their data representation model. To query the RDF-represented knowledge in curated KBs, Web interfaces are built via SPARQL Endpoints. Currently, querying SPARQL Endpoints suffers from problems such as network instability and latency, which reduce query efficiency. To address these issues, we propose a client-side caching framework, the SPARQL Endpoint Caching Framework (SECF), which aims to accelerate overall querying speed over SPARQL Endpoints. SECF identifies queries that are likely to be issued by leveraging the querying patterns learned from clients’ historical queries, and prefetches/caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method to transform SPARQL queries into vector representations that are fed into machine learning algorithms. A time-aware, smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments on real-world queries demonstrate the effectiveness of our approach, which outperforms state-of-the-art work in terms of overall querying speed.",
keywords = "Caching, Knowledge base query-answering, Query suggestion, SPARQL",
author = "Zhang, {Wei Emma} and Sheng, {Quan Z.} and Lina Yao and Kerry Taylor and Ali Shemshadi and Yongrui Qin",
note = "{\copyright} ACM, 2018. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Internet Technology, {18, 3, (February 2018)} http://doi.acm.org/10.1145/3155806",
year = "2018",
month = "2",
doi = "10.1145/3155806",
language = "English",
volume = "18",
pages = "1--20",
journal = "ACM Transactions on Internet Technology",
issn = "1533-5399",
publisher = "Association for Computing Machinery (ACM)",
number = "3",

}

A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases. / Zhang, Wei Emma; Sheng, Quan Z.; Yao, Lina; Taylor, Kerry; Shemshadi, Ali; Qin, Yongrui.

In: ACM Transactions on Internet Technology, Vol. 18, No. 3, 35, 02.2018, p. 1-20.


TY - JOUR

T1 - A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases

AU - Zhang, Wei Emma

AU - Sheng, Quan Z.

AU - Yao, Lina

AU - Taylor, Kerry

AU - Shemshadi, Ali

AU - Qin, Yongrui

N1 - © ACM, 2018. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Internet Technology, {18, 3, (February 2018)} http://doi.acm.org/10.1145/3155806

PY - 2018/2

Y1 - 2018/2

AB - Knowledge Bases (KBs) are widely used as fundamental components of Semantic Web applications because they provide facts and relationships that machines can interpret automatically. Curated knowledge bases usually use the Resource Description Framework (RDF) as their data representation model. To query the RDF-represented knowledge in curated KBs, Web interfaces are built via SPARQL Endpoints. Currently, querying SPARQL Endpoints suffers from problems such as network instability and latency, which reduce query efficiency. To address these issues, we propose a client-side caching framework, the SPARQL Endpoint Caching Framework (SECF), which aims to accelerate overall querying speed over SPARQL Endpoints. SECF identifies queries that are likely to be issued by leveraging the querying patterns learned from clients’ historical queries, and prefetches/caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method to transform SPARQL queries into vector representations that are fed into machine learning algorithms. A time-aware, smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments on real-world queries demonstrate the effectiveness of our approach, which outperforms state-of-the-art work in terms of overall querying speed.

KW - Caching

KW - Knowledge base query-answering

KW - Query suggestion

KW - SPARQL

UR - http://www.scopus.com/inward/record.url?scp=85041706992&partnerID=8YFLogxK

U2 - 10.1145/3155806

DO - 10.1145/3155806

M3 - Article

VL - 18

SP - 1

EP - 20

JO - ACM Transactions on Internet Technology

JF - ACM Transactions on Internet Technology

SN - 1533-5399

IS - 3

M1 - 35

ER -