Knowledge Bases (KBs) are widely used as one of the fundamental components in Semantic Web applications as they provide facts and relationships that can be automatically understood by machines. Curated knowledge bases usually use Resource Description Framework (RDF) as the data representation model. In order to query the RDF-presented knowledge in curated KBs, Web interfaces are built via SPARQL Endpoints. Currently, querying SPARQL Endpoints has the problems like network instability and latency, which affect the query efficiency. To address these issues, we propose a client-side caching framework, SPARQL Endpoint Caching Framework (SECF), aiming at accelerating the overall querying speed over SPARQL Endpoints. SECF identifies the potential issued queries by leveraging the querying patterns learned from clients’ historical queries and prefecthes/caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method to transform SPARQL queries to vector representation that are fed into machine learning algorithms. A time-aware smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments performed on real world queries showcase the effectiveness of our approach, which outperforms the state-of-the-art work in terms of the overall querying speed.
|Number of pages||20|
|Journal||ACM Transactions on Internet Technology|
|Early online date||1 Feb 2018|
|Publication status||Published - Feb 2018|