Towards more intelligent client-server architecture for SPARQL query processing (HybridAgent)

Project Group Master

In Smart-KG [1], the authors propose a client-server architecture for SPARQL query processing. Given an RDF dataset, the server first partitions it, and these partitions can be shipped to the client during SPARQL query execution. The partitions are stored compressed and are decompressed by the client during query execution. They are created based on the concept of characteristic sets [2,3], which exploit the structure of RDF graphs to group entities described with the same sets of predicates. Furthermore, the client can also receive uncompressed results from the server through the Triple Pattern Fragments [4] interface. Once all the required results have been shipped from the server, the client performs a local join to generate the final result of the given SPARQL query. Distributing the workload between server and client in this way increases the efficiency of query processing.

Wise-KG [5], an improved version of Smart-KG (SKG), additionally leverages the characteristics of star-shaped subqueries and information about the current client and server resources to estimate the cost of processing each star-shaped subquery on the client (using SKG) or on the server (using Star Pattern Fragments [6]), and dynamically chooses the most efficient execution strategy. That means that, unlike in SKG, the server can also perform joins over the triple patterns of a star. However, this improvement only deals with star-shaped joins. There are other types of joins as well, e.g. path joins, sink joins, or hybrid joins (please refer to [7] for the details of these joins).

In this project, we aim to go one step further. In Wise-KG [5], path joins are always executed on the client side; we want to propose a cost model that decides whether a given path join should be executed on the client or on the server. The cost model will estimate the time required to perform the given path join on the server and on the client, and the lower-cost option will be selected. This approach would further refine the cost model, as not only star queries (s-s joins) but also path queries (s-o joins) will be considered in the decision process. All experiments conducted for Wise-KG will be repeated to compare the runtime performance of the proposed model against Wise-KG.
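
The decision rule described above (estimate the time on both sides, pick the cheaper one) can be illustrated with a minimal Python sketch. It is not the cost model of Wise-KG or of this project: the class `JoinEstimate`, its two fields, the function `choose_site`, and the tie-breaking rule are all hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class JoinEstimate:
    """Hypothetical time estimate for running one path (s-o) join at one site."""
    transfer_seconds: float  # assumed time to ship the required data over the network
    compute_seconds: float   # assumed time to evaluate the join at that site

    @property
    def total_seconds(self) -> float:
        # Assumption: transfer and computation do not overlap, so the costs add up.
        return self.transfer_seconds + self.compute_seconds


def choose_site(client: JoinEstimate, server: JoinEstimate) -> str:
    """Return "client" or "server", whichever has the lower estimated total time.

    Ties go to the client (an assumption, to keep load off the shared server).
    """
    if server.total_seconds < client.total_seconds:
        return "server"
    return "client"


if __name__ == "__main__":
    # Made-up numbers: the client must download a large partition first,
    # while the server only returns the joined result.
    client = JoinEstimate(transfer_seconds=2.0, compute_seconds=0.5)
    server = JoinEstimate(transfer_seconds=0.2, compute_seconds=1.0)
    print(choose_site(client, server))
```

A real cost model would have to derive the two estimates from quantities such as partition sizes, cardinality estimates, and current client and server load; the sketch only shows the final comparison step.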

Project Goal

The main goal of our project is to optimise the querying process by:

  • Enhancing the cost model: The existing cost model has room for improvement.
  • Distributing the workload between client and server: The workload can be distributed in a better way to achieve a trade-off between runtime performance and network usage.
  • Introducing subject-object and object-object joins: Existing systems rely on subject-subject joins, so they cannot efficiently execute queries that contain other types of joins. Introducing one or two of these join techniques will improve query processing.
  • Integrating all the existing interfaces in a hybrid way: Each existing model has its own pros and cons. Integrating them so that each is used in the situations it handles best will enhance performance. This is low-hanging fruit.
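
To make the join types named above concrete, the following Python sketch classifies how two triple patterns are connected through a shared variable. It assumes the usual convention that variables start with `?`; the names `TriplePattern`, `is_variable`, and `join_types`, as well as the `ex:` predicates in the example, are hypothetical and not taken from any of the cited systems.

```python
from typing import NamedTuple, Set


class TriplePattern(NamedTuple):
    """A simplified SPARQL triple pattern; terms are plain strings."""
    subject: str
    predicate: str
    obj: str


def is_variable(term: str) -> bool:
    # Assumption: variables are written with a leading "?" as in SPARQL.
    return term.startswith("?")


def join_types(a: TriplePattern, b: TriplePattern) -> Set[str]:
    """Return the kinds of joins that connect patterns a and b."""
    kinds: Set[str] = set()
    # Same variable in both subject positions: a star-shaped (s-s) join.
    if is_variable(a.subject) and a.subject == b.subject:
        kinds.add("subject-subject")
    # Subject of one pattern is the object of the other: a path (s-o) join.
    if (is_variable(a.subject) and a.subject == b.obj) or (
        is_variable(a.obj) and a.obj == b.subject
    ):
        kinds.add("subject-object")
    # Same variable in both object positions: an o-o join.
    if is_variable(a.obj) and a.obj == b.obj:
        kinds.add("object-object")
    return kinds


if __name__ == "__main__":
    film_director = TriplePattern("?film", "ex:director", "?person")
    film_actor = TriplePattern("?film", "ex:starring", "?actor")
    person_city = TriplePattern("?person", "ex:birthPlace", "?city")
    other_city = TriplePattern("?other", "ex:deathPlace", "?city")

    print(join_types(film_director, film_actor))   # star-shaped pair
    print(join_types(film_director, person_city))  # path pair
    print(join_types(person_city, other_city))     # pair sharing an object
```

A planner could use such a classification to decide which pairs of patterns are candidates for the star-based cost model and which need the path-join cost model proposed here.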

Literature

  1. Smart-KG
  2. Characteristic sets 1
  3. Characteristic sets 2
  4. Triple Pattern Fragments Interface
  5. Wise-KG
  6. Star Pattern Fragments
  7. HiBISCuS

Questions

If you have further questions, feel free to contact Muhammad Saleem or Hashim Khan.

Course in PAUL

To be updated.