With the adoption of the Linked Data publishing guidelines, the number of available SPARQL endpoints that support remote query processing is growing quickly, which poses new challenges for client-server SPARQL query processing engines. In a federated scenario, the lack of endpoint statistics, the large number of available sources, and the size of intermediate results complicate source selection, query optimization, and query evaluation, all of which are needed for efficient and effective query execution. We present ANAPSID, an adaptive query engine for SPARQL endpoints that implements heuristic-based source selection and adaptive query execution to produce results as soon as data arrives from the sources. Experimental results show that ANAPSID is able to devise efficient federated query plans that produce the first results faster than other engines. Although the federated engines developed in the Semantic Web community exhibit good performance, they are not designed to make fair use of endpoint resources, which may affect not only endpoint performance but also endpoint reliability. We therefore propose SHEPHERD, a SPARQL client-server query processor tailored to reduce SPARQL endpoint workload by generating plans in which costly operators are placed at the client site. Experiments suggest that SHEPHERD can enhance endpoint performance while shifting workload from the endpoint to the client. Finally, this talk will provide insights into more flexible client-server architectures that allow remote SPARQL querying to reach its full potential.
Maribel Acosta is a PhD student at the Institute AIFB at the Karlsruhe Institute of Technology (KIT). She holds an M.Sc. from Universidad Simón Bolívar (Caracas, Venezuela) with specializations in Databases and Semantic Web technologies. Her current research focuses on enhancing Semantic Web technologies with human-based computation, particularly as applied to SPARQL query execution. Her research interests also include adaptive techniques for Linked Data management and federated query processing. Maribel has been involved in organizing and presenting several tutorials at the Extended Semantic Web Conference (ESWC) on semantic data management and crowdsourcing systems.