
Key: SES-855
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Andreas Schwarte
Reporter: Andreas Schwarte
Votes: 0
Watchers: 1
Sesame

Optimizations for SPARQL SERVICE evaluation using the BINDINGS clause

Created: 13/Oct/11 09:25 AM   Updated: 15/Nov/11 01:02 PM
Component/s: SPARQL
Affects Version/s: None
Fix Version/s: 2.6.1


 Description   
The evaluation of the SERVICE keyword is currently done in a nested loop fashion. An optimization would be to collect a set of intermediate bindings and to evaluate the SERVICE expression in a grouped remote query (using the bindings as constraints in the query).

Basically, there are two options to achieve a grouped query:
a) using the BINDINGS clause introduced in SPARQL 1.1
b) a bound join using a UNION construct (as proposed in the FedX paper) -> compatible with SPARQL 1.1

Since more and more endpoints offer SPARQL 1.1 support, I suggest going with option a) and using the naive nested loop join (NLJ) approach as a fallback.
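
To illustrate option a), here is a minimal sketch of how a grouped remote query could be assembled from a set of intermediate bindings. The class and method names (GroupedServiceQuery, withBindings) are hypothetical and not part of Sesame, and the terms are assumed to be already serialized as SPARQL terms. The clause syntax follows the BINDINGS form of the SPARQL 1.1 working drafts; note that the final SPARQL 1.1 Recommendation uses VALUES for this purpose.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Sesame API.
final class GroupedServiceQuery {

    /**
     * Appends a BINDINGS clause to a SELECT query so that one remote
     * request covers all given rows. Each row maps a variable name to an
     * already serialized SPARQL term; a missing variable becomes UNDEF.
     */
    static String withBindings(String selectQuery,
                               List<String> vars,
                               List<Map<String, String>> rows) {
        StringBuilder sb = new StringBuilder(selectQuery);
        sb.append(" BINDINGS");
        for (String v : vars) {
            sb.append(" ?").append(v);
        }
        sb.append(" {");
        for (Map<String, String> row : rows) {
            sb.append(" (");
            for (int i = 0; i < vars.size(); i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                String term = row.get(vars.get(i));
                sb.append(term == null ? "UNDEF" : term);
            }
            sb.append(")");
        }
        sb.append(" }");
        return sb.toString();
    }
}
```

With two intermediate bindings for ?s, the SERVICE expression is then sent once with both rows attached, instead of once per binding.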

Required changes:

* To implement this, some means of "collecting" intermediate bindings is needed. I suggest adding a ServiceJoin, which essentially does this: instead of feeding the retrieved results directly into the stream for the next operator, it collects them and calls a method like evaluate(Service, List<BindingSet>, ...).
* Changes to the FederatedService interface to support constrained evaluation with a set of bindings
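
The two changes above could look roughly like the following sketch. It is not the Sesame implementation: RemoteService is a hypothetical stand-in for the FederatedService interface, binding sets are modeled as plain maps, and the block size and the use of UnsupportedOperationException as the fallback signal are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FederatedService; not the Sesame interface.
interface RemoteService {
    // One remote request for a single binding set (nested loop join).
    List<Map<String, String>> evaluateSingle(Map<String, String> bindings);

    // One remote request for a whole block of binding sets.
    // Assumed to throw UnsupportedOperationException if the endpoint
    // cannot handle the grouped query.
    List<Map<String, String>> evaluateBlock(List<Map<String, String>> block);
}

// Hypothetical sketch of the proposed ServiceJoin.
final class ServiceJoinSketch {

    /** Collects left-hand bindings into blocks and evaluates each block remotely. */
    static List<Map<String, String>> join(Iterator<Map<String, String>> left,
                                          RemoteService service,
                                          int blockSize) {
        List<Map<String, String>> results = new ArrayList<>();
        List<Map<String, String>> block = new ArrayList<>(blockSize);
        while (left.hasNext()) {
            block.add(left.next());
            // Flush when the block is full or the input is exhausted.
            if (block.size() == blockSize || !left.hasNext()) {
                results.addAll(evaluateWithFallback(service, block));
                block = new ArrayList<>(blockSize);
            }
        }
        return results;
    }

    // Tries the grouped query first, then falls back to one request per binding set.
    private static List<Map<String, String>> evaluateWithFallback(
            RemoteService service, List<Map<String, String>> block) {
        try {
            return service.evaluateBlock(block);
        } catch (UnsupportedOperationException e) {
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> bindings : block) {
                out.addAll(service.evaluateSingle(bindings));
            }
            return out;
        }
    }
}
```

With five intermediate bindings and a block size of 2, this issues three remote requests instead of five; if the grouped request is rejected, each binding set in that block is evaluated individually.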

To be discussed (ideas from the mailing list): Use a CloseableIteration as input to the evaluation method, since vectored evaluation over fully materialized lists might put pressure on the heap and the garbage collector.
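
As a rough sketch of the streaming idea, the helper below pulls its input lazily, so that only one block of bindings and its results are held in memory at a time. BlockwiseIteration is a hypothetical name, and the code uses the plain java.util.Iterator and AutoCloseable types as stand-ins rather than Sesame's CloseableIteration; the block evaluation function is assumed to perform the remote request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical helper for illustration only; not part of Sesame.
final class BlockwiseIteration<I, O> implements Iterator<O>, AutoCloseable {
    private final Iterator<I> input;
    private final int blockSize;
    private final Function<List<I>, List<O>> evaluateBlock;
    private Iterator<O> current = Collections.emptyIterator();
    private boolean closed = false;

    BlockwiseIteration(Iterator<I> input, int blockSize,
                       Function<List<I>, List<O>> evaluateBlock) {
        this.input = input;
        this.blockSize = blockSize;
        this.evaluateBlock = evaluateBlock;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next block only when the current results are used up.
        while (!closed && !current.hasNext() && input.hasNext()) {
            List<I> block = new ArrayList<>(blockSize);
            while (block.size() < blockSize && input.hasNext()) {
                block.add(input.next());
            }
            current = evaluateBlock.apply(block).iterator();
        }
        return !closed && current.hasNext();
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    @Override
    public void close() {
        // Drop buffered results; no further blocks are requested.
        closed = true;
        current = Collections.emptyIterator();
    }
}
```

A consumer that stops early can call close(), after which no further blocks are read from the input.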

Comment by Andreas Schwarte [15/Nov/11 01:02 PM]
Fixed in revision 11415. Vectored evaluation of bindings