Incorrect error handling for SPARQL aggregation
Created: 19/Oct/11 02:24 PM
Updated: 27/Oct/11 12:53 AM
It seems to me that this test from the SPARQL 1.1 compliance suite is incorrect. It is expecting two solutions, where ?x is bound in both solutions. However, for the :a group one of the bindings for ?x is a plain text literal. That should create an error when SUM(?x) is computed for that group. The error should cause (SUM(?x) as ?total) to be unbound in the group for :a. So, I believe the correct solution would be as follows. There are two solutions, but one of them is empty.
I think that this is probably one of the unit tests that you developed when you were working on the aggregation stuff.
I've inlined below a response from the SPARQL WG on the handling of errors within aggregates. Please let me know what you think.
This is in response to http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011Aug/0000.html
> I am trying to put together some unit tests for error handling for
> aggregates. Based on the example below it seems that an error within
> an aggregate function (such as SUM or AVG) is NOT trapped by the
> function (which would cause the specific solution to be dropped by
> that function), but rather causes the evaluation of that function for
> the group to fail such that no value is bound for that function for
> that group. Is this the correct reading? Also, per the example, the failure of one aggregate for a group does not cause the group to be dropped, just that aggregate, correct?
The result of an aggregate that returns an error is an error, but projecting an error in a SELECT expression leaves the variable unbound.
> Also, if the test were modified such that ?c could be computed
> while ?avg still produced an error, I presume that ?c would become
> bound for the group. For example, consider if the source of the error
> was a Literal having numeric data but not explicitly typed as some xsd
> numeric datatype rather than a blank node. In this case, one of the
> aggregates might be written to parse the literal, returning its numeric value. Under those circumstances, I presume that ?c would become bound but that ?avg would not.
?c can be bound if AVG is an error because the AVG error is handled in SELECT expressions.
> One more twist to consider. What if there is a HAVING clause and it
> encounters the same error? I assume that it should fail the group in
> which the error was encountered, but not the entire query. E.g.,
> HAVING SUM(?p) > 0 in the example below. That would trip on the same error which is tripping up AVG(?p).
That is correct, the evaluation rules are as per FILTER.
Please indicate whether this response has answered your query.
Steve, on behalf of the SPARQL WG.
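To make the three answers above concrete, here is a minimal Python sketch of the semantics as I read them. It is not code from any SPARQL engine: the group data, the helper names (`project`, `having`, `AggregateError`) and the use of a Python string to stand in for a non-numeric literal are all assumptions made for illustration.

```python
class AggregateError(Exception):
    """Hypothetical marker for an aggregate that hit a value it cannot handle."""


def is_numeric(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def agg_sum(values):
    # One bad value makes the whole aggregate an error for the group.
    if not all(is_numeric(v) for v in values):
        raise AggregateError("non-numeric value in SUM")
    return sum(values)


def agg_avg(values):
    if not values:
        return 0
    return agg_sum(values) / len(values)


def agg_count(values):
    # COUNT does not need numeric values, so it cannot fail here.
    return len(values)


def project(values, aggregates):
    """Evaluate each aggregate on its own; an error leaves that variable unbound."""
    solution = {}
    for var, fn in aggregates.items():
        try:
            solution[var] = fn(values)
        except AggregateError:
            pass  # variable stays unbound, the group itself is kept
    return solution


def having(values, predicate):
    """FILTER-style evaluation: an error counts as false and drops the group."""
    try:
        return bool(predicate(values))
    except AggregateError:
        return False


# Hypothetical data: group "a" contains one value that is not numeric.
groups = {"a": [1, "2"], "b": [3, 4]}
aggregates = {"c": agg_count, "avg": agg_avg}

projected = {g: project(vs, aggregates) for g, vs in groups.items()}
kept = [g for g, vs in groups.items() if having(vs, lambda v: agg_sum(v) > 0)]
```

Under these assumptions, group `a` still yields a solution with `c` bound and `avg` unbound, while a `HAVING SUM(?p) > 0` condition removes group `a` only and leaves the rest of the query intact.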
Data for the unit test
Query for the unit test
Expected results for the unit test.
You are correct that this test case describes behavior incompatible with the current SPARQL spec. It was developed against an earlier version of the SPARQL working draft, in which the behavior of aggregates on encountering type errors was not specified. At that point I notified the SPARQL WG of this omission and, in the meantime, chose the current approach, which is to silently ignore type errors and continue computing the aggregate over the values that _are_ compatible.
The working group, however, has since adopted another strategy, which is to let the aggregate fail completely on such inputs and produce an empty binding.
I'm still of the opinion that the WG's design choice lacks flexibility and our implementation choice is the more useful one, but on the other hand it's not very useful either to say "we'll adopt a standard, except where we don't agree with it, there we'll do something else to confuse everybody" :) So you're probably right that we should fix this.
It's of course not just the test case which is incorrect, but also the current behavior of Sesame's SPARQL engine. If you could add this as an issue to JIRA I'd be grateful, then we can track and schedule a fix for this.
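For readers comparing the two strategies described in this comment, the following Python sketch contrasts them on made-up data. It is only an illustration, not Sesame's actual implementation: the function names, the sample groups, and the use of `None` to represent an unbound variable are assumptions.

```python
def is_numeric(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def sum_lenient(values):
    """Skip incompatible values and sum the rest (the pre-fix behavior described above)."""
    return sum(v for v in values if is_numeric(v))


def sum_strict(values):
    """Fail the whole aggregate on any incompatible value (the WG's strategy).

    None stands in for an unbound ?total in this sketch.
    """
    if not all(is_numeric(v) for v in values):
        return None
    return sum(values)


# Hypothetical groups: ":a" mixes numbers with a plain (non-numeric) literal.
groups = {":a": [1, 2, "3"], ":b": [10, 20]}

lenient = {g: sum_lenient(vs) for g, vs in groups.items()}
strict = {g: sum_strict(vs) for g, vs in groups.items()}

for g in groups:
    print(g, "lenient:", lenient[g], "strict:", strict[g])
```

Both strategies return two solutions for this data. They differ only in whether `?total` is bound for the `:a` group, which is exactly the difference the test case has to pin down.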
I just got a followup response from the SPARQL WG which indicates that this might in fact not be a bug after all.
My original message to the SPARQL WG:
Response by the WG:
In other words: the behavior of operators on type errors is extensible in SPARQL (this was always the case), but according to the WG this _also_ holds for aggregate operators. So if we extend aggregates to silently ignore type errors rather than produce an empty binding, that's allowed.
Nevertheless, I'm in favor of keeping our implementation as "un-extended" as possible to avoid interoperability problems, so I will go ahead and change the behavior anyway, but I reckon it's useful to know this.
Fix checked in; aggregate processing is now compliant with the W3C spec again.