Issue Details

Key: SES-873
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Minor
Assignee: Jeen Broekstra
Reporter: James Leigh
Votes: 0
Watchers: 0
Sesame

Order the same Blank Nodes together in ORDER BY

Created: 27/Oct/11 08:53 PM   Updated: 02/Nov/11 02:32 PM
Component/s: Query Engine
Affects Version/s: 2.6.0
Fix Version/s: 2.6.1


 Description   
Sesame's ValueComparator should return 0 from its compare method if and only if both inputs are null or they are the same term.

When ordering RDF literals and URIs, the same literal or the same URI
will always be arranged together. However, there is no guarantee with
blank nodes that the same blank nodes will be arranged together.

The following SPARQL query lists all the vCard addresses in the default
graph along with their properties. A single address is represented in
multiple result bindings, one for each property in the data store.

SELECT ?vcard ?adr ?pred ?obj {
  ?vcard a vcard:VCard; vcard:adr ?adr .
  ?adr ?pred ?obj .
} ORDER BY ?vcard ?adr ?pred

The (author's) expected result is to have all result bindings ordered
first by the vcard they belong to and, if there are multiple addresses on
the vcard, to have the properties of each address grouped together.

For example, the following binding sets are a valid result set. Notice that
the entire home address comes before any of the work address properties.
This order is predictable because of the ORDER BY clause in the query
above.

vcard=<me>, adr=<me#home>, pred=vcard:country-name, obj="Australia"
vcard=<me>, adr=<me#home>, pred=vcard:locality, obj="WonderCity"
vcard=<me>, adr=<me#home>, pred=vcard:postal-code, obj="5555"
vcard=<me>, adr=<me#home>, pred=vcard:street-address, obj="111 Lake Drive"
vcard=<me>, adr=<me#work>, pred=vcard:country-name, obj="Australia"
vcard=<me>, adr=<me#work>, pred=vcard:locality, obj="WonderCity"
vcard=<me>, adr=<me#work>, pred=vcard:postal-code, obj="5555"
vcard=<me>, adr=<me#work>, pred=vcard:street-address, obj="33 Enterprise Drive"

Consider the result set if blank nodes were used for the address node.
The result might look like the one below.

vcard=<me>, adr=_:b1, pred=vcard:locality, obj="WonderCity"
vcard=<me>, adr=_:b1, pred=vcard:street-address, obj="111 Lake Drive"
vcard=<me>, adr=_:b2, pred=vcard:street-address, obj="33 Enterprise Drive"
vcard=<me>, adr=_:b2, pred=vcard:country-name, obj="Australia"
vcard=<me>, adr=_:b1, pred=vcard:country-name, obj="Australia"
vcard=<me>, adr=_:b2, pred=vcard:postal-code, obj="5555"
vcard=<me>, adr=_:b1, pred=vcard:postal-code, obj="5555"
vcard=<me>, adr=_:b2, pred=vcard:locality, obj="WonderCity"

This is due to the following check in ValueComparator, which treats any two blank nodes as equal: if (o1 instanceof BNode && o2 instanceof BNode) return 0;

Although the results for each vcard are ordered together, because it is a
URI, the ordering of the adr blank nodes looks random and is
unpredictable. When the data contains blank nodes, there is no way to
control the result ordering.
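
The effect can be reproduced outside Sesame. The following is a minimal sketch, not Sesame code: the class name BNodeTieDemo and the comparator ALL_BNODES_EQUAL are hypothetical stand-ins for the "return 0 for any two blank nodes" rule, and plain strings stand in for the adr column. Because Java's List.sort is stable, elements that compare as equal keep their input order, so the sort cannot group equal blank nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class BNodeTieDemo {
    // Hypothetical stand-in for the rule quoted above: any two blank nodes tie.
    static final Comparator<String> ALL_BNODES_EQUAL = (a, b) -> 0;

    // Sorts a copy of the adr column using the always-tie comparator.
    static List<String> sortAdr(List<String> adrColumn) {
        List<String> sorted = new ArrayList<>(adrColumn);
        // Stable sort: elements that compare as equal keep their input order.
        sorted.sort(ALL_BNODES_EQUAL);
        return sorted;
    }

    public static void main(String[] args) {
        // Same adr sequence as the blank node result listing above.
        List<String> input =
            Arrays.asList("b1", "b1", "b2", "b2", "b1", "b2", "b1", "b2");
        // The "sorted" column is identical to the input: b1 and b2 stay interleaved.
        System.out.println(sortAdr(input));
    }
}
```

With this comparator the output order is whatever order the rows happened to arrive in, which is why the listing above looks random.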

The author would expect that _:b1 is ordered before or after _:b2, but
the author would not expect that _:b1 is mixed among _:b2. Although
there is no order between _:b1 and _:b2 and SPARQL provides no guidance
on how to arrange blank nodes, Sesame should at least group the same bnodes together when ordering them.

Sesame's ValueComparator should return 0 from its compare method if and only if both inputs are null or they are the same term.

If both inputs are BNodes, the ValueComparator should compare them using BNode#getID() and return 0 only if the IDs are the same. This still complies with the SPARQL spec, but makes the results more predictable by ordering the same terms together.
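
The proposed rule could be sketched as follows. This is an illustration under stated assumptions, not the change that was committed: SimpleBNode is a hypothetical stand-in exposing only getID(), the class name BNodeIdOrderSketch and helper sortedIds are invented for the example, and sorting nulls first is an arbitrary choice made here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BNodeIdOrderSketch {
    // Hypothetical stand-in for a blank node; only the identifier matters here.
    static final class SimpleBNode {
        private final String id;
        SimpleBNode(String id) { this.id = id; }
        String getID() { return id; }
    }

    // Returns 0 only when both inputs are null or carry the same ID.
    static final Comparator<SimpleBNode> BY_ID = (o1, o2) -> {
        if (o1 == o2) return 0;      // same reference, or both null
        if (o1 == null) return -1;   // assumption: nulls sort first
        if (o2 == null) return 1;
        return o1.getID().compareTo(o2.getID());
    };

    // Builds blank nodes from the given IDs, sorts them, and returns the sorted IDs.
    static List<String> sortedIds(String... ids) {
        List<SimpleBNode> nodes = new ArrayList<>();
        for (String id : ids) nodes.add(new SimpleBNode(id));
        nodes.sort(BY_ID);
        List<String> out = new ArrayList<>();
        for (SimpleBNode n : nodes) out.add(n.getID());
        return out;
    }

    public static void main(String[] args) {
        // Same adr sequence as the interleaved listing in the description.
        System.out.println(sortedIds("b1", "b1", "b2", "b2", "b1", "b2", "b1", "b2"));
        // prints [b1, b1, b1, b1, b2, b2, b2, b2]
    }
}
```

The relative order of _:b1 and _:b2 is still arbitrary, since it depends only on the identifier strings, but rows sharing a blank node now end up adjacent.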

Comment by James Leigh [02/Nov/11 02:32 PM]
Changed ValueComparator to use a stable ordering and to return 0 only if both terms are the same, in revision 11359.