Issue Details

Key: SES-313
Type: Improvement
Status: Resolved
Resolution: Fixed
Priority: Minor
Assignee: Herko ter Horst
Reporter: Arjohn Kampman
Votes: 0
Watchers: 0

Sesame

Support multibyte characters in HTTP protocol

Created: 23/Oct/06 11:20 AM   Updated: 26/Oct/06 01:23 PM
Component/s: HTTP Server
Affects Version/s: 2.0-alpha-3, 2.0-alpha-2, 2.0-alpha-1
Fix Version/s: 2.0-alpha-4


 Description   
The RESTful HTTP protocol for Sesame 2 should support multibyte characters properly. Encoding such characters in URLs or URL parameters for GET requests has proven to be very problematic. A %xx escape sequence can only encode a single byte, i.e. a value in the range 0-255, so a multibyte character has to be spread over several escape sequences and the receiver has to know which character encoding to use to reassemble them. Character encodings specified in headers only apply to request bodies. How encoded characters in the URL are handled is platform-dependent: some platforms apply the specified encoding, some apply the system's default encoding, and some allow you to configure a specific encoding.
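
The ambiguity described above can be reproduced with a short sketch. It is written in Python rather than in Sesame's own code base, and the URI is a made-up example; it only illustrates that the same %xx sequences decode to different strings depending on the character encoding the receiver assumes.

```python
from urllib.parse import quote, unquote

# Hypothetical context URI containing a non-ASCII character (e-acute).
value = "http://example.org/caf\u00e9"

# The client percent-encodes the UTF-8 bytes of the value,
# so the single character U+00E9 becomes the two escapes %C3%A9.
encoded = quote(value, safe="", encoding="utf-8")

# A receiver that assumes UTF-8 recovers the original string.
as_utf8 = unquote(encoded, encoding="utf-8")

# A receiver that assumes ISO-8859-1 maps each byte to its own character
# and silently produces a different URI.
as_latin1 = unquote(encoded, encoding="iso-8859-1")

print(encoded)
print(as_utf8 == value)
print(as_latin1 == value)
```

Nothing in the request line itself tells the receiver which of the two interpretations the client intended, which is why the outcome depends on the platform.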

To be on the safe side, the HTTP protocol should allow clients to encode multibyte characters to an ASCII string before URL-encoding these values. This specifically applies to the 'context' and 'baseURI' parameters in the current protocol, as these are used in non-POST requests. A suitable option is to use N-Triples/Turtle syntax for encoding URIs, blank nodes and literals.
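
A minimal sketch of the proposed two-step encoding follows, again in Python and not taken from the Sesame sources. It assumes the N-Triples string escaping rules (\uXXXX and \UXXXXXXXX for characters outside printable ASCII) and the angle-bracket notation for URIs; the function names `ntriples_escape` and `encode_uri_param` are hypothetical, and the exact wire format used by Sesame is not specified in this issue.

```python
from urllib.parse import quote


def ntriples_escape(text: str) -> str:
    """Escape a string to pure ASCII using N-Triples style escapes (sketch)."""
    simple = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in simple:
            out.append(simple[ch])
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)               # printable ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)   # four hex digits
        else:
            out.append("\\U%08X" % cp)   # eight hex digits
    return "".join(out)


def encode_uri_param(uri: str) -> str:
    """Write a URI in N-Triples notation, then URL-encode the ASCII result."""
    return quote("<" + ntriples_escape(uri) + ">", safe="")


# Hypothetical value for a 'context' parameter.
print(encode_uri_param("http://example.org/caf\u00e9"))
```

Because the intermediate string contains only ASCII characters, every %xx escape in the final value stands for exactly one character, and the result no longer depends on the encoding the receiver assumes for the URL.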

Note: parameters in POST requests are safe with respect to character encoding, as these are sent to the server in the body of the request.

 Comments
Comment by Herko ter Horst [26/Oct/06 01:23 PM]
The server and HttpSail have been updated to encode GET parameters that can contain multibyte characters using N-Triples encoding. I still need to create a test case for this in the TCK, but I'm marking this issue as resolved.