Discussion on RFC 2396 + RFC 2732 vs. RFC 3986.

======================================================================
XMLDSIG 2002  4.3.3.1
======================================================================
> The URI attribute identifies a data object using a URI-Reference, as
> specified by RFC2396 [URI]. The set of allowed characters for URI
> attributes is the same as for XML, namely [Unicode]. However, some
> Unicode characters are disallowed from URI references including all
> non-ASCII characters and the excluded characters listed in RFC2396
> [URI, section 2.4]. However, the number sign (#), percent sign (%),
> and square bracket characters re-allowed in RFC 2732 [URI-Literal]
> are permitted.


RFC 2396
========

fragment      = *uric
uric          = reserved | unreserved | escaped
reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                "$" | ","
unreserved    = alphanum | mark
mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
                "(" | ")"


--> 

fragment      = *( 
                ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                "$" | "," |
                 alphanum |
                "-" | "_" | "." | "!" | "~" | "*" | "'" |
                "(" | ")" |
                 escaped
                 )

--> 

a..zA..Z0..9-._~!$&'()*+,;=/?:@


XMLDSIG 2002 additionally allowed square brackets ([]) as in RFC 2732.


RFC 2732
========

> This document incudes an update to the generic syntax for Uniform
> Resource Identifiers defined in RFC 2396 [URL].  It defines a syntax
> for IPv6 addresses and allows the use of "[" and "]" within a URI
> explicitly for this reserved purpose.

      reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                    "$" | "," | "[" | "]"

-->

fragment      = *( 
                ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                "$" | "," | "[" | "]" |
                 alphanum |
                "-" | "_" | "." | "!" | "~" | "*" | "'" |
                "(" | ")" |
                 escaped
                 )

--> 

a..zA..Z0..9-._~!$&'()*+,;=/?:@[]


Although the grammar change in RFC 2732 allows "[" and "]" in the
fragment, the prose in RFC 2732 says:

> It defines a syntax
> for IPv6 addresses and allows the use of "[" and "]" within a URI
> explicitly for this reserved purpose.


This indicates that the prose overrules the grammar, which is also
consistent with the current RFC 3986 grammar.


XMLDSIG 2002 allowed number sign (#), percent sign (%)
======================================================
Here the only valid interpretation is that the number sign (#) and the
percent sign (%) are allowed (in their non-percent-encoded form) only to
separate the fragment and to initiate a percent encoding, respectively,
because RFC 2396 says the following:

> The character "#" is excluded
> because it is used to delimit a URI from a fragment identifier in URI
> references (Section 4). The percent character "%" is excluded because
> it is used for the encoding of escaped characters.

This is also consistent with RFC 3986 and the latest draft XMLDSIG 2007.
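
To illustrate this reading, here is a minimal Python sketch. The helper
names split_reference and hash_and_percent_ok are made up for this mail
and are not taken from any specification or library. It treats the first
"#" as the fragment delimiter and accepts "%" only when it starts an
escape of two hex digits.

```python
import re

def split_reference(uri_ref: str) -> tuple[str, str]:
    """Split a URI reference at the first "#" into (uri, fragment)."""
    base, _, fragment = uri_ref.partition("#")
    return base, fragment

def hash_and_percent_ok(uri_ref: str) -> bool:
    """Check only the "#" and "%" rules discussed above (hypothetical helper).

    "#" may appear once, as the delimiter; "%" must start "%HH".
    Other characters are not checked here.
    """
    _, fragment = split_reference(uri_ref)
    if "#" in fragment:
        return False  # a second "#" would have to be written as %23
    # every "%" must be followed by exactly two hex digits
    return re.fullmatch(r"(?:[^%]|%[0-9A-Fa-f]{2})*", fragment) is not None
```

With this, "#a%20b" passes, while "doc.xml#a#b" and "#100%" do not.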


+========+ The interpretation above makes the mention of number sign (#)
|        | and percent sign (%) in 4.3.3.1 redundant.
| BEWARE | Some implementations may have wrongly interpreted 4.3.3.1
|        | to allow number sign (#) and percent sign (%) in their
|        | non-percent-encoded form in the fragment, which however
|        | contradicts the grammar in RFC 2396 and the prose in 
+========+ RFC 2732 and is inconsistent with RFC 3986.

If such a misinterpretation caused the production of signatures 
containing an xpointer like the following 

#xpointer(//*[@authenticate='true']) (cf. EBICS-Standard in Germany)

it does not comply with the grammar in RFC 3986, and the interpretation
of RFC 2732 above does not allow square brackets in the fragment. 

The correct form would be the following:

#xpointer(//*%5B@authenticate='true'%5D)
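
As a rough sketch of how a signer could produce the escaped form, the
following Python snippet uses urllib.parse.quote from the standard
library. The function name escape_fragment and the constant
FRAGMENT_SAFE are assumptions made for this mail. Note that quote also
encodes "%" itself, so this is only suitable for fragments that have not
been percent-encoded yet.

```python
from urllib.parse import quote

# Characters an RFC 3986 fragment may carry unescaped, in addition to the
# unreserved set (letters, digits, "-", ".", "_", "~") that quote() never
# encodes: sub-delims, ":" and "@" from pchar, plus "/" and "?".
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"

def escape_fragment(fragment: str) -> str:
    """Percent-encode everything not allowed in an RFC 3986 fragment.

    Assumes the input is raw (not yet percent-encoded); an existing
    "%5B" would be turned into "%255B".
    """
    return quote(fragment, safe=FRAGMENT_SAFE)

print("#" + escape_fragment("xpointer(//*[@authenticate='true'])"))
# prints: #xpointer(//*%5B@authenticate='true'%5D)
```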


However, as square brackets wrongly appear to be allowed in fragments
according to the RFC 2732 grammar, but are prohibited by the prose in
RFC 2732, we may want to allow implementations to verify such signatures
and advocate against the creation of new signatures that fail to escape
the gen-delims characters of RFC 3986 (unless they really delimit the
components of the URI).
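
One possible way to express this "lenient when verifying, strict when
creating" policy is sketched below in Python. The function fragment_ok
and its creating flag are hypothetical and only illustrate the idea. The
lenient pattern tolerates just the unescaped square brackets discussed
here, not "#" or a stray "%".

```python
import re

# Strict: RFC 3986 fragment = *( pchar / "/" / "?" )
STRICT_FRAGMENT = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)

# Lenient: additionally tolerates unescaped "[" and "]", as permitted by
# the RFC 2732 grammar (assumption: only for verifying legacy signatures).
LENIENT_FRAGMENT = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?\[\]]|%[0-9A-Fa-f]{2})*"
)

def fragment_ok(fragment: str, *, creating: bool) -> bool:
    """Strict check when creating a signature, lenient when verifying."""
    pattern = STRICT_FRAGMENT if creating else LENIENT_FRAGMENT
    return pattern.fullmatch(fragment) is not None
```

The unescaped xpointer above would then be accepted with creating=False
and rejected with creating=True, while the escaped form passes both.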


The text in the current draft correctly follows RFC 3986, but maybe we
would like to add a note pointing to this mail.

======================================================================
XMLDSIG 2007 4.3.3.1 
======================================================================

RFC 3986
========

fragment      = *( pchar / "/" / "?" )
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="

--> 

fragment      = *( pct-encoded / ALPHA / DIGIT / "-" / "." / "_" / "~" 
                 / "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "=" 
                 / ":" / "@" / "/" / "?" )

--> 

a..zA..Z0..9-._~!$&'()*+,;=/?:@


==>

The allowed characters are equal using the interpretation in this mail.

RFC 2396 fragment chars are : a..zA..Z0..9-._~!$&'()*+,;=/?:@
RFC 3986 fragment chars are : a..zA..Z0..9-._~!$&'()*+,;=/?:@
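
The equality can be double-checked mechanically. The following Python
sketch rebuilds both character sets from the grammar productions quoted
in this mail (percent escapes are left out, as in the two lines above).
The variable names are my own.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# RFC 2396: fragment = *uric, uric = reserved | unreserved | escaped
RFC2396_RESERVED = set(";/?:@&=+$,")
RFC2396_MARK = set("-_.!~*'()")
rfc2396_fragment = RFC2396_RESERVED | ALNUM | RFC2396_MARK

# RFC 2732 adds "[" and "]" to reserved
rfc2732_fragment = rfc2396_fragment | set("[]")

# RFC 3986: fragment = *( pchar / "/" / "?" )
RFC3986_UNRESERVED = ALNUM | set("-._~")
RFC3986_SUB_DELIMS = set("!$&'()*+,;=")
rfc3986_fragment = RFC3986_UNRESERVED | RFC3986_SUB_DELIMS | set(":@/?")

print(rfc2396_fragment == rfc3986_fragment)          # prints: True
print(sorted(rfc2732_fragment - rfc3986_fragment))   # prints: ['[', ']']
```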


regards

Konrad Lanz

P.S.: Non-percent-encoded Unicode characters that can live in URI
references inside XML are disjoint from the set of characters in the
RFC 2396 and RFC 3986 grammars and hence do not need to be discussed
here further.

-- 
Konrad Lanz, IAIK/SIC - Graz University of Technology
Inffeldgasse 16a, 8010 Graz, Austria
Tel: +43 316 873 5547
Fax: +43 316 873 5520
https://www.iaik.tugraz.at/aboutus/people/lanz
http://jce.iaik.tugraz.at

Certificate chain (including the EuroPKI root certificate):
https://europki.iaik.at/ca/europki-at/cert_download.htm