- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Wed, 06 Sep 2006 03:47:10 +0200
- To: public-appformats@w3.org
Dear Web Application Formats Working Group,
http://www.w3.org/TR/2006/WD-web-forms-2-20060821/ section 5.3 item 4
is:
Control names and values are escaped. Space characters are replaced by
"+" (U+002B), and other non-alphanumeric characters are encoded in the
submission character encoding and each resulting byte is replaced by
"%HH", a percent sign (U+0025) and two uppercase hexadecimal digits
representing the value of the byte.
This text is rather unclear and incorrect; it does not define what
non-alphanumeric characters are (and whatever it means, it's incorrect),
the character encoding is applied to the whole string, not just
non-alphanumeric characters, and %hh encoding is applied based on what
the bytes are, not what the characters were.
Consider the following cases:
* encoding is UTF-8 and the value is "_", implementations should not
apply %hh encoding to it even though it's not alphanumeric
* encoding is UTF-7 and the value is "ö", the byte sequence would be
+APY- and implementations should apply %hh escaping only to the +,
not to the whole thing or nothing (depending on whether "ö" is
considered alphanumeric)
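The two cases above can be sketched in Python as follows. This is only
an illustration of the byte-based behaviour described here, not text
from the draft or from any implementation: the function name
escape_form_value and the SAFE_BYTES set are hypothetical, and the set
of bytes left unescaped is assumed to be the RFC 3986 "unreserved"
characters, which may not match what browsers actually do. The value in
the UTF-7 case is U+00F6 ("ö"), which is what +APY- decodes to.

```python
# Assumption: bytes left unescaped are the RFC 3986 "unreserved" set.
# The exact set used by implementations is not known here.
SAFE_BYTES = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def escape_form_value(value: str, encoding: str) -> str:
    """Encode the whole string first, then escape byte by byte."""
    data = value.encode(encoding)  # encoding applies to the entire value
    out = []
    for byte in data:
        if byte == 0x20:
            out.append("+")              # space byte becomes "+"
        elif byte in SAFE_BYTES:
            out.append(chr(byte))        # safe byte is passed through
        else:
            out.append(f"%{byte:02X}")   # any other byte becomes %HH
    return "".join(out)


# Case 1: "_" is not alphanumeric, but its UTF-8 byte is left alone.
print(escape_form_value("_", "utf-8"))
# Case 2: U+00F6 in UTF-7 is the bytes +APY-; only the "+" is escaped.
print(escape_form_value("\u00f6", "utf-7"))
```

The point of the sketch is that the decision to escape is made per byte
after encoding, so whether the original character counts as
alphanumeric never enters into it.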
Please change the draft in a way that properly reflects the above and
current implementations. I don't know the exact set of bytes that need
to have %hh encoding applied, but I suspect the set is similar to that
of characters considered reserved in the query string as per RFC 3986.
regards,
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Wednesday, 6 September 2006 01:54:07 UTC