WF2: application/x-www-form-urlencoded encoding ill-defined

Dear Web Application Formats Working Group,

  http://www.w3.org/TR/2006/WD-web-forms-2-20060821/ section 5.3 item 4
is:

  Control names and values are escaped. Space characters are replaced by
  "+" (U+002B), and other non-alphanumeric characters are encoded in the
  submission character encoding and each resulting byte is replaced by
  "%HH", a percent sign (U+0025) and two uppercase hexadecimal digits
  representing the value of the byte.

This text is unclear and incorrect. It does not define which characters
count as non-alphanumeric (and whatever definition is intended, the rule
is wrong); the character encoding is applied to the whole string, not
just to the non-alphanumeric characters; and %hh encoding is applied
based on what the bytes are, not what the characters were.

Consider the following cases:

  * encoding is UTF-8 and the value is "_", implementations should not
    apply %hh encoding to it even though it's not alphanumeric

  * encoding is UTF-7 and the value is "ö", the byte sequence would be
    +APY- and implementations should apply %hh escaping only to the +,
    not to the whole thing or nothing (depending on whether "ö" is
    considered alphanumeric)
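
The two cases can be checked with a small Python sketch that first
encodes the whole string and only then escapes individual bytes. The
function name form_escape and the SAFE_BYTES set are hypothetical; the
set of bytes left unescaped is assumed here to be the RFC 3986
unreserved characters, which is only a guess at what implementations
actually do.

```python
# Assumption: bytes left unescaped are the RFC 3986 "unreserved" characters.
# The exact set used by implementations is not established in this message.
SAFE_BYTES = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def form_escape(text, encoding):
    """Encode the whole string, then escape based on the resulting bytes."""
    out = []
    for byte in text.encode(encoding):
        if byte == 0x20:
            # A space byte is replaced by "+".
            out.append("+")
        elif byte in SAFE_BYTES:
            # Safe bytes are passed through unchanged.
            out.append(chr(byte))
        else:
            # Every other byte becomes "%HH" with uppercase hex digits.
            out.append("%{:02X}".format(byte))
    return "".join(out)


print(form_escape("_", "utf-8"))
print(form_escape("\u00f6", "utf-7"))
```

Under these assumptions "_" in UTF-8 stays "_", and "ö" in UTF-7 becomes
"%2BAPY-": only the "+" byte is escaped, whether or not "ö" is regarded
as alphanumeric.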

Please change the draft in a way that properly reflects the above and
current implementations. I don't know the exact set of bytes that need
to have %hh encoding applied, but I suspect the set is similar to that
of characters considered reserved in the query string as per RFC 3986.

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Wednesday, 6 September 2006 01:54:07 UTC