Character encodings #7

ezzatron · 2015-12-08T21:59:37Z

This library currently handles conversion between strings and sequences of code points, in an attempt to provide an intuitive, easy-to-use API. It may be a better idea to require that the code point conversion take place before input reaches this library.
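As a rough illustration of what "conversion before input" could look like on the caller's side, here is a minimal TypeScript sketch. The helper names `toCodePoints` and `fromCodePoints` are hypothetical and are not part of this library's API; the language choice is also an assumption made only for the example.

```typescript
// Hypothetical caller-side helpers; these names are illustrative only.

/** Split a JavaScript (UTF-16) string into Unicode code points. */
function toCodePoints(input: string): number[] {
  // String iteration walks code points, so a surrogate pair yields one entry.
  // A lone surrogate is passed through as its own value and is not rejected here.
  return Array.from(input, (ch) => ch.codePointAt(0) as number);
}

/** Rebuild a string from code points once processing is done. */
function fromCodePoints(codePoints: readonly number[]): string {
  let out = "";
  for (const cp of codePoints) {
    // String.fromCodePoint throws a RangeError for values outside 0..0x10FFFF.
    out += String.fromCodePoint(cp);
  }
  return out;
}
```

With helpers like these, a caller would hand the library an array such as `toCodePoints("é😀")` and decide for itself how the original bytes were decoded.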

This would allow for systems using a third-party UTF-8 implementation such as utf8. It also neatly avoids the issue of which encodings PRECIS deems valid. For example, from RFC 7613:

An entity that prepares a string according to this profile MUST first
map fullwidth and halfwidth characters to their decomposition
mappings (see Unicode Standard Annex #11 [UAX11]). This is necessary
because the PRECIS "HasCompat" category specified in Section 9.17 of
[RFC7564] would otherwise forbid fullwidth and halfwidth characters.
After applying this width-mapping rule, the entity then MUST ensure
that the string consists only of Unicode code points that conform to
the PRECIS IdentifierClass defined in Section 4.2 of [RFC7564]. In
addition, the entity then MUST encode the string as UTF-8 [RFC3629].

(emphasis mine)
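If the library accepted and returned code points only, the final UTF-8 step quoted above would fall to the caller. A minimal sketch of that step, assuming a runtime that provides the standard `TextEncoder` global (modern browsers and Node.js); `encodeUtf8` is a hypothetical name, not an existing function of this library:

```typescript
// Hypothetical helper: encode already-validated code points as UTF-8 bytes.
function encodeUtf8(codePoints: readonly number[]): Uint8Array {
  let text = "";
  for (const cp of codePoints) {
    // RFC 3629 forbids encoding surrogate code points; TextEncoder would
    // silently replace them with U+FFFD, so reject them explicitly instead.
    if (cp >= 0xd800 && cp <= 0xdfff) {
      throw new RangeError("surrogate code point cannot be encoded as UTF-8");
    }
    // Throws a RangeError for values outside 0..0x10FFFF.
    text += String.fromCodePoint(cp);
  }
  // TextEncoder always produces UTF-8.
  return new TextEncoder().encode(text);
}
```

A caller using a third-party UTF-8 implementation would swap the last line for that implementation's encoder; the rest of the pipeline would not need to change.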

See discussion under #1 for more information.

ezzatron mentioned this issue Dec 8, 2015

Code review #1

Open

ezzatron mentioned this issue May 4, 2016

Directionality check for Username profiles #8

Open
