The currency symbol/code is not always properly spaced #65

Open
bojanz opened this issue May 8, 2018 · 1 comment

Comments

@bojanz
Contributor

bojanz commented May 8, 2018

We currently rely on the pattern to give us the complete layout of the final number.
But CLDR has additional rules that say when a space should be inserted around a currency symbol/code, which look like this:

"currencySpacing": {
  "beforeCurrency": {
    "currencyMatch": "[:^S:]",
    "surroundingMatch": "[:digit:]",
    "insertBetween": " "
  },
  "afterCurrency": {
    "currencyMatch": "[:^S:]",
    "surroundingMatch": "[:digit:]",
    "insertBetween": " "
  }
},

Yes, that's quite confusing, which is why I missed it previously.

Looks like this is a good opportunity to check how our formatting logic compares with the ICU4J one.

Relevant links:
angular/angular#20708
andyearnshaw/Intl.js#221

@bojanz
Contributor Author

bojanz commented Mar 23, 2020

I analyzed the dataset. All number formats have the same currencySpacing data. That means we can avoid parsing it, and just implement the relevant logic directly in the number formatter.

Also note that the beforeCurrency and afterCurrency rules are identical to each other.
What remains is a single rule:

              "currencyMatch": "[:^S:]",
              "surroundingMatch": "[:digit:]",
              "insertBetween": " "

Translated into English, that is "By default a space is automatically added between letters in a currency symbol and adjacent numbers."

Quoting https://unicode.org/reports/tr35/tr35-numbers.html for a source:

This element controls whether additional characters are inserted on the boundary between the symbol and the pattern. For example, with the above currencySpacing, inserting the symbol "US$" into the pattern "#,##0.00¤" would result in an extra no-break space inserted before the symbol, for example, "#,##0.00 US$". The beforeCurrency element governs this case, since we are looking before the "¤" symbol. The currencyMatch is positive, since the "U" in "US$" is at the start of the currency symbol being substituted. The surroundingMatch is positive, since the character just before the "¤" will be a digit. Because these two conditions are true, the insertion is made.

Conversely, look at the pattern "¤#,##0.00" with the symbol "US$". In this case, there is no insertion; the result is simply "US$#,##0.00". The afterCurrency element governs this case, since we are looking after the "¤" symbol. The surroundingMatch is positive, since the character just after the "¤" will be a digit. However, the currencyMatch is not positive, since the "$" in "US$" is at the end of the currency symbol being substituted. So the insertion is not made.

That also gives us a good example for tests.
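
As a rough illustration of implementing the rule directly instead of parsing currencySpacing, here is a minimal Python sketch. It is not this library's code: the function name `apply_currency_spacing`, the helpers, and the choice to work on an already formatted number that still contains the "¤" placeholder are all assumptions made for the example. It reads `[:^S:]` as "not in a Unicode Symbol category" and `[:digit:]` as general category Nd, and it inserts a no-break space, following the TR35 wording quoted above.

```python
import unicodedata

# Assumption: the inserted character is a no-break space, as in the TR35 text.
NBSP = "\u00a0"


def is_symbol(ch: str) -> bool:
    # [:S:] covers the general categories Sm, Sc, Sk and So.
    return unicodedata.category(ch).startswith("S")


def is_digit(ch: str) -> bool:
    # [:digit:] is treated here as general category Nd (decimal digit).
    return unicodedata.category(ch) == "Nd"


def apply_currency_spacing(formatted: str, symbol: str, placeholder: str = "¤") -> str:
    """Replace the placeholder with the symbol, adding spacing where the rule applies.

    Hypothetical helper: `formatted` is a number that has already been
    formatted but still contains the currency placeholder, e.g. "1,234.56¤".
    """
    idx = formatted.find(placeholder)
    if idx == -1 or not symbol:
        return formatted
    before = formatted[:idx]
    after = formatted[idx + len(placeholder):]
    prefix = suffix = ""
    # beforeCurrency: digit just before the placeholder, and the symbol
    # starts with a non-symbol character (e.g. the "U" in "US$").
    if before and is_digit(before[-1]) and not is_symbol(symbol[0]):
        prefix = NBSP
    # afterCurrency: digit just after the placeholder, and the symbol
    # ends with a non-symbol character (e.g. the "D" in "USD").
    if after and is_digit(after[0]) and not is_symbol(symbol[-1]):
        suffix = NBSP
    return before + prefix + symbol + suffix + after


# The two cases from the TR35 quote, applied to a formatted number.
print(repr(apply_currency_spacing("1,234.56¤", "US$")))
print(repr(apply_currency_spacing("¤1,234.56", "US$")))
```

With "US$" the first call gets a no-break space before the symbol, while the second gets none because the symbol ends in "$". A symbol such as "USD" would get the space in both positions.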
