Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support new codepage ranges syntax #530 #1061

Merged
merged 1 commit into from
Jan 28, 2025

Conversation

black7375
Copy link
Contributor

Currently the problem is occurring in the parser. #807

File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/parser.py", line 81, in _parse_dict
self._parse_dict_into_object(res, text)
File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/parser.py", line 98, in _parse_dict_into_object
res[name] = d[name]
File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/classes.py", line 336, in setitem
setattr(self, key, value)
File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/classes.py", line 1366, in setValue
value = readIntlist(value)
File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/types.py", line 349, in readIntlist
return _mutate_list(int, src)
File "/Users/etunni/venv/lib/python3.7/site-packages/glyphsLib/types.py", line 344, in _mutate_list
l[i] = fn(l[i])
ValueError: invalid literal for int() with base 10: 'bit 29'

Therefore, pre-classify into codePageRanges and codePageRangesUnsupportedBits,

# Convert code page numbers to OS/2 ulCodePageRange bits. Empty lists stay empty lists.
class OS2CodePageRangesParamHandler(AbstractParamHandler):
def to_glyphs(self, glyphs, ufo):
ufo_codepage_bits = ufo.get_info_value("openTypeOS2CodePageRanges")
if ufo_codepage_bits is None:
return
codepages = []
unsupported_codepage_bits = []
for codepage in ufo_codepage_bits:
if codepage in REVERSE_CODEPAGE_RANGES:
codepages.append(REVERSE_CODEPAGE_RANGES[codepage])
else:
unsupported_codepage_bits.append(codepage)
glyphs.set_custom_value("codePageRanges", codepages)
if unsupported_codepage_bits:
glyphs.set_custom_value(
"codePageRangesUnsupportedBits", unsupported_codepage_bits
)
def to_ufo(self, builder, glyphs, ufo):
codepages = glyphs.get_custom_value("codePageRanges")
if codepages is None:
codepages = glyphs.get_custom_value("openTypeOS2CodePageRanges")
if codepages is None:
return
ufo_codepage_bits = [CODEPAGE_RANGES[v] for v in codepages]
unsupported_codepage_bits = glyphs.get_custom_value(
"codePageRangesUnsupportedBits"
)
if unsupported_codepage_bits:
ufo_codepage_bits.extend(unsupported_codepage_bits)
ufo.set_info_value("openTypeOS2CodePageRanges", sorted(ufo_codepage_bits))

For codePageRangesUnsupportedBits, just use the number with the bit prefix removed.

# https://www.microsoft.com/typography/otspec/os2.htm#cpr
CODEPAGE_RANGES = {
1252: 0,
1250: 1,
1251: 2,
1253: 3,
1254: 4,
1255: 5,
1256: 6,
1257: 7,
1258: 8,
# 9-15: Reserved for Alternate ANSI
874: 16,
932: 17,
936: 18,
949: 19,
950: 20,
1361: 21,
# 22-28: Reserved for Alternate ANSI and OEM
# 29: Macintosh Character Set (US Roman)
# 30: OEM Character Set
# 31: Symbol Character Set
# 32-47: Reserved for OEM
869: 48,
866: 49,
865: 50,
864: 51,
863: 52,
862: 53,
861: 54,
860: 55,
857: 56,
855: 57,
852: 58,
775: 59,
737: 60,
708: 61,
850: 62,
437: 63,
}

Note: The error occurs before handling custom_params, so it should be fixed in the parser.

Copy link

google-cla bot commented Jan 27, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@black7375
Copy link
Contributor Author

Now, I have completed CLA.
image

@anthrotype
Copy link
Member

thanks. Though I don't think the parser should import glyphsLib.builder or special-case those custom parameters every time it's parsing any object from a dict. I think what I wrote here still stands:

#530 (comment)

in order to be able to support parsing the official codePageRanges custom parameter properly, we need to not treat it as an integer list but as a List[Union[int, str]] where str is one of those "bit XX".

We could probably just remove codePageRanges from the GSCustomParameter._CUSTOM_INTLIST_PARAMS:

"codePageRanges",

and then the parser will not attempt to force all elements into integers.

When when the builder is converting to the equivalent UFO fontinfo property, the string values starting with "bit " prefix could be handlded specially.

@black7375
Copy link
Contributor Author

black7375 commented Jan 27, 2025

Thanks for quick replay.

As I tested, if I simply passed the value, a problem occurred in next:

    # glyphsLib/Lib/glyphsLib/builder/custom_params.py
    ufo.set_info_value("openTypeOS2CodePageRanges", sorted(ufo_codepage_bits))

Here, the codepage should be converted to a bit number list.
So, would it be better to pass it through the parser and process it in custom params instead?

@anthrotype
Copy link
Member

Indeed, glyphsLib/builder/custom_params.py is a good place where that conversion can take place. The parser itself can simply parse the raw list as it is, with some integers and some strings. Does that make sense?

@black7375
Copy link
Contributor Author

Thank you
Then, I'll process the conversions in custom_params, reflect Lint, and then force-push.

@black7375
Copy link
Contributor Author

I fixed it and it's definitely better. 👍

@anthrotype
Copy link
Member

GSCustomParameter._CUSTOM_INTLIST_PARAMS also contains a "openTypeOS2CodePageRanges", I think it's a synonym of "codePageRanges" custom parameter so should be treated the same, i.e. remove it from that list so it doesn't get forced into a list of ints

@khaledhosny khaledhosny linked an issue Jan 27, 2025 that may be closed by this pull request
@black7375 black7375 force-pushed the codepage-parse branch 2 times, most recently from 9ecec3a to 1e16762 Compare January 28, 2025 11:38
@anthrotype
Copy link
Member

anthrotype commented Jan 28, 2025

maybe don't make other unrelated changes within the same PR. You can do a follow up for those if you like.

Do you mind also adding a test? Thanks

@black7375
Copy link
Contributor Author

When I run flake8 Lib tests, the following error appears. Is it okay if I don't change the rest?

Lib/glyphsLib/builder/constants.py:265:5: B041 Repeated key-value pair in dictionary literal.
Lib/glyphsLib/builder/custom_params.py:518:25: B904 Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling.  See https://docs.python.org/3/tutorial/errors.html#exception-chaining for details.
Lib/glyphsLib/builder/custom_params.py:519:105: B950 line too long (104 > 80 characters)
Lib/glyphsLib/builder/glyph.py:63:12: E226 missing whitespace around arithmetic operator

Ok, I'll add the test to tests/builder/custom_params_test.py.

@anthrotype
Copy link
Member

When I run flake8 Lib tests, the following error appears

as long as the CI "lint" job passes, I don't really care. You can send a follow up PR to fix those if you like

I'll add the test

thank you

@anthrotype anthrotype merged commit f976349 into googlefonts:main Jan 28, 2025
8 checks passed
@black7375 black7375 mentioned this pull request Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle new codepage ranges syntax
2 participants