Skip to content
This repository has been archived by the owner on Oct 5, 2023. It is now read-only.

Wrong encoding in EOT's 255SHORT for large negative numbers #101

Open
behdad opened this issue Sep 25, 2018 · 6 comments
Open

Wrong encoding in EOT's 255SHORT for large negative numbers #101

behdad opened this issue Sep 25, 2018 · 6 comments

Comments

@behdad
Copy link
Contributor

behdad commented Sep 25, 2018

Code here:
https://github.com/googlei18n/sfntly/blob/master/java/src/com/google/typography/font/tools/conversion/eot/GlyfEncoder.java#L185

The "spec" is here:
https://www.w3.org/Submission/MTX/#id_255SHORT

Note how in the spec, a wordCode-encoded short does NOT get a flipSignCode before it. The short is encoded as is in two's-complement.

Eg. to encode the number -984, currently we encode: FA FD 03 D8

FA – flipsignCode (250 decimal)
FD – WordCode (253 decimal)
03 D8 – Value (984 decimal)

Correct encoding is FD FC 28
FD – WordCode (253 decimal)
FC 28 – Value (-984 decimal)

(Reported to me by Microsoft.)

@behdad
Copy link
Contributor Author

behdad commented Sep 25, 2018

The correct code should do something like this:


const short lowestCode = 253;

short value;

 

if(value > (3 * lowestCode) || value <= -lowestCode)

{

      os.write(253);

      os.write((byte)(absValue >> 8));

      os.write((byte)(absValue & 0xff));

}

else

{

 

  if (value < 0) {

      os.write(250); /* flip the sign */

  value = (short)-value;

  }

  else if (value >= lowestCode) {

    value = (short)(value - lowestCode);

    if (value >= lowestCode) {

      value = (short)(value - lowestCode);

      os.write(254);

    }

    else {

      os.write(255);

    }

  }

  os.write((byte)value);

}

@nyshadhr9
Copy link

There are some values which can be encoded in multiple ways. For Ex:
500 can be encoded as both:
FE (oneMoreByteCode2) 00 (decimal 0)
or
FF(oneMoreByteCode1) FA (decimal 250)

Is there a preference for one or the other? Or are we just concerned about minimizing the number of bytes taken to encode, so either one works.

@behdad
Copy link
Contributor Author

behdad commented Sep 26, 2018

Yeah I see the overlaps. No idea. @taylorb-monotype can you advise, which way does Monotype encode / recommend encoding these? Thanks.

@taylorb-monotype
Copy link

Our code to write 255Shorts is the same as what Behdad has indicated above.
With this code I don't think there is any ambiguity. The choice is made by the code above.
For example, you have the following encodings:

value encoding
0 00
1 01
...
249 f9
250 ff 00
251 ff 01
...
499 ff f9
500 fe 00
501 fe 01
...
749 fe f9
750 fd 02 ee
751 fd 02 ef
...
767 fd 02 ff
768 fd 03 00
769 fd 03 01
...
etc.

@behdad
Copy link
Contributor Author

behdad commented Sep 26, 2018

Our code to write 255Shorts is the same as what Behdad has indicated above.

Thanks for confirming.

With this code I don't think there is any ambiguity.

There is ambiguity in the spec, but yeah sticking to the above scheme works.

The choice is made by the code above.
For example, you have the following encodings:

value encoding
0 00
1 01
...
249 f9
250 ff 00
251 ff 01
...
499 ff f9
500 fe 00
501 fe 01
...
749 fe f9
750 fd 02 ee
751 fd 02 ef
...
767 fd 02 ff
768 fd 03 00
769 fd 03 01
...
etc.

@nyshadhr9
Copy link

Thank you for clarifying :)

My concern with the above approach was that, 750-755 take up 3 bytes whereas they could be encoded in 2 bytes as FE FA - FE FF. But I guess that can be sacrificed for a simpler and consistent solution.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants