quoting strings composed of digits #98
Comments
I'm not at my desk to attempt, but use yaml.load() on a document that is formatted as you desire with an example int-string inside single quotes. Then view the python object to see how yaml formats it. That should be your answer. I.E. reverse engineer what you're trying to do. |
nope not going to work
|
it seems to convert all strings starting with a zero to int, but other ints stored as a string it keeps the quote marks around it. This is quite annoying |
A leading zero in YAML 1.1 indicates an octal number, which can only have the digits 0-7.
So it's safe to dump a leading-zero string that contains an 8 or 9 as a plain scalar. The bug seems to be in the library that interprets it as an integer. |
If I read the related issues cloudtools/troposphere#994 and awslabs/aws-cfn-template-flip#41 correctly, the issues have been resolved there, so I guess this can be closed. |
The problem is, a leading 0 without quotes indicates octal. But PyYAML is converting the string "012345" to an octal, even though it was a string. In order to get around this I had to make a custom dumper and specifically not do anything with strings starting with a zero. Similar to https://github.com/awslabs/aws-cfn-template-flip/pull/43/files |
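A minimal sketch of the kind of custom dumper described above, assuming PyYAML is installed. The class name `LeadingZeroDumper`, the function `represent_str`, and the sample account ID are hypothetical and are not taken from the linked PR; the rule "quote every all-digit string that starts with 0" is only one possible choice.

```python
import yaml


class LeadingZeroDumper(yaml.SafeDumper):
    """Hypothetical dumper that always quotes all-digit strings starting with 0."""


def represent_str(dumper, data):
    # Force single quotes for identifiers such as "012389"; every other
    # string falls through to PyYAML's normal style selection.
    style = "'" if data.isdigit() and data.startswith("0") else None
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style=style)


LeadingZeroDumper.add_representer(str, represent_str)

doc = {"account_id": "012389"}  # made-up identifier containing 8 and 9
default_out = yaml.safe_dump(doc)
quoted_out = yaml.dump(doc, Dumper=LeadingZeroDumper)
print(default_out, quoted_out, sep="")
```

Registering the representer on a subclass keeps the change local, so other code that calls `yaml.safe_dump` keeps the stock behaviour.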
@justin8 if the string Can you please post the output of the following script?
I get:
As you can see, the quotes are omitted only if it can't be interpreted as an octal because 8 is not a valid digit for an octal number, as I explained above. |
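The octal rule can be checked without any YAML library. The sketch below transcribes the integer forms of the YAML 1.1 int type into one regular expression; the name `resolves_as_int_11` is hypothetical, and the pattern is an illustration of the rule rather than a copy of any library's resolver.

```python
import re

# Integer forms of the YAML 1.1 int type, transcribed for illustration.
YAML11_INT = re.compile(r"""
    [-+]?0b[0-1_]+                        # binary
  | [-+]?0[0-7_]+                         # octal, leading zero then digits 0-7 only
  | [-+]?(?:0|[1-9][0-9_]*)               # decimal, no leading zero
  | [-+]?0x[0-9a-fA-F_]+                  # hexadecimal
  | [-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+    # sexagesimal
""", re.VERBOSE)


def resolves_as_int_11(text):
    """Return True if a plain scalar would be read as an int under these rules."""
    return YAML11_INT.fullmatch(text) is not None


for sample in ("012345", "012389", "123"):
    print(sample, resolves_as_int_11(sample))
```

Under these rules a string like "012345" has to be quoted on output, while "012389" is not a valid integer at all, so a YAML 1.1 dumper has no reason to quote it.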
I would have to check on my other computer if my sample data had 8s or 9s in them. It may well have. But how would I store a number that is an identifier? In this case it was AWS account IDs, which can start with zero, but converting to octal or an int is not desirable since the leading zero is a part of the descriptor; and hence why it is being stored as a string, until PyYAML decides to convert it to a different format. |
Actually, looking at the regex for int/float in YAML 1.2, it allows leading zeros.
Sorry I was forgetting about these. PyYAML only implements YAML 1.1, so it's behaving correctly. Since 1.2 allows leading zeros, it would maybe be a good idea for PyYAML to quote such strings, as it wouldn't break anything and would still be 1.1 compatible. |
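For comparison, here is a sketch of the integer patterns from the YAML 1.2 core schema, which accept leading zeros in decimal numbers. The name `resolves_as_int_12` is hypothetical, and the sample value is made up.

```python
import re

# Integer forms of the YAML 1.2 core schema: decimal, 0o octal, 0x hex.
YAML12_CORE_INT = re.compile(r"[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+")


def resolves_as_int_12(text):
    """Return True if a plain scalar would be read as an int by a 1.2 core loader."""
    return YAML12_CORE_INT.fullmatch(text) is not None


sample = "012389"
print(resolves_as_int_12(sample))  # decimal with a leading zero is accepted
print(int(sample))                 # the leading zero is gone once it is a number
```

A dumper that wanted to be forwards-compatible could quote any string for which a check like this returns True, in addition to its existing 1.1 checks.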
Ah, this would explain it. At least it's fixed in the next version of the spec. |
Any word on this? The 1.2 spec is quite old. If we're not going to upgrade, then we can at least be forwards-compatible right? |
We should in general be moving pyyaml and libyaml closer towards 1.2 (in ways that don't conflict with 1.3 plans). We should probably move forward with a fix here. Any takers? :) Don't forget we'll need to make sure libyaml is in sync. |
I found also a bug which is sort of related to this.
Successfully installed pyyaml-3.13
/ # python
Python 3.7.0 (default, Sep 5 2018, 03:33:35)
[GCC 6.4.0] on linux
>>> import yaml
>>> print(yaml.dump({
... 'int_123': 123,
... 'int_123e1': 123e1,
... 'str_123': '123',
... 'str_123e1': '123e1',
... }, default_flow_style=False))
output:
int_123: 123
int_123e1: 1230.0
str_123: '123'
str_123e1: 123e1
The value of that last one should be in quotes because it is converted into an int/float. That did not happen to the '123' string. |
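The exponent case looks like the same 1.1 versus 1.2 gap. The sketch below compares the decimal float form of the YAML 1.1 float type with the float pattern of the YAML 1.2 core schema; both patterns are transcribed for illustration, the function names are hypothetical, and the 1.1 pattern deliberately leaves out the sexagesimal, infinity, and NaN forms.

```python
import re

# Decimal float form of YAML 1.1: a "." is required, and so is a sign in the exponent.
YAML11_FLOAT_DECIMAL = re.compile(
    r"[-+]?(?:[0-9][0-9_]*)?\.[0-9.]*(?:[eE][-+][0-9]+)?"
)

# Float form of the YAML 1.2 core schema: the "." and the exponent sign are optional.
YAML12_CORE_FLOAT = re.compile(
    r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?"
)


def is_float_11(text):
    return YAML11_FLOAT_DECIMAL.fullmatch(text) is not None


def is_float_12(text):
    return YAML12_CORE_FLOAT.fullmatch(text) is not None


for sample in ("123e1", "1230.0"):
    print(sample, is_float_11(sample), is_float_12(sample))
```

If these patterns are right, "123e1" is an ordinary string to a 1.1 dumper, which may be why it is written without quotes, yet a 1.2 loader would read it back as a float.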
I managed to get around this issue by following @justin8 's PR: https://github.com/awslabs/aws-cfn-template-flip/pull/43/files
Thanks @justin8 ! |
@bxnxiong I forgot about this thread. I've actually found a much nicer way to get around this. The answer was to stop using pyyaml. The below code snippets will work exactly like pyyaml, but with the YAML 1.2 spec, so you don't have to mess with this stuff:
there's no loads/load difference, it reads strings or file-like objects from the same function, but otherwise behaves almost identically aside from complying with the modern standard. |
Same when it comes to generating object keys: |
There is an issue with the YAML 1.1 spec which causes some strings to be interpreted incorrectly as integers (see yaml/pyyaml#98), so we'll use the json template to try and prevent this Signed-off-by: Weston Steimel <[email protected]>
+1 -- I vote to fix this Or should I say +"01" |
+1 I had to do this:
:-| |
this is still a problem(or a bug)
its really annoying |
same issue... thx @justin8 ! I had to switch to ruamel also :| |
It's an issue regarding YAML 1.1 and 1.2. But there won't be any support for 1.2 before @ingydotnet gives feedback on #116 |
Btw, I just tested ruamel.yaml (0.16.10), and it resolved the following items as numbers/timestamps/etc., although that doesn't match the spec. They should all be loaded as strings when using the default loader:
Edit: I tested with this script: https://github.com/yaml/yaml-runtimes/blob/master/docker/python/testers/py-ruamel-py |
+1 |
+1 |
+1 |
You can use ruamel.yaml instead it is better documented than pyyaml: https://pypi.org/project/ruamel.yaml/ And with this library, you can at least dump digits without quotes and with no extra configuration |
Except that ruamel.yaml is wrong in some other cases (tested with 0.17.20) as I wrote here: |
Just felt here and got quite surprised such a problem has never been solved by the "most common YAML implementation in Python". My scenario is not with octal numbers (not mentioned in the title, but later discovered by OP) but I need to force an integer to be represented as a string (simply because it's a request from the application reading the final YAML, ahem Docker Compose's interpretation of The solution found was to work around that with a "proprietary" tag syntax based off this previous comment.
|
You can use https://pypi.org/project/yamlcore/ on top of PyYAML for YAML 1.2 support:
|
For anyone else having this issue - I submitted the following PR. I'm not sure I fully understand why it can't be fixed since some ints with leading zeroes load and dump fine and others do not- see attached test-yaml-dump-str.txt, it seems like it fails if the number contains/ends in 8 or 9. If it doesn't get fixed and anyone else is looking for an easy solution, this is what's working for me for now.
|
There is nothing to be fixed, as it ain't broken. If for some reason you need to have quotes around a
if your other loader has a bug, then monkeypatching the PyYAML dumper like in #98 (comment) is probably the best idea. |
Hi,
Consider the following program:
The value for the key 'sampleName' is composed of digits, but is a string. When run the program produces the following output:
This leads to problems downstream because the sequence of digits gets interpreted as an integer with the effect that when reserialized, the leading zero is lost.
Is this actually a bug? If not, how do I arrange for the value to be quoted?
Version information: