Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent conversion of strings from json to yaml #41

Closed
loghen41 opened this issue Jan 9, 2018 · 11 comments
Closed

Inconsistent conversion of strings from json to yaml #41

loghen41 opened this issue Jan 9, 2018 · 11 comments

Comments

@loghen41
Copy link

loghen41 commented Jan 9, 2018

I am converting a document from json to yaml as part of a CloudFormation Template, and am noticing an odd error where some Id's that are marked as strings are being converted to strings, and other times not.

Here's a json snippet I'm working with right now which are the mappings for some of the Generic Elastic Load Balancer ID's for AWS:

        "Regions": {
            "us-east-1": {
                "ELBID": "127311923021",
                "Name": "ue1"
            },
            "us-east-2": {
                "ELBID": "033677994240",
                "Name": "ue2"
            },
            "us-west-1": {
                "ELBID": "027434742980",
                "Name": "uw1"
            },
            "us-west-2": {
                "ELBID": "797873946194",
                "Name": "uw2"
            }
        }
    }

And This is the resulting yaml I'm getting after calling to_yaml:

Mappings:
  Regions:
    us-east-1:
      ELBID: '127311923021'
      Name: ue1
    us-east-2:
      ELBID: 033677994240
      Name: ue2
    us-west-1:
      ELBID: 027434742980
      Name: uw1
    us-west-2:
      ELBID: '797873946194'
      Name: uw2

Strangely enough, any number beginning with 0 is converted, but the ones beginning with other numbers do not. I'm not sure what the expected behavior should be in this case, (either fully converted or not) but having it half and half is inconsistent, and I would believe is a bug.

Currently I'm having errors with using this yaml with sceptre/CloudFormation due to some of the Elastic Load Balancer ID's not being strings.

@stilvoid
Copy link
Member

Ok, this is a very interesting bug. Yaml actually treats any unquoted string of digits that begins with a zero as an octal number so this will produce strange results.

I will look into this shortly.

@loghen41
Copy link
Author

That makes a lot more sense. I look forward to your solution. BTW, I fixed my multi-string line comments in my previous comment. Sorry for any frustration that may have caused you.

@jk2l
Copy link

jk2l commented Mar 5, 2018

+1 for me. trying to create CF with Peer connection with an account ID with leading zero. and failed. also can't use ARN in this case as the parameter require account ID

@jk2l
Copy link

jk2l commented Mar 5, 2018

Trying to find out how to fix this. and found this funny issue


>>> print(yaml.dump({'testing': {'testing': "033677"}}, default_flow_style=False))
testing:
  testing: '033677'

>>> print(yaml.dump({'testing': {'testing': "033677994240"}}, default_flow_style=False))
testing:
  testing: 033677994240

it is only break if the string is long enough...

@jk2l
Copy link

jk2l commented Mar 5, 2018

related to yaml/pyyaml#98

@stilvoid
Copy link
Member

stilvoid commented Mar 7, 2018

Ok, so I've been digging in to this and pyyaml's behaviour is actually perfectly fine. It only omits quotes when the number could not parse as a valid octal number - for example, if it contains a digit higher than '7'.

>>> print(yaml.dump({"test with an 8": "012345678"}))
{test with an 8: 012345678}

>>> print(yaml.dump({"test with no 8": "012345677"}))
{test with no 8: '012345677'}

Which pyyaml then correctly loads back in as strings in both cases:

>>> print(yaml.load("{test with an 8: 012345678}"))
{'test with an 8': '012345678'}

>>> print(yaml.load("{test with no 8: '012345677'}"))
{'test with no 8': '012345677'}

So given that the output from pyyaml is correct, valid yaml with no ambiguity (a sequence of digits that begins with '0' but contains invalid octal digits, e.g. '8' or '9' is correctly interpreted as a string)...

where is the issue seen?

@markpeek
Copy link
Contributor

@stilvoid take a look at this issue cloudtools/troposphere#994 where an AWS account ID run through cfn-flip to yaml will have the leading 0 truncated causing an illegal account ID in CloudFormation.

@stilvoid
Copy link
Member

Ok, so it looks like a difference in implementation of the yaml spec. pyyaml (used by cfn-flip) treats invalid sequences of digits (e.g. 012345678 is not valid octal) as strings whereas CloudFormation tries to interpret them as numbers and, naturally, fails. This is probably better for safety so I'm going to try a fix by quoting all digit sequences from cfn-flip. This should be ok with CloudFormation but I'll test it.

@stilvoid
Copy link
Member

Ok, I think I've fixed this in the branch issue-41

I'll try it out with a few templates to make sure.

@perlpunk
Copy link

@stilvoid
In my opinion the YAML spec is pretty clear on that:
http://yaml.org/type/int.html (Spec for YAML 1.1 int)

The plain scalar 012348 does not match any of the regular expressions for int, therefor it is interpreted as a string. That's why the string "012348" can be dumped without quotes. A YAML processor interpreting this as an octal number has a bug and should be fixed, IMHO.

@perlpunk
Copy link

Actually, see the discussion in yaml/pyyaml#98
It might be a good idea to make PyYAML compatible with 1.2 in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants