Byte position -- wrong for larger than buffer size #26

Open
zerothi opened this issue Apr 30, 2024 · 0 comments

zerothi commented Apr 30, 2024

raise Exception(u'Unexpected \\{} in backslash encoding! Position {}'.format(c.decode('utf-8'), self.readbuf_read - 1))

The error messages are quite handy for locating problems in faulty JSON files. However, I have trouble deciphering the actual byte position they report.

I might be wrong, but I put together this small test example:

import bigjson as json

with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)

json.FileReader._READBUF_CHUNK_SIZE = 10
with open("test.json", 'rb') as f:
    try:
        j = json.load(f)["Jobs"]
    except Exception as e:
        print(e)

Running this on a faulty JSON file resulted in two different byte positions for the same error.

Here is a snippet of the JSON file:

{
"Jobs":[
{
"WallTeff":49.88
},
{
"WallTlimit":inf
},
]
}

I get this when running the test script on the faulty file:

Unexpected bytes! Value '}' Position 48
Unexpected bytes! Value '}' Position 3

I am not sure how to fix bigjson, but I would have expected both errors to report the same byte location. I think the cause is that the error messages do not use _tell_read_pos, but I am not fully sure.
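
To illustrate what I suspect is happening, here is a minimal sketch of a chunked reader. It is not bigjson's code: ChunkedReader, buf_start, buf_pos, tell and locate are hypothetical names made up for this example. The assumption is that an index into the current buffer is only meaningful once the offset of the buffer's first byte is added to it.

```python
import io


class ChunkedReader:
    """Toy chunked reader; NOT bigjson's FileReader, all names are hypothetical."""

    def __init__(self, stream, chunk_size):
        self.stream = stream
        self.chunk_size = chunk_size
        self.buf = b""
        self.buf_start = 0  # absolute offset of buf[0] in the stream
        self.buf_pos = 0    # index of the next unread byte within buf

    def read_byte(self):
        # Refill the buffer when it is exhausted, remembering how many
        # bytes of the stream the previous buffers already covered.
        if self.buf_pos >= len(self.buf):
            self.buf_start += len(self.buf)
            self.buf = self.stream.read(self.chunk_size)
            self.buf_pos = 0
            if not self.buf:
                return None  # end of stream
        byte = self.buf[self.buf_pos:self.buf_pos + 1]
        self.buf_pos += 1
        return byte

    def tell(self):
        # Absolute position of the next unread byte.
        return self.buf_start + self.buf_pos


def locate(data, target, chunk_size):
    """Return (buffer-relative index, absolute index) of the first `target` byte."""
    reader = ChunkedReader(io.BytesIO(data), chunk_size)
    while True:
        byte = reader.read_byte()
        if byte is None:
            return None
        if byte == target:
            return reader.buf_pos - 1, reader.tell() - 1


data = b'{"a": 1, "b": inf}'
print(locate(data, b"i", 10))    # buffer-relative index depends on the chunk size
print(locate(data, b"i", 1024))  # absolute index does not
```

In this sketch the buffer-relative index changes with the chunk size while the absolute index stays the same, which matches the symptom above. If bigjson's error messages used an absolute position in the same way (perhaps through _tell_read_pos), I would expect both runs to print the same location.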
