Parsing bug in oscar.pyx #45

Open
cbogart opened this issue Apr 28, 2021 · 3 comments
@cbogart
Contributor

cbogart commented Apr 28, 2021

New oscar.pyx changes may be triggering some parsing bugs. The following code, run on da4:

import oscar

for commit in oscar.Project("buttermilk-crypto_b2"):
    print(commit)

Produces a few correct responses, then a parse error:

0fb93e4f750d75f6c1ccaf1f1e43b5680b82b61f
12b14fb30e7483edaf87ca6f3c4f97d836ea801f
14e609896cbe0819cd9f80ddd6902a58d4cfda40
1aa03a789bdb2dda02a196946f44d63e422880fc
1c7d68857bfc439deb43875db253e97437c6f358
221e8b6571dad078a2c086b13b6e774b7156e519
224d1fa0f9b68801418d57e0e591b28aace1ca79
247e29b7c1bd0d3d6557c338249133be49a77398
27e5cb3d5e4e3f6d67353d5825d3174894a0238d
304bd7c5fd53ea52ff17a800b623e88ac823ecfb
356d25daa208e992d10045f8b19e9dcc4ef886e0
Traceback (most recent call last):
  File "demo_bug.py", line 3, in <module>
    for commit in oscar.Project("buttermilk-crypto_b2"):
  File "oscar.pyx", line 1355, in __iter__
  File "oscar.pyx", line 949, in oscar.Commit.__getattr__
  File "oscar.pyx", line 1056, in oscar.Commit._parse
ValueError: need more than 1 value to unpack

The problem appears to be at line 1056 of oscar.pyx, in Commit._parse. The split there does not always yield two parts, and it looks like self.data may be an empty string in some cases:

    def _parse(self):
        self.header, self.full_message = self.data.split(b'\n\n', 1)
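
For illustration, here is a minimal sketch of a split that tolerates empty data. The helper name split_commit is hypothetical and is not part of oscar.pyx; it only shows that bytes.partition avoids the unpacking error, since b"".split(b"\n\n", 1) returns a single-element list.

```python
def split_commit(data):
    """Split raw commit bytes into (header, message) without raising.

    Hypothetical helper, not taken from oscar.pyx.
    """
    # bytes.partition always returns a 3-tuple, so an empty or
    # header-only payload cannot trigger the unpacking ValueError.
    header, _sep, message = data.partition(b"\n\n")
    return header, message
```

With this, empty data yields two empty byte strings, and only the first blank line is treated as the separator, which matches the maxsplit=1 behaviour of the original line.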
@cbogart
Contributor Author

cbogart commented Apr 28, 2021

Here's a more precise example:

import oscar

c1 = oscar.Commit("356d25daa208e992d10045f8b19e9dcc4ef886e0")
print(c1.author)
c2 = oscar.Commit("39fdb9fa930fcac4a182b3e4d29190c0f436c925")
print(c2.author)

prints

b'buttermilk-crypto <[email protected]>'
Traceback (most recent call last):
  File "demo_bug.py", line 6, in <module>
    print(c2.author)
  File "oscar.pyx", line 949, in oscar.Commit.__getattr__
  File "oscar.pyx", line 1056, in oscar.Commit._parse
ValueError: need more than 1 value to unpack

because 39fdb9fa930fcac4a182b3e4d29190c0f436c925 is not present in /fast/All.sha1c. So it could be a dataset problem, or maybe oscar.pyx needs to handle it more gracefully.

(BTW, the commit is also not on GitHub:

https://api.github.com/repos/buttermilk-crypto/b2/commits/39fdb9fa930fcac4a182b3e4d29190c0f436c925

does not appear, even though it is returned by echo "buttermilk-crypto_b2" | ~/lookup/getValues p2c)
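
Until this is handled inside oscar.pyx, a caller-side workaround might look like the sketch below. It assumes the failure keeps surfacing as a ValueError, as in the traceback above; safe_attr is a hypothetical helper name.

```python
def safe_attr(obj, name, default=None):
    """Return getattr(obj, name), or default if attribute parsing fails."""
    try:
        return getattr(obj, name)
    except ValueError:
        # Assumption: missing commit content shows up as the ValueError
        # raised while unpacking the header/message split.
        return default
```

For example, safe_attr(oscar.Commit("39fdb9fa930fcac4a182b3e4d29190c0f436c925"), "author", b"") would give b"" rather than a traceback. This does not help when iterating over oscar.Project, because there the error is raised inside __iter__ itself.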

@audrism
Contributor

audrism commented Apr 28, 2021

The content of some commits associated with a project may be lost, and the current repo may no longer contain them, so there is no way to recover it.

Yes, instantiating commits for which only the sha1 is available needs to be handled more gracefully: such commits will always exist, as there is often no way to recover lost content.

Doing this may need nontrivial design decisions: do you have a suggestion?

@user2589
Member

I'm going to look into this over the weekend. The goal is to catch these cases, use information from the other relations we have to retrieve e.g. parents and author, and return reasonable defaults (blank strings / special values) for the rest.
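
A rough sketch of that idea, under stated assumptions: commit_fields, DEFAULTS and the relations dict are hypothetical names, not the oscar.pyx API, and the header parsing follows the standard git commit object layout (parent and author lines, a blank line, then the message).

```python
# Hypothetical defaults used when nothing else is known about a commit.
DEFAULTS = {"author": b"", "parent_shas": (), "full_message": b""}


def commit_fields(data, relations=None):
    """Return commit fields, degrading gracefully when content is missing.

    data      -- raw commit bytes, possibly empty (content lost)
    relations -- hypothetical dict of values recovered from other mappings
    """
    fields = dict(DEFAULTS)
    fields.update(relations or {})
    header, sep, message = data.partition(b"\n\n")
    if not sep:
        # No header/message separator: content is missing, keep fallbacks.
        return fields
    parents = []
    for line in header.split(b"\n"):
        key, _, value = line.partition(b" ")
        if key == b"parent":
            parents.append(value.decode("ascii"))
        elif key == b"author":
            # Drop the trailing "<timestamp> <timezone>" pair.
            fields["author"] = value.rsplit(b" ", 2)[0]
    fields["parent_shas"] = tuple(parents)
    fields["full_message"] = message
    return fields
```

The order of precedence here is parsed content first, then values from other relations, then defaults. Whether a blank string or a distinct special value is the better default is one of the design decisions still open.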

user2589 self-assigned this Apr 29, 2021