Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent files from being partially read from the cache #3819

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 15 additions & 2 deletions SCons/CacheDir.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,20 @@ def CacheRetrieveFunc(target, source, env):
if fs.islink(cachefile):
fs.symlink(fs.readlink(cachefile), t.get_internal_path())
else:
env.copy_from_cache(cachefile, t.get_internal_path())
try:
env.copy_from_cache(cachefile, t.get_internal_path())
except:
try:
# In case file was partially retrieved (and now corrupt)
# delete it to avoid poisoning commands like 'ar' that
# read from the initial state of the file they are writing
# to.
t.fs.unlink(t.get_internal_path())
except:
pass

raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise what? The original exception from env.copy_from_cache or t.fs.unlink? This behaves differently in python2 and 3:

$ cat testexc.py
try:
    raise ValueError
except:
    try:
        raise AttributeError
    except:
        pass
    raise

$ python2.7 testexc.py
Traceback (most recent call last):
  File "testexc.py", line 6, in <module>
    raise AttributeError
AttributeError

$ python3.8 testexc.py
Traceback (most recent call last):
  File "testexc.py", line 3, in <module>
    raise ValueError
ValueError

I guess that this is what we're after:

$ cat testexc.py
try:
    raise ValueError
except Exception as e1:
    try:
        raise AttributeError
    except Exception as e2:
        pass
    finally:
        raise e1

$ python2.7 testexc.py
Traceback (most recent call last):
  File "testexc.py", line 10, in <module>
    raise e1
ValueError

$ python3.8 testexc.py
Traceback (most recent call last):
  File "testexc.py", line 10, in <module>
    raise e1
  File "testexc.py", line 3, in <module>
    raise ValueError
ValueError

We should probably catch only the expected exceptions, and not all of them.

Copy link
Collaborator

@mwichmann mwichmann Oct 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably catch only the expected exceptions, and not all of them.

indeed - the CI checks include linters on modified/added code, and the Sider checker is in fact complaining on these (E722 do not use bare 'except')

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, an irritation here is we don't know for sure what the underlying retrieval method will be, due to layers of abstraction. It defaults to shutil.copy2; shutil.copy if the decider is timestamp-newer, but could presumably be changed.


try:
os.utime(cachefile, None)
except OSError:
Expand All @@ -70,7 +83,7 @@ def CacheRetrieveString(target, source, env):
cd = env.get_CacheDir()
cachedir, cachefile = cd.cachepath(t)
if t.fs.exists(cachefile):
return "Retrieved `%s' from cache" % t.get_internal_path()
return "Retrieving `%s' from cache" % t.get_internal_path()
return None

CacheRetrieve = SCons.Action.Action(CacheRetrieveFunc, CacheRetrieveString)
Expand Down
7 changes: 5 additions & 2 deletions SCons/Taskmaster.py
Original file line number Diff line number Diff line change
Expand Up @@ -223,10 +223,13 @@ def execute(self):
cached_targets.append(t)
if len(cached_targets) < len(self.targets):
# Remove targets before building. It's possible that we
# partially retrieved targets from the cache, leaving
# them in read-only mode. That might cause the command
# retrieved a subset of targets from the cache, leaving
# them in an inconsistent state. That might cause the command
# to fail.
#
# Note that retrieve_from_cache() ensures no single target can
# be partially retrieved (file left in corrupt state).
#
for t in cached_targets:
try:
t.fs.unlink(t.get_internal_path())
Expand Down