Add tips on preserving file metadata #19

Open
mvdan opened this issue Feb 21, 2020 · 3 comments

Comments

@mvdan

mvdan commented Feb 21, 2020

I have a program that used to overwrite input files in-place, and that was lacking atomicity. When switching to renameio, the biggest question for me was - what perm bits should I use? The old os.OpenFile(path, os.O_WRONLY|os.O_TRUNC, 0) call didn't need to change any permissions, so it's not obvious what I should do here.

I don't think there's a way to replace a file and keep its permission bits in an atomic way. For now, I'm just using os.Lstat first to grab the bits, then using them for renameio.WriteFile. I realise that's racy, for example if the permissions change between the two operations, but it seems like an OK tradeoff while keeping it impossible to lose data.

Maybe I'm alone in seeing this as a gap in the README or docs. Do you reckon any tips would be a good addition?

@mvdan
Author

mvdan commented Feb 21, 2020

I also realise one may not be able to keep other metadata in place in all cases, such as the owner or group of the file. I think we can largely ignore that, as it's normal for software to write files that belong to the current user. Such information is also not stored in git, whereas the executable bit of the permissions is stored.

@stapelberg
Collaborator

> I have a program that used to overwrite input files in-place, and that was lacking atomicity. When switching to renameio, the biggest question for me was - what perm bits should I use? The old os.OpenFile(path, os.O_WRONLY|os.O_TRUNC, 0) call didn't need to change any permissions, so it's not obvious what I should do here.

My rule of thumb: specify permission bits that are minimal and standard. 0644 is a widely understood default for regular files, and 0755 for directories and executable files. Privacy-sensitive data should default to 0600 and 0700, respectively. If users want to further influence modes, they can set the umask accordingly.

> I don't think there's a way to replace a file and keep its permission bits in an atomic way. For now, I'm just using os.Lstat first to grab the bits, then using them for renameio.WriteFile. I realise that's racy, for example if the permissions change between the two operations, but it seems like an OK tradeoff while keeping it impossible to lose data.

I think you’re right in that there is no atomic way. At the same time, as you say, using os.Lstat is going to be fine in practice if you’re set on retaining modes. The most reliable way for users to perform a chmod operation is:

  1. chmod while your program is not also running
  2. if that’s impossible, repeat the chmod operation a few times and it will most likely succeed

Whether retaining modes makes sense is something that largely depends on the program you’re writing, I think. For example, in debiman, we create HTML files with 0644, and I think requiring users to add an extra, external step in their pipeline to adjust permissions is reasonable. This side-steps the problem of retaining modes and removes a failure mode entirely. I think that makes it the more elegant choice.

> Maybe I'm alone in seeing this as a gap in the README or docs. Do you reckon any tips would be a good addition?

Adding a few sentences to the README to cover this nuance seems fine to me. PRs welcome :)

> I also realise one may not be able to keep other metadata in place in all cases, such as the owner or group of the file. I think we can largely ignore that, as it's normal for software to write files that belong to the current user. Such information is also not stored in git, whereas the executable bit of the permissions is stored.

Yeah, IIUC git only distinguishes between regular and executable files, which seems reasonable for a version control system, and is aligned with my earlier suggestion of using 0644 and 0755 as permissions. I think Prometheus uses tar files to retain non-standard permissions in its procfs tests.
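To illustrate the regular-versus-executable distinction, a program that does not want to carry over arbitrary modes could collapse whatever it finds to one of the two standard values. The sketch below is an assumption-laden example: `gitStyleMode` is a hypothetical helper name, and keying the decision off the owner execute bit is a choice made here for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// gitStyleMode is a hypothetical helper. It collapses arbitrary permission
// bits to 0755 when the owner execute bit is set, and to 0644 otherwise.
func gitStyleMode(mode os.FileMode) os.FileMode {
	if mode.Perm()&0o100 != 0 {
		return 0o755
	}
	return 0o644
}

func main() {
	for _, m := range []os.FileMode{0o600, 0o700, 0o664, 0o775} {
		// Convert to uint32 so the values print as plain octal numbers.
		fmt.Printf("%04o -> %04o\n", uint32(m), uint32(gitStyleMode(m)))
	}
}
```

The result can then be passed as the permission argument when writing the replacement file, which avoids copying unusual modes while still keeping scripts executable.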

@mvdan
Author

mvdan commented Feb 22, 2020

Fair enough - thanks for the input.

You're right that this depends on the use case. My use case is shfmt, which can format scripts in-place. Some scripts are executable, and some others are not, and I don't want the tool to force either of them. The separate lstat works fine for this purpose.

I agree that a line or two in the README would be good, even if this use case isn't very common :)
