-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why not using os.rename to write file? #467
Comments
Hey! I've thought about this and I absolutely see the value of not damaging the database file when the process dies during a write operation. But I also see a cost in that creating a temporary file may not be something that application authors have considered when writing their applications. E.g. I can imagine this leading to some sort of weird permissions error where the running can't create new files in the directory of the database file. As a compromise my thought was that this could be a new storage backend that is included with TinyDB, named something like What do you think? |
Hi, Markus, thanks for your reply. Moreover, I think that applications authors who are really concerned about performance, wouldn't probably choose TinyDB: However, a second solution is possible, if you want to optimize this aspect, which would be more performant, that is to say the double write technique. It avoids to create the temporary file at each write. Let me know what you think about this :-) EDIT: This is not related to this issue. But, what do you think of storing each collection in a separate json file? It could improve performance a lot, without affecting too much the simplicity of the library. |
Thanks for your input, @ostafen! I haven't forgotten about this and will reply in the next few days! 🙂 |
Thanks again for your input, @ostafen! I’ll admit that I’m really torn about this. Let me explain: On the one hand, I absolutely agree that software should strive towards reliability even in challenging circumstances like application crashes or power outages. If I was to write TinyDB from scratch a second time, I'd seriously consider shipping the default But on the other hand, what I’m most worried about with a change like this is introducing subtle and hard-to-catch bugs. My main concern is that making TinyDB write a temporary file in the directory of the database file might break user’s assumption that TinyDB only tries to access the file it is told to access. Maybe I’m a little paranoid here, but I’ve had to debug issues that arose from libraries changing stuff like this quietly in a minor or patch release. These experiences made me cautious about changes that do not directly lead to errors but rather invalidate assumptions about how the library works under the hood and thus leads to subtle breakage where other programs rely on these assumptions. All that being said, I’m open to have my mind changed. I’d really appreciate input from other users and people from the community if I’m too cautious here and we should just implement these changes, or if we should go for a compromise like the
I think this would break user assumptions even more than writing a temporary file altough I’d love to see a community-based extension that adds a storage backend like this! |
@msiemens, I don't really see why an user should make assumptions about the database layout on disk. However, I understand your point of view, especially if you are not familiar with this solution, and I don't want to influence your opinion about this. |
Actually I remember people asking questions about the database layout becase folks wanted to process TinyDB database files in external programs 😉 Altough unfortunately I can't find these discussions right now. But to the original question: I think I suffered some sort of analysis paralysis with this decision. Looking at this again I think we should go ahead with merging your PR and then document this feature properly (in the release notes and in the documentation). I got stuck with thinking about this because TinyDB as a project has way (way!) more users than I ever expected or dreamed and sometimes I'm a little scared about inadvertantly breaking stuff for some people. If the PyPI stats are to be believed, TinyDB has had more than 8 million downloads in total (although a lot will come from PyPI mirrors and CI pipelines). With a scale like this I'm sure that a lot of really rare edge cases will trigger just through sheer scale 🙈 But I think if we inform users in the relase notes they'll be able to figure out what's happening if they run into issues 🙂 |
You are right to be careful with this kind of things. Anyway, as far as I'm concerned, this is a standard technique. I've seen it being used in a lot of projects (for example, several databases use this to safely save metadata files). |
If you want to avoid disturbing users, you could create a new major version or make this optional in a minor version with a warning and make it the default in next major version. |
Hi, I noticed that the JsonStorage uses the write method to rewrite the entire file.
However if a crash occurs while the file is being written, the db could be damaged.
Why don't write to a temporary file and rename?
You can check what I mean in PR #468 :)
The text was updated successfully, but these errors were encountered: