New approach to squashfs images #798
Because my /opt/compiler-explorer is filled with other stuff like CE git repos, I did Question;
Cool! In this case you could have "just" used
Each installation will be at least 2 sqfs files:
To make it concrete, something like:
The old root The plan is to never have more than a few layers around, with a weekly consolidation process that "flattens" down images into a single base layer again. Or something like that.
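To make the "few layers, flattened weekly" idea concrete, here is a minimal sketch that builds the option string for a read-only overlay over already-mounted layers. It assumes the layers are combined with Linux overlayfs, where the leftmost `lowerdir` entry is the topmost layer; the function name and the example paths are hypothetical and not part of this PR.

```python
from typing import Sequence


def overlay_mount_options(layers_oldest_first: Sequence[str]) -> str:
    """Build an overlayfs option string for a read-only stack of layers.

    Assumption: each entry is the mount point of one squashfs image,
    listed from the base layer up to the newest layer.
    """
    if len(layers_oldest_first) < 2:
        # With a single layer there is nothing to overlay; mount it directly.
        raise ValueError("need at least two layers to build an overlay")
    for path in layers_oldest_first:
        # ':' separates layers and ',' separates options, so keep paths simple.
        if ":" in path or "," in path:
            raise ValueError(f"unsupported character in layer path: {path!r}")
    # overlayfs stacks lowerdir entries right to left: leftmost is on top.
    return "lowerdir=" + ":".join(reversed(list(layers_oldest_first)))


if __name__ == "__main__":
    # Hypothetical layer mount points: one base plus two daily layers.
    print(overlay_mount_options(["/layers/base", "/layers/day1", "/layers/day2"]))
```

The resulting string would be passed as the `-o` argument of `mount -t overlay`. A weekly consolidation would then replace the whole list with a single new base image.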
Correct: the
Right! The hope is to literally put everything in one sqfs. And then actually squash the daily builds in an overlay so there's one unified way of seeing all the compilers etc. No more "hack the squashfs" stuff. And if we rebuild something we can
We have some choices:
Yeah this is a great point. I was going to make a
Yes! autofs is (deliberately) configured with a timeout here so it unmounts stuff pretty aggressively. We can of course change that, but the value here is one that's battle-tested in $dayjob so it should be ok for most practical purposes (if what you're seeing is indeed timeouts).
Re: baking it into an AMI. Daily builds would be fine, but I would not be looking forward to using terraforming as a deploy mechanism. So maybe just fetching the hash from S3 is the best option here.
I lied about tuning:
I see. Can we maybe turn that off for production if we find out that causes issues?
It shouldn't cause issues - any open file handle inside the mount is enough to keep it mounted (and also bump the timeout forward). We use this exact autofs setup at $dayjob. The auto unmount is what allows "magic" atomic swaps with no cleanup. "Just" swap out the symlink that points to the new autofs mount point and everybody picks up the new stuff during their next compile, and then the old mounts die after a timeout.
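A minimal sketch of the symlink swap described above, assuming the usual "create a temporary symlink, then rename it over the old one" approach. The function name and paths are hypothetical; this is not the code in this PR.

```python
import os


def swap_symlink(link_path: str, new_target: str) -> None:
    """Repoint link_path at new_target in a single rename step."""
    # Create the new symlink under a temporary name next to the real one,
    # so the final rename stays within one directory and one filesystem.
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(new_target, tmp_path)
    try:
        # rename() replaces the old symlink itself (it does not follow it),
        # so readers see either the old target or the new one.
        os.replace(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        old_root = os.path.join(workdir, "root-old")
        new_root = os.path.join(workdir, "root-new")
        os.mkdir(old_root)
        os.mkdir(new_root)
        current = os.path.join(workdir, "current")
        os.symlink(old_root, current)
        swap_symlink(current, new_root)
        print(os.readlink(current))
```

Processes that already hold files open under the old target keep using it, which matches the "old mounts die after a timeout" behaviour described above.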
Ok I see. But if you're super unlucky and it's a quiet day and you hit an instance that hasn't had activity for 10 minutes, you have a nanominuscule chance that you connect during an unmount/mount cycle. Probably fine.
We can avoid this by moving the health check file inside the sqfs root - similar to how we currently have the health check on nfs. Every ELB health check would resolve the symlink and ensure the most recent root is mounted.
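As a rough illustration of that health check idea: the sketch below resolves the root symlink and looks for a marker file inside it. Accessing a path under an autofs mount point is what triggers the mount, so the check itself would keep the current root mounted. The function name and the `healthcheck` file name are assumptions made for the example.

```python
import os


def root_is_healthy(root_link: str, health_file: str = "healthcheck") -> bool:
    """Return True if the marker file exists inside the resolved root."""
    # Follow the symlink to whichever root it currently points at.
    real_root = os.path.realpath(root_link)
    # Stat-ing a file inside the root touches the mount point itself.
    return os.path.isfile(os.path.join(real_root, health_file))


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as workdir:
        root = os.path.join(workdir, "root")
        os.mkdir(root)
        link = os.path.join(workdir, "current")
        os.symlink(root, link)
        print(root_is_healthy(link))  # marker file not created yet
        open(os.path.join(root, "healthcheck"), "w").close()
        print(root_is_healthy(link))
```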
but what's true for the root is also true for all the data images too? And we've never seen this at dayjob: I wonder if there's something specifically odd going on with shells left |
5-6ish minutes in this case
Thanks for the info partouf. I am 99.5% sure that's an artifact of Separately to this: I've been trying to tidy up the nomenclature. For more PoC stuff I'll need to do:
Once I've tried that out I'll write it up more and we can discuss. Maybe on Friday?!
…python3.8 and below'" This reverts commit 396f775.
This reverts commit 8b74dfc.
Will be resurrecting this, or something like this. I'm strongly considering a separate project to handle "squashfs utility" type things as I think this is more generally applicable, though whatever it is will need to work for CE's use case and would be part of the CE project more widely.
Intention is to replace /opt/compiler-explorer entirely with something like this! At least configurably; should allow for both regular ce_install install .. backed by real files and supermagic squashfs things, pretty much transparently.
Very much WIP! But I got this PoC working well enough to see promise. Ideally I'll run with a whole base image and see what we get.
This does mean we rely entirely on the yaml files to install everything "from scratch" when we rebuild. We may not want this but for discussion!