Replies: 3 comments 1 reply
-
Still looking for more feedback!
-
I suggest adding an option to schedule reboots via the GUI: list the pending changes and present a window to schedule them. Also, is there a preferred order for reboots? Manager first, or last? Should we wait for the search/manager to be back up before updating the XXX node?
-
I rebooted all my nodes 5 days ago. Yesterday afternoon I rebooted them all again because the GUI showed that they needed it, and all showed OK. This morning, only 16 hours later, every node on the grid shows that it needs to be rebooted again. How often is the system updating the kernel? This seems a bit frantic. At this rate, should I expect the grid's 'normal' state to always show that everything needs to be rebooted? The Grid section already gathers a lot of real-time info about each node; can't we get a bit more detail about these reboot requests?
-
We have had lots of feedback from our customers and our community concerning reboots, so let's talk about the history of rebooting in past versions. This only applies to non-airgap deployments:
16.04 - All kernel, OS, and Security Onion updates were handled through soup. This meant your packages did not auto-update.
2.3 - We decoupled the OS and kernel updates from the Security Onion updates, which allowed the kernel and OS to auto-update. We also added a message to the MOTD on every node in the grid, so that if you connected via SSH you would see a list of machines that needed a reboot due to kernel updates.
2.4 - With the introduction of the web interface, the need to log in via SSH is much less frequent. We needed a way to communicate that certain systems in your grid needed to be rebooted, so we added it to the grid screen. At first we used the blue warning on the menu side, but we removed that because it made it look like there was a problem with grid health, which is not the case. If you SSH in, you will still see the MOTD listing the hosts that need to be rebooted.
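As a side note, if you ever want to verify on a single node why a reboot is being requested, the usual signal is that the running kernel is older than the newest installed kernel. Here is a rough sketch of that check, not an official SO tool, assuming a RHEL-family base where the kernel package is named kernel (UEK-based installs may use kernel-uek instead):

```bash
# Rough check: compare the running kernel with the newest installed kernel.
# If they differ, the node still needs a reboot to pick up the update.
# Adjust "kernel" to "kernel-uek" on UEK-based installs.
running=$(uname -r)
latest=$(rpm -q kernel --queryformat '%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort -V | tail -n 1)

if [ "$running" = "$latest" ]; then
    echo "Running kernel ($running) is the newest installed; no kernel reboot pending."
else
    echo "Running $running but $latest is installed; a reboot is pending."
fi
```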
We have had several users from our customer base and community ask for the ability to do auto reboots, so I think it is important to discuss the implications. If you have a large amount of Elastic data, auto reboots are a really bad idea. For a home user this might not be a big deal, as it doesn't take long to load the indices back after the reboot, but if you have 50TB of Elastic data you could be down for almost an hour while the shards initialize. In that case you would want to stagger your reboots to keep your cluster healthy and avoid downtime. Another potential scenario is that you are in the middle of a live incident and your grid starts rebooting underneath you.
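If you do decide to stagger reboots yourself, a rough sketch of what that could look like is below. This is not an official procedure; the "*_searchnode" target, the Elasticsearch URL, and the credentials file are assumptions you would adjust for your own grid:

```bash
#!/bin/bash
# Sketch: reboot matching minions one at a time, waiting for Elasticsearch
# cluster health to return to green between reboots. Run from the manager.
# Assumptions: "*_searchnode" matches your search nodes, Elasticsearch is
# reachable at https://localhost:9200, and ~/escreds holds user:pass.

for minion in $(salt --out=txt "*_searchnode" test.ping | cut -d: -f1); do
    echo "Rebooting $minion ..."
    salt "$minion" system.reboot

    # Give the node time to go down, then poll until the cluster is green again.
    sleep 120
    until curl -s -k -u "$(cat ~/escreds)" \
        "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=60s" \
        | grep -q '"status":"green"'; do
        echo "Waiting for cluster to return to green before the next reboot ..."
        sleep 30
    done
done
```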
The reason we did not lock the kernel is that we wanted to give users the freedom to keep their systems up to date without requiring us to release something first. We do hold the Salt and Docker versions to ensure stability in the product. An example of this freedom was the recent SSH vulnerability, which was automatically patched if you had auto updates turned on; users did not have to wait for SO version 2.4.X for that fix. Keep in mind that not all kernel updates are critical, and everyone has different change windows in which to do maintenance.
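If you would rather hold the kernel yourself and only take kernel updates during your own change window, something along these lines may work on a RHEL-family base. Security Onion does not manage this for you, so treat it purely as a sketch to validate in your environment:

```bash
# Sketch: hold the currently installed kernel packages so automatic updates
# skip them, assuming the dnf versionlock plugin is available.
sudo dnf install -y python3-dnf-plugin-versionlock
sudo dnf versionlock add 'kernel*'

# Later, during your maintenance window, release the lock and update:
sudo dnf versionlock delete 'kernel*'
sudo dnf update 'kernel*'
```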
I created the poll below to get your input and see if there is something we can do that makes sense.
The only nodes I would feel comfortable rebooting en masse are sensors. You can do that with the following command:
salt "*_sensor" -b 5 system.reboot
This will reboot all of your sensors in batches of 5.
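If you want to confirm which minions that target matches before actually rebooting anything, you can do a quick dry run first. The --batch-wait option shown below adds a pause between batches; check that your Salt version supports it before relying on it:

```bash
# Preview which minions match the "*_sensor" target without rebooting anything.
salt "*_sensor" test.ping

# Reboot in batches of 5, pausing 60 seconds between batches (if supported).
salt "*_sensor" -b 5 --batch-wait 60 system.reboot
```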
Thanks for your input!
11 votes