Commit
Automatically send status notifications to systemd (#1029)
* First pass on systemd integration

This adds a somewhat naive `SystemdService` class that will send appropriate notifications to systemd or do nothing if it doesn't detect that the process is being run by systemd. It pretty much just notifies that the process is ready, is stopping, and actively notifies the watchdog while it's running. It doesn't reach into the actual job system to check that things are OK, or hook into ActiveSupport notifications to tell systemd about other events (like restarting/reloading).

Fixes #1027.

* Log using notifications

* Add some docs

* Handle graceful shutdown notifications better

* Update Sorbet typing information

* Systemd errors should probably always log

* Add tests

* Add example systemd configuration file

* Make Sorbet happy

Stubbed constants are very not cool as far as Sorbet is concerned, so the way I wrote these tests before broke the linter. OTOH, trying to figure out how to make this acceptable did get some slightly nicer test setup.

* Skip socket tests on JRuby

* Move vendored sd_notify into lib/

* Add note about test skipping

It's good to know why these are skipped so we can stop skipping them in the future if the JRuby issue is fixed (or someone figures out a workaround).
Showing 10 changed files with 449 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
# This is an example systemd service configuration that keeps GoodJob running. | ||
# | ||
# Customize this file based on your bundler location, app directory, etc. | ||
# | ||
# TO RUN AS A USER SERVICE... | ||
# Customize and copy this file to ~/.config/systemd/user/goodjob.service | ||
# Then run: | ||
# - systemctl --user enable goodjob | ||
# - systemctl --user {start,stop,restart,status} goodjob | ||
# Also you might want to run: | ||
# - loginctl enable-linger username | ||
# So that the service is not killed when the user logs out. | ||
# | ||
# TO RUN AS A SYSTEM SERVICE... | ||
# Customize and copy this to: | ||
# (on CentOS) /usr/lib/systemd/system/goodjob.service | ||
# (on Ubuntu) /lib/systemd/system/goodjob.service | ||
# Then run (you may need to use `sudo`): | ||
# - systemctl enable goodjob | ||
# - systemctl {start,stop,restart,status} goodjob | ||
# | ||
# This file corresponds to a single GoodJob process. Add multiple copies | ||
# to run multiple processes (goodjob-1, goodjob-2, etc). | ||
# | ||
# Use `journalctl --unit goodjob -rn 100` to view the last 100 log lines. | ||
# Or `journalctl --unit goodjob --follow` to view live log output. | ||
# | ||
[Unit] | ||
Description=GoodJob Background Job Processor | ||
# Start only once the network is available. | ||
# If running Postgres locally and it's also managed by systemd, consider adding | ||
# `postgresql.service` (this list is space-separated). | ||
After=network.target | ||
|
||
# See these pages for lots of options: | ||
# | ||
# https://www.freedesktop.org/software/systemd/man/systemd.service.html | ||
# https://www.freedesktop.org/software/systemd/man/systemd.exec.html | ||
# | ||
# THOSE PAGES ARE CRITICAL FOR ANY LINUX DEVOPS WORK; read them multiple | ||
# times! systemd is a critical tool for all developers to know and understand. | ||
# | ||
[Service] | ||
# Type=notify is supported as of GoodJob v3.17.0. In earlier versions, use | ||
# Type=simple and remove the WatchdogSec line. | ||
Type=notify | ||
# If systemd doesn't get pinged by GoodJob at least this often, restart GoodJob. | ||
WatchdogSec=5s | ||
|
||
WorkingDirectory=<PATH_TO_YOUR_RAILS_APP> | ||
# The actual command to run. | ||
# If you use the system's ruby: | ||
ExecStart=/usr/local/bin/bundle exec good_job start | ||
# If you use rbenv: | ||
# ExecStart=/bin/bash -lc 'exec /home/<USERNAME>/.rbenv/shims/bundle exec good_job start' | ||
# If you use rvm in production without a gemset and your ruby version is 2.6.5: | ||
# ExecStart=/home/<USERNAME>/.rvm/gems/ruby-2.6.5/wrappers/bundle exec good_job start | ||
# If you use rvm in production with a gemset and your ruby version is 2.6.5: | ||
# ExecStart=/home/<USERNAME>/.rvm/gems/ruby-2.6.5@gemset-name/wrappers/bundle exec good_job start | ||
# If you use rvm in production with a gemset and the ruby version/gemset is specified in a .ruby-version, | ||
# .ruby-gemset, or .rvmrc file in the working directory: | ||
# ExecStart=/home/<USERNAME>/.rvm/bin/rvm in <PATH_TO_YOUR_RAILS_APP> do bundle exec good_job start | ||
|
||
# Uncomment these lines if you are going to use this as a system service. | ||
# If using this as a user service, leave them commented out, or you will get an error trying to start the service. | ||
# !!! Change this to your deploy user account if you are using this as a system service !!! | ||
# User=<USERNAME> | ||
# Group=<USERGROUP> | ||
# UMask=0002 | ||
|
||
# Set any environment variables your application needs, one `Environment=X` line | ||
# per environment variable. | ||
Environment=RAILS_ENV=production | ||
# Greatly reduce Ruby memory fragmentation and heap usage: | ||
# https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/ | ||
Environment=MALLOC_ARENA_MAX=2 | ||
|
||
# If GoodJob crashes, restart after a short delay: | ||
RestartSec=1s | ||
Restart=always | ||
|
||
# Send output to the systemd journal. You can view it with: | ||
# journalctl --unit goodjob | ||
# To send output to a file, set the path here instead. | ||
StandardOutput=journal | ||
StandardError=journal | ||
|
||
# This will default to "bundler" if we don't specify it. | ||
SyslogIdentifier=goodjob | ||
|
||
[Install] | ||
WantedBy=multi-user.target |
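
As a rough illustration of what `WatchdogSec=5s` means for the process: systemd exports the watchdog timeout to the service in microseconds through the `WATCHDOG_USEC` environment variable, and its documentation recommends pinging at half that interval. The sketch below is not GoodJob code; `watchdog_ping_interval` is a hypothetical helper name, and the environment is passed in as a plain hash so the arithmetic can be checked without systemd.

```ruby
# Hypothetical helper, not part of GoodJob: converts the WATCHDOG_USEC value
# that systemd exports for a unit with WatchdogSec= into a ping interval
# in seconds. Returns nil when the watchdog is not configured.
def watchdog_ping_interval(env = ENV)
  usec = Integer(env.fetch("WATCHDOG_USEC", "0"), exception: false)
  return nil if usec.nil? || usec <= 0

  seconds = usec / 1_000_000.0
  seconds / 2 # ping twice per watchdog window
end

# WatchdogSec=5s reaches the service as WATCHDOG_USEC=5000000.
puts watchdog_ping_interval({ "WATCHDOG_USEC" => "5000000" })
```

With the example unit above, this works out to one ping every 2.5 seconds, which leaves room for one delayed ping before systemd considers the service hung.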
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,157 @@ | ||
# frozen_string_literal: true | ||
|
||
# This is a copy of https://github.com/agis/ruby-sdnotify as of v0.1.1 | ||
# (commit 21240f1) | ||
# Any changes have been marked with "FORK:" comments. | ||
# | ||
# It is included here because it is a very small gem, and doing so reduces | ||
# the number of dependencies and the supply chain risks they pose. | ||
|
||
# FORK: nest SdNotify inside the GoodJob module to prevent name collisions in | ||
# case a GoodJob user also uses the actual sd_notify gem. | ||
module GoodJob | ||
|
||
# The MIT License | ||
# | ||
# Copyright (c) 2017, 2018, 2019, 2020 Agis Anastasopoulos | ||
# | ||
# Permission is hereby granted, free of charge, to any person obtaining a copy of | ||
# this software and associated documentation files (the "Software"), to deal in | ||
# the Software without restriction, including without limitation the rights to | ||
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of | ||
# the Software, and to permit persons to whom the Software is furnished to do so, | ||
# subject to the following conditions: | ||
# | ||
# The above copyright notice and this permission notice shall be included in all | ||
# copies or substantial portions of the Software. | ||
# | ||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS | ||
# FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR | ||
# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER | ||
# IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN | ||
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
|
||
require "socket" | ||
|
||
# SdNotify is a pure-Ruby implementation of sd_notify(3). It can be used to | ||
# notify systemd about state changes. Methods of this package are no-ops on | ||
# non-systemd systems (e.g. Darwin). | ||
# | ||
# The API maps closely to the original implementation of sd_notify(3), | ||
# therefore be sure to check the official man pages prior to using SdNotify. | ||
# | ||
# @see https://www.freedesktop.org/software/systemd/man/sd_notify.html | ||
module SdNotify | ||
# Exception raised when there's an error writing to the notification socket | ||
class NotifyError < RuntimeError; end | ||
|
||
READY = "READY=1" | ||
RELOADING = "RELOADING=1" | ||
STOPPING = "STOPPING=1" | ||
STATUS = "STATUS=" | ||
ERRNO = "ERRNO=" | ||
MAINPID = "MAINPID=" | ||
WATCHDOG = "WATCHDOG=1" | ||
FDSTORE = "FDSTORE=1" | ||
|
||
def self.ready(unset_env=false) | ||
notify(READY, unset_env) | ||
end | ||
|
||
def self.reloading(unset_env=false) | ||
notify(RELOADING, unset_env) | ||
end | ||
|
||
def self.stopping(unset_env=false) | ||
notify(STOPPING, unset_env) | ||
end | ||
|
||
# @param status [String] a custom status string that describes the current | ||
# state of the service | ||
def self.status(status, unset_env=false) | ||
notify("#{STATUS}#{status}", unset_env) | ||
end | ||
|
||
# @param errno [Integer] | ||
def self.errno(errno, unset_env=false) | ||
notify("#{ERRNO}#{errno}", unset_env) | ||
end | ||
|
||
# @param pid [Integer] | ||
def self.mainpid(pid, unset_env=false) | ||
notify("#{MAINPID}#{pid}", unset_env) | ||
end | ||
|
||
def self.watchdog(unset_env=false) | ||
notify(WATCHDOG, unset_env) | ||
end | ||
|
||
def self.fdstore(unset_env=false) | ||
notify(FDSTORE, unset_env) | ||
end | ||
|
||
# @return [Boolean] true if the service manager expects watchdog keep-alive | ||
# notification messages to be sent from this process. | ||
# | ||
# This is the case if the $WATCHDOG_USEC environment variable is set to a | ||
# positive integer, and the $WATCHDOG_PID variable is unset or set to the | ||
# PID of the current process. | ||
# | ||
# @note Unlike sd_watchdog_enabled(3), this method does not mutate the | ||
# environment. | ||
def self.watchdog? | ||
wd_usec = ENV["WATCHDOG_USEC"] | ||
wd_pid = ENV["WATCHDOG_PID"] | ||
|
||
return false if !wd_usec | ||
|
||
begin | ||
wd_usec = Integer(wd_usec) | ||
rescue | ||
return false | ||
end | ||
|
||
return false if wd_usec <= 0 | ||
return true if !wd_pid || wd_pid == $$.to_s | ||
|
||
false | ||
end | ||
|
||
# Notify systemd with the provided state, via the notification socket, if | ||
# any. | ||
# | ||
# Generally this method will be used indirectly through the other methods | ||
# of the library. | ||
# | ||
# @param state [String] | ||
# @param unset_env [Boolean] | ||
# | ||
# @return [Integer, nil] the number of bytes written to the notification | ||
# socket or nil if there was no socket to report to (e.g. the program wasn't | ||
# started by systemd) | ||
# | ||
# @raise [NotifyError] if there was an error communicating with the systemd | ||
# socket | ||
# | ||
# @see https://www.freedesktop.org/software/systemd/man/sd_notify.html | ||
def self.notify(state, unset_env=false) | ||
sock = ENV["NOTIFY_SOCKET"] | ||
|
||
return nil if !sock | ||
|
||
ENV.delete("NOTIFY_SOCKET") if unset_env | ||
|
||
begin | ||
Addrinfo.unix(sock, :DGRAM).connect do |s| | ||
s.close_on_exec = true | ||
s.write(state) | ||
end | ||
rescue StandardError => e | ||
raise NotifyError, "#{e.class}: #{e.message}", e.backtrace | ||
end | ||
end | ||
end | ||
|
||
# FORK: Finish nesting inside GoodJob. | ||
end |
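
To make the protocol concrete, here is a minimal sketch of the same datagram exchange that can be run without systemd. It is not the vendored module: `send_state` is a hypothetical stand-in for `SdNotify.notify` that takes the environment as an argument, and the script itself plays the role of systemd by binding a Unix datagram socket in a temporary directory.

```ruby
require "socket"
require "tmpdir"

# Hypothetical stand-in for SdNotify.notify. The environment is a parameter
# so the function can be exercised outside of a systemd-managed process.
def send_state(state, env = ENV)
  path = env["NOTIFY_SOCKET"]
  return nil unless path # not started by systemd: do nothing

  Addrinfo.unix(path, :DGRAM).connect do |sock|
    sock.write(state) # returns the number of bytes written
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "notify.sock")

  # Play the role of systemd: bind a datagram socket and wait for messages.
  listener = Socket.new(:UNIX, :DGRAM)
  listener.bind(Addrinfo.unix(path, :DGRAM))

  bytes = send_state("READY=1", { "NOTIFY_SOCKET" => path })
  message = listener.recv(64)
  puts "#{bytes} bytes received: #{message}"
  listener.close
end
```

Each notification is a single short datagram, so there is no connection state to manage between pings, and a missing `NOTIFY_SOCKET` simply turns every call into a no-op.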
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# frozen_string_literal: true | ||
|
||
require 'concurrent/timer_task' | ||
require 'good_job/sd_notify' | ||
|
||
module GoodJob # :nodoc: | ||
# | ||
# Manages communication with systemd to notify it about the status of the | ||
# GoodJob CLI. If it doesn't look like systemd is controlling the process, | ||
# SystemdService doesn't do anything. | ||
# | ||
class SystemdService | ||
def self.task_observer(_time, _output, thread_error) # :nodoc: | ||
return if thread_error.nil? || thread_error.is_a?(Concurrent::CancelledOperationError) | ||
|
||
ActiveSupport::Notifications.instrument("systemd_watchdog_error.good_job", { error: thread_error }) | ||
GoodJob._on_thread_error(thread_error) if thread_error | ||
end | ||
|
||
# Indicates whether the service is actively notifying systemd's watchdog. | ||
def notifying? | ||
@watchdog&.running? || false | ||
end | ||
|
||
# Notify systemd that the process is ready. If the service is configured in | ||
# systemd to use the watchdog, this will also start pinging the watchdog. | ||
def start | ||
GoodJob::SdNotify.ready | ||
run_watchdog | ||
end | ||
|
||
# Notify systemd that the process is stopping and stop pinging the watchdog | ||
# if currently doing so. If given a block, it will wait for the block to | ||
# complete before stopping watchdog notifications, so systemd has a clear | ||
# indication when graceful shutdown started and finished. | ||
def stop | ||
GoodJob::SdNotify.stopping | ||
|
||
yield if block_given? | ||
|
||
@watchdog&.kill | ||
@watchdog&.wait_for_termination | ||
end | ||
|
||
private | ||
|
||
def run_watchdog | ||
return false unless GoodJob::SdNotify.watchdog? | ||
|
||
# Systemd recommends pinging the watchdog at half the configured interval: | ||
# https://www.freedesktop.org/software/systemd/man/sd_watchdog_enabled.html | ||
interval = watchdog_interval / 2 | ||
|
||
ActiveSupport::Notifications.instrument("systemd_watchdog_start.good_job", { interval: interval }) | ||
@watchdog = Concurrent::TimerTask.execute(execution_interval: interval) do | ||
GoodJob::SdNotify.watchdog | ||
end | ||
@watchdog.add_observer(self.class, :task_observer) | ||
|
||
true | ||
end | ||
|
||
def watchdog_interval | ||
return 0.0 unless GoodJob::SdNotify.watchdog? | ||
|
||
Integer(ENV.fetch('WATCHDOG_USEC')) / 1_000_000.0 | ||
end | ||
end | ||
end |
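
The ordering of notifications during startup and graceful shutdown can be sketched as follows. This is an illustration under stated assumptions, not the class above: `LifecycleSketch` is a hypothetical name, it records messages in an array instead of sending them to systemd, and it uses a plain `Thread` where `SystemdService` uses `Concurrent::TimerTask`.

```ruby
# Hypothetical stand-in for SystemdService that records messages instead of
# sending them, to show the order in which notifications are emitted.
class LifecycleSketch
  attr_reader :messages

  def initialize(interval)
    @interval = interval
    @messages = []
    @lock = Mutex.new
  end

  # Announce readiness, then start pinging the watchdog in the background.
  def start
    record("READY=1")
    @watchdog = Thread.new do
      loop do
        sleep @interval
        record("WATCHDOG=1")
      end
    end
  end

  # Announce shutdown first, keep pinging while the block runs, and only
  # then stop the watchdog thread.
  def stop
    record("STOPPING=1")
    yield if block_given?
    @watchdog&.kill
    @watchdog&.join
  end

  private

  def record(message)
    @lock.synchronize { @messages << message }
  end
end

service = LifecycleSketch.new(0.01)
service.start
service.stop { sleep 0.05 } # simulate in-flight jobs finishing
puts service.messages.first
puts service.messages.include?("STOPPING=1")
```

Keeping the watchdog alive until the block returns matters because a long graceful shutdown would otherwise look like a hang to systemd, which would kill the process before its jobs finished.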