Options to monitor good_job in production? #750
There are job monitoring tools such as Cronitor one can use. They have an integration for Sidekiq, but currently no ready-made plugin for GoodJob. They do have a Ruby SDK, though, so integrating GoodJob with them shouldn't be too difficult. I think this is an area where GoodJob could provide an interface to access metrics, but it shouldn't be too concerned with exactly how it's being monitored. The actual monitoring should be delegated to other tools that do it well, instead of building too much functionality into GoodJob.
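For instance, wiring a job up to Cronitor's Ruby SDK could be as simple as an ActiveJob callback. A rough sketch, nothing official: the job class and monitor key are made up, and the ping interface is the cronitor gem's as I understand it, so double-check it against their docs:

```ruby
require 'cronitor'

Cronitor.api_key = ENV['CRONITOR_API_KEY']

class NightlyImportJob < ApplicationJob
  queue_as :default

  # Report run/complete/fail telemetry around each execution.
  around_perform do |_job, block|
    monitor = Cronitor::Monitor.new('nightly-import') # hypothetical monitor key
    monitor.ping(state: 'run')
    block.call
    monitor.ping(state: 'complete')
  rescue StandardError => e
    monitor.ping(state: 'fail')
    raise e
  end

  def perform
    # ... the actual work ...
  end
end
```

Nothing here is GoodJob-specific, which is rather the point: the monitoring concern stays in the job layer and the external tool's SDK.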
Seems like a duplicate of #532.
* Make health probe server more general purpose

This removes the health check logic from the ProbeServer and renames the ProbeServer to UtilityServer, which accepts any Rack-based app. The health check and catchall logic are moved into simple Rack middleware that users can compose however they like, preserving the existing health check behavior while transitioning to a more general-purpose utility server. All in all, this pattern will allow users to add whatever functionality they like to GoodJob's web server by composing Rack apps and passing them in through GoodJob's configuration, e.g.:

```
config.good_job.middleware = Rack::Builder.app do
  use GoodJob::Middleware::MyCustomMiddleware
  use GoodJob::Middleware::PrometheusExporter
  use GoodJob::Middleware::Healthcheck
  run GoodJob::Middleware::CatchAll
end
config.good_job.middleware_port = 7001
```

This could help resolve:

* #750
* #532

* Use new API
* Revert server name change

  We decided that leaving the original ProbeServer name better sets expectations. See: #1079 (review)

  This also splits out middleware testing into separate specs.
* Restore original naming

  This also helps ensure that the existing behavior and API remain intact.
* Appease linters
* Add required message for mock
* Make test description relevant
* Allow for handler to be injected into ProbeServer
* Add WEBrick server handler
* Add WEBrick as a development dependency
* Add WEBrick tests and configuration
* Add idle_timeout method to mock
* Namespace server handlers
* Warn and fall back when WEBrick isn't loadable

  Since the probe server has the option to use WEBrick as a server handler, but this library doesn't have WEBrick as a dependency, we want to emit a warning when WEBrick is configured but not in the load path, and gracefully fall back to the built-in HTTP server.
* Inspect load path
* Account for multiple webrick entries in $LOAD_PATH
* Try removing load path test
* Force an error on require to initiate the test, as opposed to manipulating the load path
* Handle explicit nils in initialization
* Allow probe_handler to be set in configuration
* Add documentation for probe server customization
* Appease linter
* Retrigger CI
* Rename `probe_server_app` to `probe_app`; make handler name a symbol; rename Rack middleware/app for clarity
* Update readme to have relevant app example
* Fix readme grammar

Co-authored-by: Ben Sheldon [he/him] <[email protected]>
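To make the composition above concrete, each link in that chain is just a plain Rack class. A sketch, not GoodJob's shipped code: `MyApp::Healthcheck` and `MyApp::CatchAll` are illustrative stand-ins, and the `GoodJob::Scheduler.instances` check is an assumption about how the built-in probe determines liveness:

```ruby
module MyApp
  # Answers /status/started itself; passes everything else down the chain.
  class Healthcheck
    def initialize(app)
      @app = app
    end

    def call(env)
      if Rack::Request.new(env).path == '/status/started'
        if GoodJob::Scheduler.instances.any? # assumed to reflect running schedulers
          [200, { 'Content-Type' => 'text/plain' }, ['Started']]
        else
          [503, { 'Content-Type' => 'text/plain' }, ['Not started']]
        end
      else
        @app.call(env)
      end
    end
  end

  # Terminal app: anything unhandled gets a 404.
  class CatchAll
    def call(env)
      [404, { 'Content-Type' => 'text/plain' }, ['Not found']]
    end
  end
end

# Composed via the configuration name the PR settled on (`probe_app`):
Rails.application.configure do
  config.good_job.probe_app = Rack::Builder.app do
    use MyApp::Healthcheck
    run MyApp::CatchAll.new
  end
end
```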
Hello,
First, thanks for sharing this nice project. It looks awesome!
We are currently considering switching from Sidekiq to GoodJob (mainly for the durability guarantees it offers compared to Sidekiq).
However, I couldn't find anything related to observability/monitoring for GoodJob. Today, Prometheus/OpenTelemetry is becoming the de-facto standard for monitoring. With Sidekiq, we currently use the Sidekiq Prometheus Exporter, which lets us monitor queue depth, queue latency, and so forth, and get paged/alerted when a value goes above a certain threshold.
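To illustrate the kind of thing I mean, here is a rough sketch using the prometheus-client gem. I'm assuming GoodJob exposes its jobs table through an ActiveRecord model such as `GoodJob::Job` with `queue_name`/`scheduled_at`/`finished_at` columns; the model, columns, and metric names are guesses, not an existing integration:

```ruby
require 'prometheus/client'

registry = Prometheus::Client.registry

queue_depth = registry.gauge(
  :good_job_queue_depth,
  docstring: 'Unfinished jobs per queue',
  labels: [:queue]
)
queue_latency = registry.gauge(
  :good_job_queue_latency_seconds,
  docstring: 'Age of the oldest unfinished job per queue',
  labels: [:queue]
)

# Refresh the gauges from the jobs table (model and columns are assumptions).
GoodJob::Job.where(finished_at: nil).group(:queue_name).count.each do |queue, count|
  queue_depth.set(count, labels: { queue: queue })
end

GoodJob::Job.where(finished_at: nil).group(:queue_name).minimum(:scheduled_at).each do |queue, oldest|
  next if oldest.nil? # jobs enqueued for immediate execution may have no scheduled_at

  queue_latency.set(Time.current - oldest, labels: { queue: queue })
end
```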
Does anything similar exist for GoodJob? (I couldn't find anything.)
If not, how do people typically monitor a good_job deployment in production?
Cheers,
Gregory