Skip to content

Latest commit

 

History

History
156 lines (99 loc) · 7.44 KB

README.md

File metadata and controls

156 lines (99 loc) · 7.44 KB

The Nurse

Alert whoever you want when your apps are in a bad shape. It uses Sickbay for app monitoring.

Code Climate Build Status Coverage Status

How does it work?

  1. Register the many apps you want to be monitored (with Name, URL to be checked and the HTTP statuses that indicate your app is fine).
  • Every X minutes (completely up to you) The Nurse checks your apps
  • If N of last M requests (again, completely up to you) returns a status code different from the one you expect, The Nurse will warn the Doctor about it.
    • This warn is a POST request containing the name of the service, its URL and the last M HTTP codes received. This POST will be sent to whoever URL you want.

Notice: The app also registers an entry you your DB for each health check. This way you can easily go back in time and check how was your app at any given time.

Why?

The Nurse can be used to trigger a Kill Switch mechanism in your app: When your app receives the The Nurse's request into some endpoint, it stops some critical and automatic procedure to keep going.

This can be extremely useful when dealing with a microservice architecture or when you app depends on external services.

The Nurse can be also be used as a way to monitoring your apps and warn the right people when something is bad.

Setting up

This setup assumes you have a proper Ruby workspace setted up with:

  • Ruby 2.3.1
  • Rails 5.0.0.1
  • PostgreSQL
  • Redis

Just run:

$ git clone http://github.com/IgorMarques/The-Nurse
$ cd The-Nurse
$ bundle install
$ rake db:create db:migrate

Configuring the app

The app runs just fine for demo right out of the box (you just need to register some apps). But before putting your instance of The Nurse into production, remember to set it properly for your own needs.

Registering apps

Using rails console (don't worry, we have plans to add a proper web interface in the near future), create the apps/services you want to monitor. To run the console, run:

$ bundle exec rails console

And to create the apps, run this inside the console:

Service.create(name: 'ExampleService', url: 'www.example-service.com/health', allowed_codes: [200])

NOTICE: The allowed_codes field is an array

Now your app will be properly monitored once you run the app.

Your Sickbay instance

By default, The Nurse uses my instance of Sickbay on Heroku (on a free tier plan) to run the checks. If you plan on using this app for real, please set your own Sickbay instance. The deploy on Heroku is pretty straightforward (you literally just need to push the code there).

After the setup, remember to set the ENV variable SICKBAY_URL to the proper URL.

Monitoring frequency

By default, The Nurse will check the Sickbay instance every minute. You can change this by setting up the ENV variable HEALTH_CHECK_RATE to the time in minutes you desire.

Outage criteria

By default, if 2 in the last 3 checks to the endpoint of the service return a value that is not present in the allowed_codes list, The Nurse will notify your Doctor endpoint. You can custom set both values by setting up the ENV variables ENTRIES_FETCHED and ENTRIES_OK.

Unregistering apps

You can disable the monitoring for a specific app setting its active attribute to false. Only apps with the value true are checked.

Warning whoever you want

Just set the variable DOCTOR_URL to whoever app should be notified when an outage happens. This URL should be able to receive a proper POST HTTP request with the params like:

{
  "service_name": "TheFailingService",
  "service_url": "www.this_service_failed.com/health",
  "codes": ["200", "500", "500"]
}

Running

Once everything is setted up, this will run your healthchecks :)

$ foreman start

This will start all the components of the app:

You can also start each component alone. Check the Procfile for more info.

Other use cases

As mentioned earlier, you can use The Nurse to check the health at your app at any given time The Nurse was paying attention to it.

All health checks are stored into Statuses entries. Feel free to run the SQL or active record queries you like to fetch whatever data you want.

Example:

2.3.1 :001 > Service.first.statuses

  Service Load (28.0ms)  SELECT  "services".* FROM "services" ORDER BY "services"."id" ASC LIMIT $1  [["LIMIT", 1]]
  Status Load (55.4ms)  SELECT "statuses".* FROM "statuses" WHERE "statuses"."service_id" = $1  [["service_id", 1]]
 => #<ActiveRecord::Associations::CollectionProxy [#<Status id: 1, code: 200, service_id: 1, created_at: "2016-11-29 19:25:31", updated_at: "2016-11-29 19:25:31">, #<Status id: 3, code: 200, service_id: 1, created_at: "2016-11-30 17:04:08", updated_at: "2016-11-30 17:04:08">, #<Status id: 6, code: 200, service_id: 1, created_at: "2016-11-30 17:04:59", updated_at: "2016-11-30 17:04:59">, #<Status id: 9, code: 200, service_id: 1, created_at: "2016-11-30 17:05:58", updated_at: "2016-11-30 17:05:58">, #<Status id: 12, code: 200, service_id: 1, created_at: "2016-11-30 17:06:59", updated_at: "2016-11-30 17:06:59">]>

You can do the same for outages:

2.3.1 :013 > Service.first.statuses
  Service Load (0.3ms)  SELECT  "services".* FROM "services" ORDER BY "services"."id" ASC LIMIT $1  [["LIMIT", 1]]
  Status Load (40.5ms)  SELECT "statuses".* FROM "statuses" WHERE "statuses"."service_id" = $1  [["service_id", 1]]
 => #<ActiveRecord::Associations::CollectionProxy [#<Status id: 1, code: 200, service_id: 1, created_at: "2016-11-29 19:25:31", updated_at: "2016-11-29 19:25:31">, #<Status id: 3, code: 200, service_id: 1, created_at: "2016-11-30 17:04:08", updated_at: "2016-11-30 17:04:08">, #<Status id: 6, code: 200, service_id: 1, created_at: "2016-11-30 17:04:59", updated_at: "2016-11-30 17:04:59">, #<Status id: 9, code: 200, service_id: 1, created_at: "2016-11-30 17:05:58", updated_at: "2016-11-30 17:05:58">, #<Status id: 12, code: 200, service_id: 1, created_at: "2016-11-30 17:06:59", updated_at: "2016-11-30 17:06:59">]>

Testing

This app uses Rspec for testing. To run the test suit:

$ rspec

Deploying

This project is compatible with heroku. Following their tutorial should be enough. You'll need at least one paid dyno, since the free plans only support up to two (and we have three components: the server, Sidekiq and Clockwork). Also remember to properly config a Redis to Go addon.

Plans for the future and contributing

There's still a lot to be done. Here are some features planed:

  • Web interface with the live status of each registered service
  • Web interface for managing (creating, editing, deleting, etc) services
  • Support for reading data from multiple Sickbay instances at once

Feel free to contribute with a PR :)