Stage 5: Monitoring and Observability

Past due by 12 months, 0% complete

Define metrics and monitor the system in production to detect errors and potential issues

Using a monitoring system made up of a server, collection agents, and a monitoring console, deploy the agents to collect the defined indicators (metrics).

  • Configure monitoring dashboards to track key metrics, and configure alerts to flag potential issues.
  • Set threshold values and create relevant alarms based on the measured metrics.
  • Define and create automatic actions in response to these alarms.
  • When the console shows a fault on an indicator, determine the cause of the problem.
  • Regularly report to developers on the performance statistics of their applications in production.
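
The threshold, alarm, and automatic-action steps above can be sketched in a few lines of Python. This is a minimal illustration, not the API of any particular monitoring product: the metric names (`cpu_percent`, `error_rate`), the threshold values, and the action functions are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlarmRule:
    metric: str                           # name of the metric to watch (hypothetical)
    threshold: float                      # the alarm fires when the value exceeds this
    action: Callable[[str, float], str]   # automatic response to the alarm


def notify_on_call(metric: str, value: float) -> str:
    # Placeholder action: a real one would page someone or post a message.
    return f"on-call notified ({metric}={value})"


def restart_service(metric: str, value: float) -> str:
    # Placeholder action: a real one would call an orchestrator or a script.
    return f"restart requested ({metric}={value})"


def evaluate(rules: List[AlarmRule], sample: Dict[str, float]) -> List[str]:
    """Run the action of every rule whose metric exceeds its threshold."""
    results = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and value > rule.threshold:
            results.append(rule.action(rule.metric, value))
    return results


# Illustrative thresholds only; real values depend on the application.
RULES = [
    AlarmRule("cpu_percent", 90.0, notify_on_call),
    AlarmRule("error_rate", 0.05, restart_service),
]

if __name__ == "__main__":
    # One sample as a collection agent might report it.
    print(evaluate(RULES, {"cpu_percent": 95.0, "error_rate": 0.01}))
```

Keeping each rule as data (metric, threshold, action) means thresholds can be tuned without touching the evaluation loop, and the same list can feed the periodic report to developers.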

Bonus step

  • Define metrics ("health points") relevant to the application.
  • Create a dashboard to make sense of the collected metrics.
  • Define the thresholds beyond which the architecture and/or the application is considered to be malfunctioning; crossing a threshold must trigger an alarm.
  • Define automatic actions in response to these alarms.
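
As a rough sketch of the bonus step, the snippet below declares a few health points and renders a plain-text status view of the kind a dashboard might show. The health point names, units, and limits are hypothetical, and a missing reading is reported as `UNKNOWN` rather than treated as healthy, which is a design choice of this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class HealthPoint:
    name: str      # hypothetical indicator name
    unit: str      # unit shown on the dashboard
    max_ok: float  # readings above this count as a malfunction


# Illustrative health points; a real list depends on the application.
HEALTH_POINTS = [
    HealthPoint("response_time", "ms", 500.0),
    HealthPoint("disk_usage", "%", 85.0),
]


def status(point: HealthPoint, value: Optional[float]) -> str:
    """Classify one reading against the health point's limit."""
    if value is None:
        return "UNKNOWN"  # no data collected for this health point
    return "ALARM" if value > point.max_ok else "OK"


def render_dashboard(readings: Dict[str, float]) -> str:
    """Build one text line per health point: name, reading, status."""
    lines = []
    for point in HEALTH_POINTS:
        value = readings.get(point.name)
        shown = "n/a" if value is None else f"{value:g} {point.unit}"
        lines.append(f"{point.name:<15}{shown:<10}{status(point, value)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_dashboard({"response_time": 620.0}))
```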