title | description | layout | category | subcategory | toc_h_max |
---|---|---|---|---|---|
CloudWatch Insights 101 |
Basic guide to querying against our logs with CloudWatch Insights |
article |
Reporting |
CloudWatch |
4 |
You will need an AWS account and access. See [setting up aws-vault][aws-vault] for more information.
[aws-vault]: {% link _articles/platform-setting-up-aws-vault.md %}
-
Log in to AWS
- Via terminal:
# depending on the roles you have pick between aws-vault login prod-analytics # or aws-vault login prod-power
- Via AWS GUI:
The console login method does not allow a user to switch roles or environments (ie,
sandbox-power
). If you know you will need to switch back and forth, it may be easier to log in via the AWS GUI.- Navigate to AWS
- Click the
Sign In to the Console
button on the top right - Input Account ID as
login-master
, and your IAM username and password - Click
Sign In
- Generate an MFA Code with your Yubikey
- Enter MFA code
- Click
Submit
- Click on the top right user dropdown and click on
Switch role
to log into the appropriate role.
-
Search for "CloudWatch"
![screenshot showing how to search for CloudWatch in the AWS Console]({{ site.baseurl }}/images/aws-cloudwatch-home.png)
-
Select "Logs Insights"
![screenshot showing the Logs Insights menu item in CloudWatch]({{ site.baseurl }}/images/aws-cloudwatch-logs-insights.png)
![screenshot of the query interface for CloudWatch Insights]({{ site.baseurl }}/images/aws-cloudwatch-query.png)
-
Make sure to select a log group. For most queries, we want
prod_/srv/idp/shared/log/events.log
. In the "Select up to 50 log groups" combobox, type in "events.log" to filter down the list and select the prod_ one. -
Set the time range. For consistency across timezones, we recommend the UTC timezone.
-
Write your query, and hit "Run Query"
If you are comfortable with the command line, you can also use our [query-cloudwatch][query-cloudwatch] script, which can be useful for batch processing.
{% component alert type=:info %} See [Analytics Events]({% link _articles/analytics-events.md %}) for the most up-to-date documentation of individual events and their fields. {% endcomponent %}
This query filters down to one event, "SP redirect initiated":
fields @timestamp, name, properties.event_properties.success, @message
| filter name = 'SP redirect initiated'
| limit 10000
This query filters to any one of multiple events:
fields @timestamp, name, properties.event_properties.success, @message
| filter name in ['SP redirect initiated', 'User registration: complete']
| limit 10000
The Login.gov internal UUID is logged as properties.user_id
, so we can filter based on that.
A query for one user:
fields @timestamp, name, properties.event_properties.success, @message
| filter properties.user_id = 'aaaa-bbbb-cccc-dddd'
| limit 10000
A query for multiple users:
fields @timestamp, name, properties.event_properties.success, @message
| filter properties.user_id in ['aaaa-bbbb-cccc-dddd', '1111-2222-3333-4444']
| limit 10000
The AWS documentation for CloudWatch Insights is fairly thorough. If anything is left unanswered after reading this guide, that documentation is likely to contain the answer.
If you've used SQL, a lot of concepts map to CloudWatch Insights queries nicely:
SQL | CloudWatch Insights |
---|---|
SELECT |
fields or display |
WHERE |
filter |
GROUP BY |
stats ... by |
ORDER BY |
sort |
LIMIT |
limit |
CloudWatch Insights does not support joins
Use the [query-cloudwatch][query-cloudwatch] script with the --complete
option, it will
automatically break re-query for shorter time ranges until it has all matching rows.
CloudWatch Insights gives up on parsing after there are too many fields in a blob. To work around it, export the full @message
field and parse it externally, such as in Ruby
Similar to too many fields, this can be worked around by parsing externally.
Another approach is to use the parse
command to manually parse out a specific field
Briefly: Do not use count_distinct
, it is inaccurate
From the AWS documentation on count_distinct
:
Returns the number of unique values for the field. If the field has very high cardinality (contains many unique values), the value returned by count_distinct is just an approximation.
To work around this, there are a few options
-
Consider doing post processing externally, the [
query-cloudwatch
][query-cloudwatch] script has a--count-distinct
option to streamline this -
If there are fewer than 10,000 things, you can use plain
count(*)
:| stats count(*) by PROPERTY_TO_COUNT
[query-cloudwatch]: {% link _articles/devops-scripts.md %}#query-cloudwatch
- Devops CloudWatch guide
- [Analytics Events Documentation][analytics-events]
- AWS documentation for CloudWatch Insights
[analytics-events]: {% link _articles/analytics-events.md %}