Added logs schema with a note for data warehouse (#569)
* added logs schema with a note for data warehouse
* added unextracted word that was being used for a table name prefix in the data warehouse
1 parent 97c9961, commit 23cdaa7
Showing 2 changed files with 93 additions and 0 deletions.
@@ -0,0 +1,92 @@
---
title: "Data Warehouse Log Tables Schema"
description: "Schema definitions for the log tables in the data warehouse"
layout: article
category: "Reporting"
subcategory: "Data Warehouse"
---

# Data Warehouse Logs Schema

This guide provides the schema definitions for the four log tables in our data warehouse:

- `logs.events`
- `logs.production`
- `logs.unextracted_events`
- `logs.unextracted_production`

## logs.production

The `logs.production` table contains the following fields:

- `cloudwatch_timestamp`
- `message`
- `uuid` (primary key)
- `method`
- `path`
- `format`
- `controller`
- `action`
- `status`
- `duration`
- `git_sha`
- `git_branch`
- `timestamp`
- `pid`
- `user_agent`
- `ip`
- `host`
- `trace_id`

## logs.events

The `logs.events` table contains the following fields:

- `cloudwatch_timestamp`
- `message`
- `id` (primary key)
- `name`
- `time`
- `visitor_id`
- `visit_id`
- `log_filename`
- `new_event`
- `path`
- `user_id`
- `locale`
- `user_ip`
- `hostname`
- `pid`
- `service_provider`
- `trace_id`
- `git_sha`
- `git_branch`
- `user_agent`
- `browser_name`
- `browser_version`
- `browser_platform_name`
- `browser_platform_version`
- `browser_device_name`
- `browser_mobile`
- `browser_bot`
- `success`

## logs.unextracted_events

The `logs.unextracted_events` table contains the following fields:

- `cloudwatch_timestamp`
- `message`

## logs.unextracted_production

The `logs.unextracted_production` table contains the following fields:

- `cloudwatch_timestamp`
- `message`

> **NOTE:** At present, only log messages that are valid JSON land in the data warehouse tables. For example, production log messages that contain a Ruby hash are ignored.
>
> Below is an example of a log line that will NOT be ingested:
>
> `2024-06-10T17:10:15.234Z;"{:name=>""unused_identity_config_keys"", :keys=>[:ab_testing_idv_ten_digit_otp_enabled, :ab_testing_idv_ten_digit_otp_percent, :acuant_timeout, :disallow_all_web_crawlers, :doc_auth_exit_question_section_enabled, :doc_auth_selfie_capture_enabled, :platform_authentication_enabled, :phone_recaptcha_mock_validator]}"`
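
The valid-JSON rule in the note can be illustrated with a minimal Python sketch. The function name `is_valid_json_message` and both sample messages are hypothetical, chosen only for illustration; the actual ingestion filter is not shown in this commit and may apply further checks.

```python
import json


def is_valid_json_message(message: str) -> bool:
    """Return True if the log message parses as JSON.

    Hypothetical helper: it mirrors the rule stated in the note,
    not the data warehouse's real ingestion code.
    """
    try:
        json.loads(message)
    except json.JSONDecodeError:
        return False
    return True


# Illustrative message using a few of the logs.production field names.
json_message = '{"method": "GET", "path": "/", "status": 200}'

# Shortened Ruby-hash message in the style of the example in the note.
ruby_hash_message = '{:name=>"unused_identity_config_keys", :keys=>[:acuant_timeout]}'

print(is_valid_json_message(json_message))
print(is_valid_json_message(ruby_hash_message))
```

Under this sketch the first message passes the check, while the Ruby-hash message fails because `:name=>` is not valid JSON syntax.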
@@ -46,6 +46,7 @@
"ThreatMetrix",
"touchpoints",
"triaging",
"unextracted",
"wargame",
"wargames",
"wireframe",