Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TT-2539] Transaction logs #6841

Merged
merged 22 commits into from
Jan 22, 2025
Merged

[TT-2539] Transaction logs #6841

merged 22 commits into from
Jan 22, 2025

Conversation

LLe27
Copy link
Contributor

@LLe27 LLe27 commented Jan 21, 2025

User description

TT-2539
Summary Transaction/Access Logs
Type Story Story
Status In Dev
Points N/A
Labels A, America's, CSE, Gold, QA_Fail, customer_request, innersource, jira_escalated

FR Jira Ticket

https://tyktech.atlassian.net/browse/TT-2539

Description

  • Added the TYK_GW_ACCESSLOGS_ENABLED Gateway config option

    • The Tyk Gateway will determine to print access logs to STDOUT for both success and error handling situations
    • If the TYK_GW_ACCESSLOGS_ENABLED is set to true then the Gateway will print access logs to STDOUT
    • If the TYK_GW_ACCESSLOGS_ENABLED is set to false then the Gateway will not print access logs to STDOUT
  • Added the TYK_GW_ACCESSLOGS_TEMPLATE Gateway config option

    • If the TYK_GW_ACCESSLOGS_TEMPLATE is contains the following options below then the Custom Log Template will be used to print out the access logs
    • api_key will include they obfuscated or hashed key.
    • client_ip will include the ip of the request.
    • host will include the host of the request.
    • latency_total will include the total latency of the request.
    • method will include the request method.
    • path will include the path of the request.
    • protocol will include the protocol of the request.
    • remote_addr will include the remote address of the request.
    • status will include the response status code.
    • upstream_address will include the upstream address (scheme, host and path)
    • upstream_latency will include the upstream latency of the request.
    • use_agent will include the user agent of the request.

Note that this feature is off by default and that the AccessLog struct only contains the more common elements. Below are some examples of an access log

time="Jan 20 11:10:57" level=debug msg="Using CUSTOM access log template" api_id=bbcde1bde9a946af4695bb53b615de93 api_name=httpbin org_id=678e6771247d80fd2c435bf3
time="Jan 20 11:10:57" level=info api_id=bbcde1bde9a946af4695bb53b615de93 api_key=82ebbc4b34021b9f api_name=httpbin client_ip="::1" client_remote_addr="[::1]:65000" host="localhost:8080" method=GET org_id=678e6771247d80fd2c435bf3 prefix=access-log protocol=HTTP/1.1 request_url=/get status_code=200 total_latency=116 upstream_address="http://httpbin.org/get" upstream_latency=115 user_agent=PostmanRuntime/7.43.0

Related Issue

Motivation and Context

Today the Tyk Gateway does not print access logs for success API calls but instead only for error API calls. Providing access logs for both scenarios within the Tyk Gateway is extremely valuable especially if you are monitoring logs, capturing analytics or even debugging. Providing the option to turn on or off the Tyk Gateway access logs will provide clients more insights in for API calls in regards to success and error situations.

How This Has Been Tested

  • Manual testing
  • Unit testing
  • Performance testing/benchmarks

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring or add test (improvements in base code or adds test coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning why it's required
  • I would like a code coverage CI quality gate exception and have explained why

PR Type

Enhancement, Tests


Description

  • Introduced a new AccessLogsConfig for transaction logging.

  • Added functionality to log access details for success and error handlers.

  • Implemented a new accesslog package for structured logging.

  • Enhanced test coverage with unit and benchmark tests for access logs.


Changes walkthrough 📝

Relevant files
Enhancement
5 files
config.go
Added `AccessLogsConfig` for transaction logging configuration.
+29/-0   
handler_error.go
Added access log handling for error responses.                     
+7/-0     
handler_success.go
Added access log handling for successful responses.           
+7/-0     
middleware.go
Implemented `recordAccessLog` for structured transaction logging.
+53/-0   
access_log.go
Introduced `accesslog` package for structured access log handling.
+170/-0 
Tests
2 files
access_log_test.go
Added unit tests for `accesslog` package.                               
+50/-0   
access_log_test.go
Added benchmark tests for access log performance.               
+85/-0   
Configuration changes
2 files
schema.json
Updated schema to include `access_logs` configuration.     
+12/-0   
Taskfile.yml
Updated test coverage configuration for `accesslog` package.
+2/-1     

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

@buger
Copy link
Member

buger commented Jan 21, 2025

I'm a bot and I 👍 this PR title. 🤖

Copy link
Contributor

github-actions bot commented Jan 21, 2025

API Changes

--- prev.txt	2025-01-22 13:47:12.142787349 +0000
+++ current.txt	2025-01-22 13:47:07.469731070 +0000
@@ -4904,6 +4904,33 @@
 
 TYPES
 
+type AccessLogsConfig struct {
+	// Enabled controls enabling the transaction logs. Default: false.
+	Enabled bool `json:"enabled"`
+
+	// Template configures which fields to log into the access log.
+	// If unconfigured, all fields listed will be logged.
+	//
+	// Example: ["client_ip", "path"].
+	//
+	// Template Options:
+	//
+	// - `api_key` will include they obfuscated or hashed key.
+	// - `client_ip` will include the ip of the request.
+	// - `host` will include the host of the request.
+	// - `method` will include the request method.
+	// - `path` will include the path of the request.
+	// - `protocol` will include the protocol of the request.
+	// - `remote_addr` will include the remote address of the request.
+	// - `upstream_address` will include the upstream address (scheme, host and path)
+	// - `upstream_latency` will include the upstream latency of the request.
+	// - `latency_total` will include the total latency of the request.
+	// - `user_agent` will include the user agent of the request.
+	// - `status` will include the response status code.
+	Template []string `json:"template"`
+}
+    AccessLogsConfig defines the type of transactions logs printed to stdout.
+
 type AnalyticsConfigConfig struct {
 	// Set empty for a Self-Managed installation or `rpc` for multi-cloud.
 	Type string `json:"type"`
@@ -5356,6 +5383,10 @@
 	// If not set or left empty, it will default to `standard`.
 	LogFormat string `json:"log_format"`
 
+	// AccessLogs configures the output for access logs.
+	// If not configured, the access log is disabled.
+	AccessLogs AccessLogsConfig `json:"access_logs"`
+
 	// Section for configuring OpenTracing support
 	// Deprecated: use OpenTelemetry instead.
 	Tracer Tracer `json:"tracing"`
@@ -8212,6 +8243,10 @@
 
 func (t *BaseMiddleware) OrgSessionExpiry(orgid string) int64
 
+func (t *BaseMiddleware) RecordAccessLog(req *http.Request, resp *http.Response, latency analytics.Latency)
+    RecordAccessLog is used for Success/Error handler logging. It emits a log
+    entry with populated access log fields.
+
 func (t *BaseMiddleware) SetName(name string)
 
 func (t *BaseMiddleware) SetOrgExpiry(orgid string, expiry int64)
@@ -12607,6 +12642,8 @@
     manipulate
 
 func MyPluginReturningError(rw http.ResponseWriter, r *http.Request)
+# Package: ./tests/accesslog
+
 # Package: ./tests/coprocess
 
 package coprocess // import "github.com/TykTechnologies/tyk/tests/coprocess"

Copy link
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 Security concerns

Sensitive information exposure:
The WithApiKey function in access_log.go processes API keys, which are either obfuscated or hashed. Ensure that the obfuscation and hashing mechanisms are secure and comply with best practices to prevent sensitive information exposure.

⚡ Recommended focus areas for review

Possible Misconfiguration

The AccessLogsConfig struct introduces a new Template field for customizing access logs. Ensure that the default behavior and template validation logic are robust and handle edge cases, such as invalid or empty templates.

// AccessLogsConfig defines the type of transactions logs printed to stdout.
type AccessLogsConfig struct {
	// Enabled controls enabling the transaction logs. Default: false.
	Enabled bool `json:"enabled"`

	// Template defaults to empty which prints the default log.
	// Set this value to determine which fields will be printed in the access log.
	// Example: ["..."].
	//
	// Template Options:
	// - `api_key` will include they obfuscated or hashed key.
	// - `client_ip` will include the ip of the request.
	// - `host` will include the host of the request.
	// - `latency_total` will include the total latency of the request.
	// - `method` will include the request method.
	// - `path` will include the path of the request.
	// - `protocol` will include the protocol of the request.
	// - `remote_addr` will include the remote address of the request.
	// - `status` will include the response status code.
	// - `upstream_address` will include the upstream address (scheme, host and path)
	// - `upstream_latency` will include the upstream latency of the request.
	// - `use_agent` will include the user agent of the request.
	Template []string `json:"template"`
}
Logging Performance

The recordAccessLog function processes and filters fields for access logs. Verify that this implementation does not introduce significant performance overhead, especially under high traffic conditions.

// recordAccessLog is only used for Success/Error handler
func (t *BaseMiddleware) recordAccessLog(req *http.Request, resp *http.Response, latency *analytics.Latency) {
	hashKeys := t.Gw.GetConfig().HashKeys
	accessLog := accesslog.NewRecord()
	allowedFields := t.Gw.GetConfig().AccessLogs.Template
	fields := accessLog.Fields()

	// Set the access log fields
	accessLog.WithApiKey(req, hashKeys, t.Gw.obfuscateKey)
	accessLog.WithClientIP(req)
	accessLog.WithRemoteAddr(req)
	accessLog.WithHost(req)
	accessLog.WithLatencyTotal(latency)
	accessLog.WithMethod(req)
	accessLog.WithPath(req)
	accessLog.WithProtocol(req)
	accessLog.WithStatus(resp)
	accessLog.WithUpstreamAddress(req)
	accessLog.WithUpstreamLatency(latency)
	accessLog.WithUserAgent(req)

	// Check for template config
	if isValidAccessLogTemplate(allowedFields) {
		// Filter based on custom log template
		t.logger.Debug("Using CUSTOM access log template")
		filteredFields := accesslog.Filter(&fields, allowedFields)
		t.Logger().WithFields(*filteredFields).Info()
	} else {
		// Default access log
		t.logger.Debug("Using DEFAULT access log template")
		t.Logger().WithFields(fields).Info()
	}
}

func isValidAccessLogTemplate(template []string) bool {
	// Check if template is empty
	if len(template) == 0 {
		return false
	}

	// Check if there is at least one non-empty config string
	for _, conf := range template {
		if strings.TrimSpace(conf) != "" {
			return true
		}
	}
	return false
}
Field Filtering Logic

The Filter function in access_log.go filters log fields based on allowed templates. Ensure that this logic is efficient and correctly handles edge cases, such as case sensitivity and unexpected input.

// Filter filters the input logrus fields and retains only the allowed fields.
func Filter(in *logrus.Fields, allowedFields []string) *logrus.Fields {
	// Create a map to quickly check if a field is allowed.
	allowed := make(map[string]struct{}, len(allowedFields))
	for _, field := range allowedFields {
		allowed[strings.ToLower(field)] = struct{}{}
	}

	// Create a new logrus.Fields to store the filtered fields.
	filtered := logrus.Fields{}

	// Add the "prefix" field by default, if it exists in the input
	if prefix, exists := (*in)["prefix"]; exists {
		filtered["prefix"] = prefix
	}

	// Filter keys based on config
	for key, value := range *in {
		if _, exists := allowed[strings.ToLower(key)]; exists {
			filtered[key] = value
		}
	}

	return &filtered
}

Copy link
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Score
Possible issue
Prevent nil pointer dereference in URL handling

Add a check to ensure that the req.URL is not nil in WithUpstreamAddress to prevent
potential nil pointer dereferences.

internal/httputil/accesslog/access_log.go [114-120]

-upstreamAddress := &url.URL{
-    Scheme: req.URL.Scheme,
-    Host:   req.URL.Host,
-    Path:   req.URL.Path,
+if req.URL != nil {
+    upstreamAddress := &url.URL{
+        Scheme: req.URL.Scheme,
+        Host:   req.URL.Host,
+        Path:   req.URL.Path,
+    }
+    a.fields["upstream_address"] = upstreamAddress.String()
 }
Suggestion importance[1-10]: 9

Why: The suggestion addresses a potential nil pointer dereference in WithUpstreamAddress by adding a check for req.URL. This is a valid and impactful improvement that enhances the robustness of the code.

9
General
Add test for nil URL handling

Include test cases to validate behavior when req.URL is nil in WithUpstreamAddress
to ensure the function handles such scenarios gracefully.

internal/httputil/accesslog/access_log_test.go [28-30]

 req := httptest.NewRequest(http.MethodGet, "http://example.com/path?userid=1", nil)
 req.RemoteAddr = "0.0.0.0"
+req.URL = nil // Simulate nil URL scenario
Suggestion importance[1-10]: 8

Why: The suggestion proposes adding a test case to validate behavior when req.URL is nil, which complements the robustness improvement suggested for WithUpstreamAddress. This is a valid and useful addition to ensure the function handles edge cases gracefully.

8

@titpetric titpetric changed the title Feature/tt 2539/transaction logs [TT-2539] Transaction logs Jan 21, 2025
@LLe27 LLe27 force-pushed the feature/tt-2539/transaction_logs branch from 106aaa9 to 50f0820 Compare January 22, 2025 13:46
@LLe27 LLe27 merged commit 5a9c0ff into master Jan 22, 2025
30 of 41 checks passed
@LLe27 LLe27 deleted the feature/tt-2539/transaction_logs branch January 22, 2025 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants