Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use structured logging #50

Merged
merged 1 commit into from
Nov 19, 2024

Conversation

uablrek
Copy link
Contributor

@uablrek uablrek commented Jul 3, 2024

Use structured logging in json format if -logging-format=json is specified. The same setup functions as for K8s core packages are used (component-base).

Also, contextual logging is used (beta in 1.30, so no feature-gateway is needed).

Fixes #48

This PR will be a draft until all code is changed to use structured/contextual logging.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 3, 2024
@k8s-ci-robot k8s-ci-robot requested review from aojea and danwinship July 3, 2024 10:49
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 3, 2024
@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2024

/cc @pohly

@pohly Can you please review the updates in main() in the first commit? I don't want to do any fundamental mistakes. The rest I think I can manage.

@k8s-ci-robot k8s-ci-robot requested a review from pohly July 3, 2024 10:51
@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2024

Example output:

# kubectl logs -n kube-system   kube-network-policies-8hb7p | jq
...
{
  "ts": 1720002972865.8306,
  "caller": "cache/reflector.go:359",
  "msg": "Caches populated for *v1.Pod from pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:232",
  "v": 2
}
{
  "ts": 1720002972953.2607,
  "caller": "cache/shared_informer.go:320",
  "msg": "Caches are synced for kube-network-policies",
  "v": 0
}
{
  "ts": 1720002972953.3674,
  "caller": "networkpolicy/metrics.go:61",
  "msg": "Registering metrics",
  "v": 0
}
{
  "ts": 1720002972953.4724,
  "caller": "networkpolicy/controller.go:344",
  "msg": "Syncing nftables rules",
  "v": 0
}

@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2024

Lint complaints:

Error: cmd/main.go:81:3: exitAfterDefer: os.Exit will exit, and `defer cancel()` will not run (gocritic)

This is just as bad with klog.Fatalf, but lint misses that. A pattern seem to be:

func main() {
  os.Exit(run())
}
func run() int {
  return errorCode
}

Shall we use it? Else I restore klog.Fatalf in main()

I will call cancel() explicitly. (but since it's right before exit, it will not really matter)

@aojea
Copy link
Contributor

aojea commented Jul 3, 2024

func main() {
  os.Exit(run())
}
func run() int {
  return errorCode
}

that sounds like the right way to do it

@aojea
Copy link
Contributor

aojea commented Jul 9, 2024

@uablrek what is missing here? it lgtms

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 9, 2024
@uablrek
Copy link
Contributor Author

uablrek commented Jul 10, 2024

@uablrek what is missing here? it lgtms

Lots and lots of Infof(), and friends, should be changed to InfoS().

Also i was planning to introduce contextual logging, but I think that can wait.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 11, 2024
@uablrek
Copy link
Contributor Author

uablrek commented Jul 11, 2024

Structured logging allows a nice optimization possibility. Without this p.String() is always called, regardless of verbosity.

	// evaluatePacket() should be fast unless trace logging is enabled.
	// Logging optimization: We check if V(2) is enabled before hand,
	// rather than evaluating the all parameters make an unnecessary logger call
	tlogger := logger.V(2)
	if tlogger.Enabled() {
		tlogger.Info("Evaluating packet", "packet", p)
		tlogger = tlogger.WithValues("id", p.id)
	}

@@ -371,7 +373,7 @@ func (c *Controller) Run(ctx context.Context) error {

nf, err := nfqueue.Open(&config)
if err != nil {
klog.Infof("could not open nfqueue socket: %v", err)
logger.Info("could not open nfqueue socket", "error", err)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging like this exist at several places. I am unsure how to handle them. A general recommendation is to not log errors if you return them (let the caller decide). And normally logger.Error(err, ...) is used for logging errors, but then it will be at a higher log-level. I made a compromise by keeping the log at Info level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A way to keep info about what caused the error is to wrap them:

return fmt.Errorf("could not open nfqueue socket %w", err)

(ref: recent discussion on slack)

@uablrek
Copy link
Contributor Author

uablrek commented Jul 12, 2024

Example packet log in JSON format:

{
  "ts": 1720777151838.3552,
  "caller": "networkpolicy/controller.go:461",
  "msg": "Evaluating packet",
  "v": 2,
  "packet": {
    "Id": 9,
    "Family": "IPv4",
    "SrcIP": "11.0.3.3",
    "DstIP": "11.0.2.2",
    "Proto": "TCP",
    "SrcPort": 33803,
    "DstPort": 6000
  }
}

Text output is not altered:

I0712 09:53:52.257809       1 controller.go:461] "Evaluating packet" packet=<
        [8] 11.0.1.3:40243 11.0.2.2:6000 TCP
        00000000  00 00 a0 02 fa f0 19 33  00 00 02 04 05 b4 04 02  |.......3........|
        00000010  08 0a c1 d9 c1 bd 00 00  00 00 01 03 03 07        |..............|
 >

@uablrek uablrek marked this pull request as ready for review July 12, 2024 16:19
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 12, 2024
@k8s-ci-robot k8s-ci-robot requested a review from thockin July 12, 2024 16:19
@uablrek
Copy link
Contributor Author

uablrek commented Jul 12, 2024

I will squash commits later

@aojea
Copy link
Contributor

aojea commented Jul 12, 2024

let me know once is ready for merging

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 17, 2024
cmd/main.go Outdated
ctx, cancel := signal.NotifyContext(
context.Background(), os.Interrupt, unix.SIGINT)
defer cancel()
logger := klog.NewKlogr()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed and might never give you a JSON logger.

Leave the context as it is. A klog.FromContext will then get the global default prepared by logsapi.ValidateAndApply.

Suggested change
logger := klog.NewKlogr()
logger := klog.FromContext(ctx)

cmd/main.go Outdated
context.Background(), os.Interrupt, unix.SIGINT)
defer cancel()
logger := klog.NewKlogr()
ctx = klog.NewContext(ctx, logger)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this.

@uablrek uablrek force-pushed the structured-logging branch from babda54 to a247ecc Compare July 21, 2024 08:17
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 21, 2024
@uablrek uablrek force-pushed the structured-logging branch 2 times, most recently from 468ae28 to 0d6caa3 Compare July 21, 2024 09:39
@uablrek
Copy link
Contributor Author

uablrek commented Jul 21, 2024

Rebased, updated after review, and use klog.KObj. Ready for review.

Please note that payload is empty for TCP/SYN, which is the only TCP packets we see:

I0721 08:56:53.060501       1 controller.go:491] "Evaluating packet" packet=<
        [10] 11.0.4.3:34751 11.0.4.2:6000 TCP
 >

@uablrek
Copy link
Contributor Author

uablrek commented Jul 21, 2024

let me know once is ready for merging

@aojea Ready for review at least. I hope for merging. No logic is altered (except in main()), but very many log printouts

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 4, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 4, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2024
@uablrek
Copy link
Contributor Author

uablrek commented Nov 17, 2024

The flag-logging can be made in different ways:

  1. The "structured object way" (this doesn't look so good for non-json logging):
	flag.VisitAll(func(flag *flag.Flag) {
		logger.Info("flag", "flag", flag)
	})
{
  "ts": 1731837080411.8074,
  "caller": "cmd/main.go:83",
  "msg": "flag",
  "v": 0,
  "flag": {
    "Name": "nfqueue-id",
    "Usage": "Number of the nfqueue used",
    "Value": 100,
    "DefValue": "100"
  }
}
  1. The "name/value way" (this is what I selected)
	flag.VisitAll(func(flag *flag.Flag) {
		logger.Info("flag", "name", flag.Name, "value", flag.Value)
	})
{
  "ts": 1731837858630.011,
  "caller": "cmd/main.go:83",
  "msg": "flag",
  "v": 0,
  "name": "nfqueue-id",
  "value": "100"
}

Non-json logging:

I1117 11:05:07.914817  119614 main.go:83] "flag" name="nfqueue-id" value="100"
  1. The "non structured logging way" (not recommended)
	flag.VisitAll(func(flag *flag.Flag) {
		klog.Infof("FLAG: --%s=%q", flag.Name, flag.Value)
	})
{
  "ts": 1731838042913.3442,
  "caller": "cmd/main.go:84",
  "msg": "FLAG: --nfqueue-id=\"100\"",
  "v": 0
}

Klog is initiated with support for json output, and a context
with a logger is created. The context catches term signals.
Log packets in JSON format for -logging-format=json.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 17, 2024
@uablrek
Copy link
Contributor Author

uablrek commented Nov 17, 2024

The failing test seem unrelated to the updates. I ran e2e:

(BASE_FAMILY=IPv6 PROXY_MODE=nftables)
export SKIP=NOTHING
export FOCUS=Netpol
Ran 50 of 6622 Specs in 132.293 seconds
SUCCESS! -- 50 Passed | 0 Failed | 0 Pending | 6572 Skipped

@uablrek
Copy link
Contributor Author

uablrek commented Nov 18, 2024

/retest

@aojea
Copy link
Contributor

aojea commented Nov 18, 2024

helper.go:123: StatefulSet replicas in namespace network-policy-conformance-forbidden-forrest not rolled out yet. 0/2 replicas are available.

it seems a resource problem, let me retry

@paulgmiller can you please review, since you have been working on this in a separate PR

/assign @paulgmiller

@k8s-ci-robot
Copy link
Contributor

@aojea: GitHub didn't allow me to assign the following users: paulgmiller.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

helper.go:123: StatefulSet replicas in namespace network-policy-conformance-forbidden-forrest not rolled out yet. 0/2 replicas are available.

it seems a resource problem, let me retry

@paulgmiller can you please review, since you have been working on this in a separate PR

/assign @paulgmiller

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@aojea
Copy link
Contributor

aojea commented Nov 18, 2024

/lgtm
/hold

waiting for @paulgmiller review, we can unhold tomorrow if he does not have time to review

@paulgmiller just do /hold cancel if you are happy with the PR

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Nov 18, 2024
Copy link
Contributor

@paulgmiller paulgmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good sorry to be slow had to learn some things about k8s.io/component-base/logs/

@@ -444,11 +446,11 @@ func (c *Controller) Run(ctx context.Context) error {
}

startTime := time.Now()
klog.V(2).Infof("Processing sync for packet %d", *a.PacketID)
logger.V(2).Info("Processing sync for packet", "id", *a.PacketID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use the same tlogger := logger.V(2) if tlogger.Enabled tricke we use here as we use at 517 in evaluage packet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not as urgent in this case since only PacketID is passed (not an entire packet analyzed)

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, paulgmiller, uablrek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@paulgmiller
Copy link
Contributor

/hold cancel

left one perf comment but doesn't seeem worth holding up

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 19, 2024
@k8s-ci-robot k8s-ci-robot merged commit 6bdcaa5 into kubernetes-sigs:main Nov 19, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use structured logging
5 participants