Squash allocations in selector parsing and other hotspots #9564

fasaxc · 2024-12-05T17:40:33Z

Description

Noticed that selector parsing was a memory allocation hotspot in clusters with lots of policy. Most of that came from validating selectors: we parsed each selector only to check that it was valid, then threw the parsed selector away.

Check log level in various places before logging to avoid allocs in logrus.
Add tokenizer.AppendTokens() to tokenize into a pre-allocated buffer. Use a shared instance of the parser to hold the shared token buffer (protected by a mutex).
Add a dedicated Validate() function that calls the parser with a flag telling it not to allocate the selector.
Replace regexes with custom code. The regex engine is relatively slow for these simple matches and it allocates for every call.
Modernise some of the tokenizer code using strings.Cut and friends; makes for easier reading and less string slicing.
Generate a String() method for the token constants.
Add more coverage tests.
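
For readers who want to see the shape of the buffer-reuse idea, here is a minimal sketch. It is not the code from this PR: the `token` type, `appendTokens`, the `validator` struct and the toy grammar (`label == 'value'`) are all hypothetical names invented for illustration. It only shows the general pattern of tokenizing into a caller-supplied slice and guarding a shared buffer with a mutex so that validation does not need to build a result object.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// tokenKind and token are hypothetical stand-ins for the tokenizer's types.
type tokenKind int

const (
	tokLabel tokenKind = iota
	tokEq
	tokString
)

type token struct {
	kind  tokenKind
	value string
}

// Errors are pre-built so the failure paths do not format new messages.
var (
	errUnterminated = errors.New("unterminated string literal")
	errUnexpected   = errors.New("unexpected character")
	errSyntax       = errors.New("expected: label == 'value'")
)

func isLabelChar(c byte) bool {
	return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' ||
		c >= '0' && c <= '9' || c == '_' || c == '-' || c == '.' || c == '/'
}

// appendTokens tokenizes input and appends the tokens to buf, so a caller
// that passes a slice with spare capacity avoids allocating a new one.
func appendTokens(buf []token, input string) ([]token, error) {
	for {
		for len(input) > 0 && input[0] == ' ' {
			input = input[1:]
		}
		if input == "" {
			return buf, nil
		}
		switch {
		case strings.HasPrefix(input, "=="):
			buf = append(buf, token{kind: tokEq})
			input = input[2:]
		case input[0] == '\'':
			end := strings.IndexByte(input[1:], '\'')
			if end < 0 {
				return buf, errUnterminated
			}
			buf = append(buf, token{kind: tokString, value: input[1 : 1+end]})
			input = input[end+2:]
		default:
			n := 0
			for n < len(input) && isLabelChar(input[n]) {
				n++
			}
			if n == 0 {
				return buf, errUnexpected
			}
			buf = append(buf, token{kind: tokLabel, value: input[:n]})
			input = input[n:]
		}
	}
}

// validator holds one token buffer that is shared by all callers.
type validator struct {
	mu  sync.Mutex
	buf []token
}

// Validate checks the selector without building any parsed representation.
func (v *validator) Validate(sel string) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	toks, err := appendTokens(v.buf[:0], sel)
	v.buf = toks[:0] // keep the (possibly grown) backing array for next time
	if err != nil {
		return err
	}
	if len(toks) != 3 || toks[0].kind != tokLabel ||
		toks[1].kind != tokEq || toks[2].kind != tokString {
		return errSyntax
	}
	return nil
}

func main() {
	v := &validator{}
	for _, sel := range []string{"role == 'db'", "role == ", "role = 'db'"} {
		fmt.Printf("%-14q valid=%v\n", sel, v.Validate(sel) == nil)
	}
}
```

The trade-off in a design like this is that the mutex serialises callers, which is acceptable only if validation is short compared with the cost of allocating a fresh buffer per call.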

Before:

goos: linux
goarch: amd64
pkg: github.com/projectcalico/calico/libcalico-go/lib/selector
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkParse-12    	  162223	      6416 ns/op	    2633 B/op	      46 allocs/op
PASS

After:

goos: linux
goarch: amd64
pkg: github.com/projectcalico/calico/libcalico-go/lib/selector
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
BenchmarkParse-12       	  919357	      1209 ns/op	     520 B/op	      22 allocs/op
BenchmarkValidate-12    	 3878425	       293.8 ns/op	       0 B/op	       0 allocs/op
PASS
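
The allocs/op column above comes from the Go benchmark tooling. As a rough way to spot-check allocation counts outside `go test`, `testing.AllocsPerRun` can be called from an ordinary program. The sketch below is an illustration only: `appendFields` and the sample input are made up for the example and are unrelated to the selector package. It compares a helper that returns a fresh slice on every call with one that appends into a reused buffer.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps the result of strings.Fields reachable so it must be heap-allocated.
var sink []string

// appendFields is a hypothetical helper: it splits s on spaces and appends
// the pieces to buf instead of allocating a new result slice.
func appendFields(buf []string, s string) []string {
	start := -1
	for i := 0; i < len(s); i++ {
		if s[i] == ' ' {
			if start >= 0 {
				buf = append(buf, s[start:i])
				start = -1
			}
		} else if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		buf = append(buf, s[start:])
	}
	return buf
}

// measure reports the average number of allocations per call for each variant.
func measure() (float64, float64) {
	const input = "has(role) && env == 'prod'"

	fresh := testing.AllocsPerRun(1000, func() {
		sink = strings.Fields(input)
	})

	buf := make([]string, 0, 16) // allocated once, outside the measured function
	reused := testing.AllocsPerRun(1000, func() {
		buf = appendFields(buf[:0], input)
	})
	return fresh, reused
}

func main() {
	fresh, reused := measure()
	fmt.Printf("fresh slice per call: %.0f allocs/op\n", fresh)
	fmt.Printf("reused buffer:        %.0f allocs/op\n", reused)
}
```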

Related issues/PRs

CORE-10829

Todos

Tests
Documentation
Release note

Release Note

Improve performance of selector parsing and validation.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

docs-pr-required: This change requires a change to the documentation that has not been completed yet.
docs-completed: This change has all necessary documentation completed.
docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

release-note-required: This PR has user-facing changes. Most PRs should have this label.
release-note-not-required: This PR has no user-facing changes.

Other optional labels:

cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

Reduce usage of WithFields on the hot path; it allocates heavily.

mazdakn

LGTM. Just left a few questions and nits.

mazdakn · 2024-12-11T23:19:26Z

libcalico-go/lib/selector/tokenizer/tokenizer.go

 			log.Debug("Remaining input: ", input)
 		}
 		startLen := len(input)
-		input = strings.TrimLeft(input, whitespace)
+		input = trimWhitespace(input)


I assume we can do this because of the limited charset we recognise, right? But what's the main motivation for it?

Yes, I think TrimLeft was allocating/doing work to calculate the cut set so this was faster.
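
To make the exchange above concrete, here is one way a hand-rolled trim could look. This is a sketch under assumptions: the actual body of `trimWhitespace` and the contents of the `whitespace` constant are not shown in this thread, so the character set below (space, tab, newline, carriage return) is a guess for illustration. The point is that a fixed switch over single bytes needs no cut-set processing at all.

```go
package main

import (
	"fmt"
	"strings"
)

// whitespace is an assumed cut set; the real constant is not shown in the PR thread.
const whitespace = " \t\n\r"

// trimWhitespace drops leading whitespace bytes by re-slicing the string.
// It only handles the single-byte characters listed in the switch.
func trimWhitespace(input string) string {
	for len(input) > 0 {
		switch input[0] {
		case ' ', '\t', '\n', '\r':
			input = input[1:]
		default:
			return input
		}
	}
	return input
}

func main() {
	for _, s := range []string{"  a == 'b'", "\t\n x", "", "   "} {
		custom := trimWhitespace(s)
		std := strings.TrimLeft(s, whitespace)
		fmt.Printf("%q -> %q (same as TrimLeft: %v)\n", s, custom, custom == std)
	}
}
```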

mazdakn · 2024-12-11T23:22:29Z

libcalico-go/lib/selector/tokenizer/tokenizer.go

-)
+func (t Token) String() string {
+	return fmt.Sprintf("%s(%s)", t.Kind, t.Value)
+}

 // Tokenize transforms string to token slice
 func Tokenize(input string) (tokens []Token, err error) {


nit: no need for named return values.

mazdakn · 2024-12-11T23:33:09Z

libcalico-go/lib/selector/tokenizer/tokenizer.go

-			if len(input) > 1 && input[1] == '=' {
-				tokens = append(tokens, Token{TokEq, nil})
-				input = input[2:]
+			if input, found = strings.CutPrefix(input, "=="); found {


Using CutPrefix is definitely easier to read. But I wonder why here we move to a cleaner version, while for TrimLeft the decision is to implement a new function?
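
For context on the `strings.CutPrefix` form in the diff above (available since Go 1.20): it returns the remainder and a flag in one call, and returns the input unchanged when the prefix is absent, which is what makes the `if input, found = ...; found` pattern safe. The sketch below uses a hypothetical helper, `cutOperator`, that is not part of the tokenizer; it only demonstrates the pattern on two operators.

```go
package main

import (
	"fmt"
	"strings"
)

// cutOperator is a hypothetical helper: it strips a leading "==" or "!="
// from input and reports which operator was found, if any.
func cutOperator(input string) (string, string, bool) {
	var found bool
	// On a miss, CutPrefix returns input unchanged, so reassigning it is harmless.
	if input, found = strings.CutPrefix(input, "=="); found {
		return "==", input, true
	}
	if input, found = strings.CutPrefix(input, "!="); found {
		return "!=", input, true
	}
	return "", input, false
}

func main() {
	for _, s := range []string{"== 'db'", "!= 'db'", "= 'db'"} {
		op, rest, ok := cutOperator(s)
		fmt.Printf("%q -> op=%q rest=%q ok=%v\n", s, op, rest, ok)
	}
}
```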

fasaxc · 2024-12-12T15:44:07Z

/merge-when-ready squash-commits

marvin-tigera · 2024-12-12T15:44:14Z

OK, I will merge the pull request when it's ready, squash the commits when I merge it, and leave the branch after I've merged it.

…ico#9564) * Squash some chatty allocations. Reduce usage of WithFields on the hot path; it allocates heavily. * Avoid allocations when validating selectors. * Clean ups in tokenizer. * Tweaks. * Markups.

marvin-tigera added this to the Calico v3.30.0 milestone Dec 5, 2024

marvin-tigera added the release-note-required and docs-pr-required labels Dec 5, 2024

fasaxc force-pushed the squash-allocs branch 2 times, most recently from e57d003 to 51d7220 Compare December 5, 2024 18:20

fasaxc added the docs-not-required label and removed the docs-pr-required label Dec 5, 2024

fasaxc force-pushed the squash-allocs branch from 51d7220 to f30ffe0 Compare December 5, 2024 18:35

fasaxc added 2 commits December 6, 2024 13:28

Squash some chatty allocations.

c700962

Reduce usage of WithFields on the hot path; it allocates heavily.

Avoid allocations when validating selectors.

b1ea729

fasaxc force-pushed the squash-allocs branch from f30ffe0 to b1ea729 Compare December 6, 2024 13:29

Clean ups in tokenizer.

e570829

fasaxc force-pushed the squash-allocs branch from 64ad3f7 to 023cc9d Compare December 6, 2024 15:26

fasaxc marked this pull request as ready for review December 6, 2024 15:34

fasaxc requested a review from a team as a code owner December 6, 2024 15:34

Tweaks.

af153d2

fasaxc force-pushed the squash-allocs branch from 023cc9d to af153d2 Compare December 6, 2024 15:37

mazdakn approved these changes Dec 12, 2024

Markups.

f7aa454

marvin-tigera added merge-when-ready squash-commits labels Dec 12, 2024

marvin-tigera merged commit 927bb3e into projectcalico:master Dec 12, 2024
5 checks passed
