memmetrics: simplify locking and solve data race #209
base: master
Conversation
This package was updating counters concurrently because of incorrect locking. There's not really a reason to have two separate locks or try to optimize with RW locks. This change replaces the two RW locks with one exclusive lock. The RTMetrics struct methods always acquire this exclusive lock. This is simple and will be easier to keep correct as the code evolves.
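For illustration, here is a minimal, self-contained sketch of the locking pattern described above; the struct, field, and method names are invented for the example and are not the real RTMetrics definition:

```go
// Sketch of the "one exclusive lock" pattern (illustrative names only; this
// is not the real RTMetrics implementation).
package main

import (
	"fmt"
	"sync"
	"time"
)

type metrics struct {
	mu sync.Mutex // single exclusive lock guarding every field below

	total       int64
	statusCodes map[int]int64
}

// Record takes the exclusive lock for the whole update, so the counter
// increment and the map write can never race with readers or other writers.
func (m *metrics) Record(code int, _ time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()

	m.total++
	if m.statusCodes == nil {
		m.statusCodes = make(map[int]int64)
	}
	m.statusCodes[code]++
}

// TotalCount takes the same lock, so reads always see a consistent snapshot.
func (m *metrics) TotalCount() int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.total
}

func main() {
	var m metrics
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				m.Record(200, time.Millisecond)
			}
		}()
	}
	wg.Wait()
	fmt.Println(m.TotalCount()) // always 4000, no torn updates
}
```

The point is that every method takes the same exclusive mutex around its entire read-modify-write, so concurrent updates can never interleave or tear.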
LGTM
On a 3 GHz Intel CPU we do zero allocations and Record takes 255 ns.
Here is the performance before/after with the benchmark I added. We went from 280 ns to 255 ns per Record on my 3 GHz Intel CPU; we got a little faster by switching from RWMutex to Mutex. The benchmark I added isn't concurrent, but it's sufficient evidence that we're not introducing a performance regression with this change. At 255 ns per operation we can take the lock roughly 3.9 million times per second, far higher than any realistic HTTP request rate. Before this change.
After this change.
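For reference, the non-concurrent benchmark mentioned above presumably looks roughly like this; the function name and body are my reconstruction from the discussion, not the PR's actual code:

```go
func BenchmarkRecord(b *testing.B) {
	b.ReportAllocs()

	rr, err := NewRTMetrics(RTClock(testutils.GetClock()))
	require.NoError(b, err)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		rr.Record(200, time.Second)
	}
}
```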
Maybe add a concurrent benchmark?

```go
func BenchmarkRecordConcurrently(b *testing.B) {
	b.ReportAllocs()

	rr, err := NewRTMetrics(RTClock(testutils.GetClock()))
	require.NoError(b, err)

	// warm up metrics. Adding a new code can do allocations, but in the steady
	// state recording a code is cheap. We want to measure the steady state.
	const codes = 100
	for code := 0; code < codes; code++ {
		rr.Record(code, time.Second)
	}

	concurrency := runtime.NumCPU() * 2
	wg := sync.WaitGroup{}
	wg.Add(concurrency)
	b.ResetTimer()
	for i := 0; i < concurrency; i++ {
		go func() {
			for j := 0; j < b.N; j++ {
				rr.Record(j%codes, time.Second)
			}
			wg.Done()
		}()
	}
	wg.Wait()
}
```
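For reference, a benchmark like this can be run with something like `go test -run '^$' -bench BenchmarkRecordConcurrently -benchmem` from the memmetrics package directory (the exact path depends on the repository layout); `-benchmem` also reports whether Record stays allocation-free.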
Our IT guys upgraded my CPU and I didn't even know it; I now have 10 cores / 20 threads at 3.3 GHz (the previous benchmarks were also run on this CPU). Good idea to add a concurrent benchmark, since the context in which this package will be used should be highly concurrent. I changed the benchmark so that b.N is divided among the goroutines, which gives more of an apples-to-apples comparison: the same number of results are recorded in the concurrent and non-concurrent benchmarks. I also set the concurrency (number of goroutines in the benchmark) equal to the number of CPUs (hardware threads). The goroutines end up with affinity to a hardware thread, so extra concurrency would only add scheduler overhead; I tried doubling the concurrency to twice the number of CPUs and it made only about a 40 ns/op difference. The poor scaling of RWMutex with CPU count is explained here. Here is before the locking change. We observe a 6.8x slowdown with concurrency.
Here is after the locking change. We observe a 4x slowdown with concurrency. We're roughly twice as efficient as the old code with RWMutex.
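Based on that description, the revised concurrent benchmark presumably splits b.N across one goroutine per hardware thread, roughly like this (my reconstruction of the idea, not the exact code from the PR):

```go
func BenchmarkRecordConcurrently(b *testing.B) {
	b.ReportAllocs()

	rr, err := NewRTMetrics(RTClock(testutils.GetClock()))
	require.NoError(b, err)

	// Warm up so steady-state Record calls don't allocate.
	const codes = 100
	for code := 0; code < codes; code++ {
		rr.Record(code, time.Second)
	}

	// One goroutine per hardware thread; split b.N between them so the total
	// number of recorded results matches the non-concurrent benchmark.
	concurrency := runtime.NumCPU()
	perGoroutine := b.N / concurrency

	var wg sync.WaitGroup
	wg.Add(concurrency)
	b.ResetTimer()
	for i := 0; i < concurrency; i++ {
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				rr.Record(j%codes, time.Second)
			}
		}()
	}
	wg.Wait()
}
```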
So this issue has not been resolved yet? I have the same issue. I think it may cause inaccurate statistics. Why not resolve this issue?