Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocklist update fix #4446

Merged
merged 3 commits into from
Dec 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@ querier:
* [BUGFIX] Initialize histogram buckets to 0 to avoid downsampling. [#4366](https://github.com/grafana/tempo/pull/4366) (@javiermolinar)
* [BUGFIX] Utilize S3Pass and S3User parameters in tempo-cli options, which were previously unused in the code. [#4259](https://github.com/grafana/tempo/pull/4259) (@faridtmammadov)
* [BUGFIX] Fixed an issue in the generator where the first batch was counted 2x against a traces size. [#4365](https://github.com/grafana/tempo/pull/4365) (@joe-elliott)
* [BUGFIX] Fix compaction bug in SingleBinaryMode that could lead to 2x, 3x, etc TraceQL metrics results [#4446](https://github.com/grafana/tempo/pull/4446) (@mdisibio)
* [BUGFIX] Unstable compactors can occassionally duplicate data. Check for job ownership during compaction and cancel a job if ownership changes. [#4420](https://github.com/grafana/tempo/pull/4420) (@joe-elliott)

# v2.6.1
Expand Down
102 changes: 54 additions & 48 deletions tempodb/blocklist/list.go
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
package blocklist

import (
"slices"
"sync"

"github.com/google/uuid"
"github.com/grafana/tempo/tempodb/backend"
)

Expand Down Expand Up @@ -125,64 +125,70 @@ func (l *List) Update(tenantID string, add []*backend.BlockMeta, remove []*backe
// updateInternal exists to do the work of applying updates to held PerTenant and PerTenantCompacted maps
// it must be called under lock
func (l *List) updateInternal(tenantID string, add []*backend.BlockMeta, remove []*backend.BlockMeta, compactedAdd []*backend.CompactedBlockMeta, compactedRemove []*backend.CompactedBlockMeta) {
// ******** Regular blocks ********
blocklist := l.metas[tenantID]

matchedRemovals := make(map[uuid.UUID]struct{})
for _, b := range blocklist {
for _, rem := range remove {
if b.BlockID == rem.BlockID {
matchedRemovals[(uuid.UUID)(rem.BlockID)] = struct{}{}
break
}
hasID := func(id backend.UUID) func(*backend.BlockMeta) bool {
return func(b *backend.BlockMeta) bool {
return b.BlockID == id
}
}

existingMetas := make(map[uuid.UUID]struct{})
newblocklist := make([]*backend.BlockMeta, 0, len(blocklist)-len(matchedRemovals)+len(add))
// rebuild the blocklist dropping all removals
for _, b := range blocklist {
existingMetas[(uuid.UUID)(b.BlockID)] = struct{}{}
if _, ok := matchedRemovals[(uuid.UUID)(b.BlockID)]; !ok {
newblocklist = append(newblocklist, b)
hasIDC := func(id backend.UUID) func(*backend.CompactedBlockMeta) bool {
return func(b *backend.CompactedBlockMeta) bool {
return b.BlockID == id
}
}
// add new blocks (only if they don't already exist)
for _, b := range add {
if _, ok := existingMetas[(uuid.UUID)(b.BlockID)]; !ok {
newblocklist = append(newblocklist, b)
}
}

l.metas[tenantID] = newblocklist

// ******** Compacted blocks ********
compactedBlocklist := l.compactedMetas[tenantID]

compactedRemovals := map[uuid.UUID]struct{}{}
for _, c := range compactedBlocklist {
for _, rem := range compactedRemove {
if c.BlockID == rem.BlockID {
compactedRemovals[(uuid.UUID)(rem.BlockID)] = struct{}{}
break
// ******** Regular blocks ********
if len(add) > 0 || len(remove) > 0 || len(compactedAdd) > 0 || len(compactedRemove) > 0 {
var (
existing = l.metas[tenantID]
final = make([]*backend.BlockMeta, 0, max(0, len(existing)+len(add)-len(remove)))
)

// rebuild dropping all removals
for _, b := range existing {
if slices.ContainsFunc(remove, hasID(b.BlockID)) {
continue
}
final = append(final, b)
}
}
// add new if they don't already exist and weren't also removed
for _, b := range add {
if slices.ContainsFunc(final, hasID(b.BlockID)) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before we built a map for lookups and it looks like we're just doing linear searches here. wdyt think about sorting and doing a binary search?

Copy link
Contributor Author

@mdisibio mdisibio Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I forgot to include the benchmarks but this version is already significantly faster, mainly by eliminating the maps, which reduces allocs.

pkg: github.com/grafana/tempo/tempodb/blocklist
cpu: Apple M4 Pro
       │ before.txt  │             after.txt              │
       │   sec/op    │   sec/op     vs base               │
Update   5.485m ± 2%   1.212m ± 2%  -77.91% (p=0.002 n=6)

       │  before.txt  │              after.txt              │
       │     B/op     │     B/op      vs base               │
Update   6.345Mi ± 0%   1.746Mi ± 0%  -72.48% (p=0.002 n=6)

       │  before.txt   │             after.txt             │
       │   allocs/op   │ allocs/op   vs base               │
Update   4079.000 ± 0%   4.000 ± 0%  -99.90% (p=0.002 n=6)

Since the number of metas to add/remove is low (max 4?), sorting/binsearch may not be much better linear, but I can check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, ok. This is called with just a few metas each compaction to pass the results of the cycle into the polled blocklist. And then each polling cycle it's called once with all accumulated modifications that cycle b/c it assumes that the tenant index is still a bit behind.

given everything the compactor does this may not be noticeable with either a linear or binary search. i'm good as is and will approve. up to you if you want to explore binary search.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested a bit and sorting the current blocklist is outweighing the benefits of the quicker lookups (benchmark using 100K blocklist). I checked and the even the current implementation isn't showing up in profiles, so I think it's not worth digging more in here.

slices.ContainsFunc(remove, hasID(b.BlockID)) ||
slices.ContainsFunc(compactedAdd, hasIDC(b.BlockID)) ||
slices.ContainsFunc(compactedRemove, hasIDC(b.BlockID)) {
continue
}

existingMetas = make(map[uuid.UUID]struct{})
newCompactedBlocklist := make([]*backend.CompactedBlockMeta, 0, len(compactedBlocklist)-len(compactedRemovals)+len(compactedAdd))
// rebuild the blocklist dropping all removals
for _, b := range compactedBlocklist {
existingMetas[(uuid.UUID)(b.BlockID)] = struct{}{}
if _, ok := compactedRemovals[(uuid.UUID)(b.BlockID)]; !ok {
newCompactedBlocklist = append(newCompactedBlocklist, b)
final = append(final, b)
}

l.metas[tenantID] = final
}
for _, b := range compactedAdd {
if _, ok := existingMetas[(uuid.UUID)(b.BlockID)]; !ok {
newCompactedBlocklist = append(newCompactedBlocklist, b)

// ******** Compacted blocks ********
if len(compactedAdd) > 0 || len(compactedRemove) > 0 {
var (
existing = l.compactedMetas[tenantID]
final = make([]*backend.CompactedBlockMeta, 0, max(0, len(existing)+len(compactedAdd)-len(compactedRemove)))
)

// rebuild dropping all removals
for _, b := range existing {
if slices.ContainsFunc(compactedRemove, hasIDC(b.BlockID)) {
continue
}
final = append(final, b)
}
// add new if they don't already exist and weren't also removed
for _, b := range compactedAdd {
if slices.ContainsFunc(existing, hasIDC(b.BlockID)) ||
slices.ContainsFunc(compactedRemove, hasIDC(b.BlockID)) {
continue
}
final = append(final, b)
}
}

l.compactedMetas[tenantID] = newCompactedBlocklist
l.compactedMetas[tenantID] = final
}
}
Loading
Loading