-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to TileCodec to compress and skip redundant windows #539
Conversation
# Conflicts: # online/src/main/scala/ai/chronon/online/GroupByServingInfoParsed.scala # online/src/main/scala/ai/chronon/online/TileCodec.scala # spark/src/test/scala/ai/chronon/spark/test/InMemoryStream.scala
# Conflicts: # online/src/test/scala/ai/chronon/online/TileCodecTest.scala
bucketPos += 1 | ||
} | ||
} | ||
flattenedIr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be more efficient to directly implement and not pay for the intermediate collection.
updateFlatWithTile(historicalFlatIr, tileIr)
vs. update(historicalFlatIr, flatten(tile))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! One optimization suggestion. But we can revisit that if we find the latency to be an issue. I would still recommend adding that as a todo.
Summary
This is a follow up from our prior TileCodec PR - #523 (comment)
Made some changes to TileCodec to not compute and store the full cross product of buckets * windows for a given FG. The window are redundant within a tile and costs us on the aggregator side + on the payload ser / deser side. Now we'll only have one counter for given aggregation, bucket in a tile. We explode this out on the deser side so that this continues to work with @aarongreen-stripe's PR - #531.
Why / Goal
Improve aggregation efficiency and cut down tile payload sizes written out to KV store
Test Plan
Checklist
Reviewers
@nikhilsimha || @ezvz