[DataApi] Get BlobCount By AccountID #541
Overall looks good to me.
Is the primary purpose of adding the new field and index to support the count query?
I feel like getting the actual metadata might be more useful. Maybe we can support that later?
q.mu.RLock()
defer q.mu.RUnlock()
count := int32(0)
for _, meta := range q.Metadata {
This for loop can potentially be really large, right?
True, but this is only for testing purposes.
@ian-shim Yes, that is correct. This is a stop-gap solution to get blobCount by AccountID. I have suggested a couple more approaches in a document that make it closer to real time, based on DynamoDB Streams.
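To make the concern about the loop size concrete, here is a minimal sketch of the kind of in-memory count the test store performs. The names `memStore`, `blobMetadata`, and `CountByAccountID` are hypothetical stand-ins, not the types from this PR; the point is only that the scan is linear in the number of stored blobs and holds the read lock for the whole pass.

```go
package main

import (
	"fmt"
	"sync"
)

// blobMetadata is a hypothetical, trimmed-down stand-in for the real metadata type.
type blobMetadata struct {
	AccountID string
}

// memStore is a hypothetical in-memory store of the kind used only in tests.
type memStore struct {
	mu       sync.RWMutex
	metadata map[string]*blobMetadata // keyed by blob key
}

// CountByAccountID scans every entry under a read lock, so its cost is
// O(n) in the total number of stored blobs, not in the blobs of one account.
func (s *memStore) CountByAccountID(accountID string) int32 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	count := int32(0)
	for _, meta := range s.metadata {
		if meta.AccountID == accountID {
			count++
		}
	}
	return count
}

func main() {
	s := &memStore{metadata: map[string]*blobMetadata{
		"k1": {AccountID: "acct-a"},
		"k2": {AccountID: "acct-b"},
		"k3": {AccountID: "acct-a"},
	}}
	fmt.Println(s.CountByAccountID("acct-a")) // prints 2
}
```

A full scan like this is fine for a test double, but it is the reason a production path would want an index or a maintained counter instead.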
// GetBlobMetadataByAccount Count returns the count of all the metadata with the given status
// Because this function scans the entire index, it should only be used for status with a limited number of items.
// It should only be used to filter "Processing" status. To support other status, a streaming version should be implemented.
func (s *BlobMetadataStore) GetBlobMetadataCountByAccountID(ctx context.Context, accountID core.AccountID) (int32, error) {
We probably want the total amount of data rather than the total number of blobs. Or perhaps both.
Let's do another sync on the exact use case for this work!
@mooselumph I think if we want the total amount of data every time, then a better approach is to have a Lambda invoked on the DynamoDB stream, which only processes INSERT_EVENT and can be used to update:
- Amount of Data
- Count of blobs
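As a rough sketch of that idea, the aggregation step could look like the following. It deliberately avoids the AWS SDK so it runs on its own: `streamRecord`, `accountUsage`, and `applyRecords` are hypothetical names, and the assumption that each inserted item carries an account ID and a blob size is mine, not something stated in this PR. DynamoDB Streams reports inserts with the event name `INSERT`, which is what the filter checks.

```go
package main

import "fmt"

// streamRecord is a hypothetical, simplified view of one DynamoDB stream record.
type streamRecord struct {
	EventName string // DynamoDB Streams uses "INSERT", "MODIFY", or "REMOVE"
	AccountID string // assumed to be present on the new item image
	BlobSize  uint64 // bytes; assumed to be present on the new item image
}

// accountUsage holds the two per-account totals mentioned in the discussion.
type accountUsage struct {
	BlobCount uint64
	DataBytes uint64
}

// applyRecords folds a batch of stream records into per-account totals,
// skipping every record that is not an insert.
func applyRecords(totals map[string]accountUsage, records []streamRecord) {
	for _, r := range records {
		if r.EventName != "INSERT" {
			continue
		}
		u := totals[r.AccountID]
		u.BlobCount++
		u.DataBytes += r.BlobSize
		totals[r.AccountID] = u
	}
}

func main() {
	totals := map[string]accountUsage{}
	applyRecords(totals, []streamRecord{
		{EventName: "INSERT", AccountID: "acct-a", BlobSize: 1024},
		{EventName: "MODIFY", AccountID: "acct-a", BlobSize: 1024},
		{EventName: "INSERT", AccountID: "acct-a", BlobSize: 2048},
	})
	fmt.Printf("%+v\n", totals["acct-a"]) // prints {BlobCount:2 DataBytes:3072}
}
```

In a real handler the totals would be persisted (for example with an atomic counter update) rather than kept in a map, and stream records can be delivered more than once, so the update would need to be idempotent.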
For now, as a temporary measure, I will just update the Batcher to update the amount of data and increment the count after a blob is confirmed.
// Update AccountID to accountKey
// This is a combination of origin and authenticatedAddress
// AccountId is later used to track blobs sent by the same account
blob.RequestHeader.BlobAuthHeader.AccountID = accountKey
This might need to change depending on whether the authenticated endpoint is being used, right?
@@ -244,6 +250,7 @@ func (s *server) Start() error {
{
feed.GET("/blobs", s.FetchBlobsHandler)
feed.GET("/blobs/:blob_key", s.FetchBlobHandler)
feed.GET("/blobs/count/:accountId", s.FetchBlobCountByAccountIdHandler)
Should definitely cache these endpoints.
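One possible shape for such a cache is a small per-account TTL cache in front of the count query. This is only a sketch under assumptions: `countCache`, `newCountCache`, and the `load` callback are hypothetical names, and the PR does not say which caching mechanism the server would actually use.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cachedCount is one cache entry with its expiry time.
type cachedCount struct {
	value     int32
	expiresAt time.Time
}

// countCache is a hypothetical per-account TTL cache for blob counts.
type countCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cachedCount
}

func newCountCache(ttl time.Duration) *countCache {
	return &countCache{ttl: ttl, entries: make(map[string]cachedCount)}
}

// Get returns the cached count for accountID and calls load only on a miss
// or after the entry has expired. The lock is held while load runs, which
// keeps the sketch simple but serializes concurrent misses.
func (c *countCache) Get(accountID string, load func(string) (int32, error)) (int32, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[accountID]; ok && time.Now().Before(e.expiresAt) {
		return e.value, nil
	}
	v, err := load(accountID)
	if err != nil {
		return 0, err // errors are not cached
	}
	c.entries[accountID] = cachedCount{value: v, expiresAt: time.Now().Add(c.ttl)}
	return v, nil
}

func main() {
	calls := 0
	load := func(string) (int32, error) { calls++; return 42, nil }
	cache := newCountCache(time.Minute)
	cache.Get("acct-a", load)
	v, _ := cache.Get("acct-a", load)
	fmt.Println(v, calls) // prints 42 1
}
```

A handler would wrap the store call in `load`, so repeated requests for the same account within the TTL never reach the index scan.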
Why are these changes needed?
These changes enable querying the blob count by AccountId.
Changes
accountId in BlobAuthHeader.
Checks