-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
naïve version of task sync #364
base: main
Are you sure you want to change the base?
Conversation
) -> Result<Option<EnqueueJob>, JobError> { | ||
let aggregators = Aggregators::find() | ||
.filter(AggregatorColumn::IsFirstParty.eq(true)) // eventually we may want to check for a capability | ||
.all(db) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eventually there will be too many records to do this with and we will want to use stream
src/queue/job/v1/task_sync.rs
Outdated
for aggregator in aggregators { | ||
let client = aggregator.client(job_state.http_client.clone()); | ||
for task_id in client.get_task_ids().await? { | ||
if 0 == Tasks::find_by_id(&task_id).count(db).await? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may eventually want to perform a single query to prefetch all ids for the given aggregator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This likely also should constrain the query to a given aggregator, since it's just as invalid for the aggregator to have another aggregator's task as it is to have a task that we don't know about at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR accomplishes what it aims to do, but I'm not certain if deleting all tasks from Janus that divviup-api doesn't know about is the right thing to do. How will this interact with taskprov? @inahga can you comment?
I think we'll need to add task provenance to the aggregator api representation |
This represents a sort of "minimal first version of task sync." It deletes tasks from first party aggregators if we don't have them locally. It does not yet compare parameters. There are a lot of optimizations and rate limiting changes that could be applied to this, but those changes should be informed by benchmarks on representative data.
Refs #328