-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
submission monitor 20 per pass #2670
Conversation
@@ -57,6 +57,8 @@ submissionmonitor { | |||
trackDetailedSubmissionMetrics = true | |||
attributeUpdatesPerWorkflow = 20000 | |||
enableEmailNotifications = true | |||
# how many workflows should the submission monitor attempt to process and update at once? | |||
workflowsPerMonitorPass = 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is a good initial value for this setting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less than 600!
statuses: WorkflowStatuses.WorkflowStatus* | ||
): ReadAction[Seq[WorkflowRecord]] = | ||
findWorkflowsBySubmissionId(submissionId).filter(_.status inSetBind (statuses.map(_.toString))).result | ||
findWorkflowsBySubmissionId(submissionId).filter(_.status inSetBind (statuses.map(_.toString))).take(limit).result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the .take(limit)
here should result in only returning ${workflowsPerMonitorPass}
workflows to the submission monitor actor for any given pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The drawback to adding a limit here is that if the first limit
rows from the db are still running, but limit+1, limit+2, limit+3 are done … the monitor won't complete the ones that are done. This isn't great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could alleviate this by select … order by rand() limit 20
; in mysql, to get a random selection of 20 rows that match the where clause. I'm not sure how to do that in Slick.
def countWorkflowRecsForSubmissionAndStatuses(submissionId: UUID, | ||
statuses: WorkflowStatuses.WorkflowStatus* | ||
): ReadAction[Int] = | ||
findWorkflowsBySubmissionId(submissionId).filter(_.status inSetBind (statuses.map(_.toString))).length.result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously, SubmissionMonitorActor.updateSubmissionStatus()
called listWorkflowRecsForSubmissionAndStatuses()
, returning all workflow records as a Seq, and then checked if that Seq was empty or not. This moves the logic to the db, returning just a count. I needed to change something here, because by adding a limit to listWorkflowRecsForSubmissionAndStatuses()
, it would have been invalid for SubmissionMonitorActor.updateSubmissionStatus()
- two birds with one stone?
Ticket:
PR checklist
model/
, then you should publish a new officialrawls-model
and updaterawls-model
in Orchestration's dependencies.