Issue #719 paralleljobthreading #723
base: master
Conversation
some initial feedback
```python
def job_worker(i, backend_name):
    with semaphore:
        try:
            self._launch_job(start_job, not_started, i, backend_name, stats)
```
Shouldn't the db lock also apply to this launch-job callable (because _launch_job has access to the db dataframe)?
However, this points at a deeper problem: if you lock around _launch_job, then you effectively lose parallelism again.
Unless the db lock is only there to protect the job_db.persist calls. But then _launch_job should be given a read-only version of the dataframe row, and I'm not sure whether that is compatible with how users use _launch_job.
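To illustrate the idea of a "read-only" row, here is a minimal sketch (names like `launch_job` and the `status` column are hypothetical stand-ins, not the project's actual API): the worker operates on a private copy of its row, so nothing it mutates can race with the shared dataframe, and the lock is only held around the shared-state update.

```python
import threading

import pandas as pd

# Hypothetical sketch: the shared dataframe and the lock protecting it.
db_lock = threading.Lock()
db = pd.DataFrame({"status": ["not_started", "not_started"]})

def launch_job(row: pd.Series) -> None:
    # Mutating the snapshot is harmless: it is the worker's private copy.
    row["status"] = "running"

def start_worker(i: int) -> None:
    snapshot = db.loc[i].copy()   # "read-only" by construction: a private copy
    launch_job(snapshot)
    with db_lock:                 # lock held only for the persist-style update
        db.loc[i, "status"] = "started"

threads = [threading.Thread(target=start_worker, args=(i,)) for i in range(len(db))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is exactly the one raised above: any field a user sets on the snapshot inside `launch_job` is silently dropped rather than persisted.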
I indeed made the db lock purely for the persist calls. Compatible in what way?
> I'm not sure if that is compatible with how users use _launch_job.
Maybe some users update fields in the pandas row from within their _launch_job implementation, expecting those changes to be persisted. But guaranteeing that undermines the opportunity for thread-safe, effective parallelism.
So do we want to avoid the locking, even though that may lead to concurrency issues?
Or do we drop support for altering the dataframe in _launch_job and document that changes to the dataframe must happen within the persist function?
> do we then want to avoid the locking
Indeed: the goal of this feature is to exploit parallelism for more efficient use of time, and every lock needed to ensure consistency undermines that goal. Instead of sharing state (e.g. pandas dataframes) between threads, which requires locks, I think we should aim for a design with as little state sharing as possible between the main thread and the worker threads.
For example, since the scope here is mainly to offload job starting to side threads, those threads should be able to do their work given just the job id as a single string (and probably a valid access token, again a single string, to address auth). All other shared state/objects drag in concurrency risks.
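A rough sketch of that stateless-worker design (the `worker` function and job names are illustrative, not the project's API): each worker pulls `(job_id, access_token)` string pairs from a queue, so the only shared object that needs a lock is the result list.

```python
import threading
from queue import Queue

def worker(jobs: Queue, results: list, lock: threading.Lock) -> None:
    while True:
        item = jobs.get()
        if item is None:               # sentinel: shut this worker down
            break
        job_id, access_token = item    # two plain strings: no dataframe in sight
        outcome = f"started {job_id}"  # stand-in for the real start-job call
        with lock:                     # only the result list is shared state
            results.append(outcome)

jobs: Queue = Queue()
results: list = []
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(jobs, results, lock)) for _ in range(3)]
for t in threads:
    t.start()
for job_id in ("job-1", "job-2", "job-3", "job-4"):
    jobs.put((job_id, "dummy-token"))
for _ in threads:
    jobs.put(None)                     # one sentinel per worker
for t in threads:
    t.join()
```

Because the workers never touch the dataframe, the main thread remains the single writer and can persist without coordinating with them.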
Made an update on how the threading and queuing work together. I believe we now no longer send out batches of jobs, but continuously add jobs when a thread becomes available. I did have to add a lock to ensure the unit tests would pass.
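The continuous-dispatch behavior described above can be sketched with a bounded semaphore (a simplified stand-alone illustration, not the PR's actual code): at most `MAX_PARALLEL` job starts run at once, and a new one proceeds as soon as a slot frees up, rather than waiting out a whole batch.

```python
import threading

MAX_PARALLEL = 2
semaphore = threading.BoundedSemaphore(MAX_PARALLEL)
started = []
started_lock = threading.Lock()   # the lock needed to keep shared bookkeeping consistent

def job_worker(i: int, backend_name: str) -> None:
    with semaphore:               # blocks until one of the MAX_PARALLEL slots is free
        with started_lock:
            started.append((i, backend_name))

threads = [threading.Thread(target=job_worker, args=(i, "backend-a")) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All five jobs eventually run, but never more than two concurrently hold the semaphore.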