Issue #719 paralleljobthreading #723

Open · wants to merge 22 commits into master
Conversation

@HansVRP (Contributor) commented Feb 3, 2025

No description provided.

@HansVRP (Contributor Author) commented Feb 3, 2025

@soxofaan @jdries

I also did a small integration test with 10 jobs.

Before, we needed 2 minutes and 10 seconds to start all jobs:

(screenshot)

Now it is reduced to 22 seconds. Notice how initially 5 jobs were started together (equal to the maximum size of the thread pool):

(screenshot)

@HansVRP requested review from soxofaan and jdries on February 3, 2025, 14:34
@soxofaan (Member) left a comment


some initial feedback

openeo/extra/job_management/__init__.py — 4 resolved review comments on outdated code
def job_worker(i, backend_name):
    with semaphore:
        try:
            self._launch_job(start_job, not_started, i, backend_name, stats)
Member:

Shouldn't the db lock also apply to this launch job callable (because _launch_job has access to the db dataframe)?

However, this points to a bit of a problem: if you lock around _launch_job, then you effectively lose parallelism again.

Member:

Unless the db lock is only to protect the job_db.persist calls. But then _launch_job should be given a read-only version of the dataframe row. I'm not sure if that is compatible with how users use _launch_job.
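
A minimal sketch of that lock scoping, using made-up stand-ins (df, db_lock, persist, launch_job) rather than the actual MultiBackendJobManager internals: the launch callable only ever sees a copy of its row, and only the shared-dataframe update plus the persist step is serialized.

import threading
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

# Hypothetical stand-ins for the job manager's shared state (not the real API):
df = pd.DataFrame({"backend_name": ["foo", "foo", "bar"], "status": ["not_started"] * 3})
db_lock = threading.Lock()  # guards only the shared dataframe / persist step

def persist(dataframe: pd.DataFrame) -> None:
    # placeholder for job_db.persist(); e.g. write the dataframe to disk
    pass

def launch_job(row: pd.Series) -> str:
    # user-provided start logic: it receives a copy, so it cannot mutate shared state
    return f"job-{row.name}"

def job_worker(i: int) -> None:
    row = df.loc[i].copy()  # read-only hand-off to the launch callable
    job_id = launch_job(row)
    with db_lock:  # only the shared-state update is locked
        df.loc[i, "status"] = "queued"
        df.loc[i, "id"] = job_id
        persist(df)

with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(job_worker, df.index))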

Contributor Author:

I indeed made the db lock purely for the persist calls. Compatible in what way?

@soxofaan (Member) Feb 3, 2025

I'm not sure if that is compatible with how users use _launch_job.

Maybe some users update some fields in the pandas row from within their _launch_job implementation, expecting it to be persisted. But guaranteeing that undermines the opportunity for thread-safe and effective parallelism.

@HansVRP (Contributor Author) Feb 3, 2025

So do we then want to avoid the locking, which may lead to concurrency issues?

Or will we not support altering the _launch_job functionality and document that changes to the dataframe must occur within the persist function?

@soxofaan (Member) Feb 4, 2025

do we then want to avoid the locking

Indeed, the goal of this feature is to exploit parallelism for more efficient use of time, and every lock that would be needed to ensure consistency undermines that goal. Instead of sharing state (e.g. pandas dataframes) between threads (which requires locks), I think we should aim for a design with as little state sharing (e.g. dataframes) as possible between the main thread and the worker threads.

For example, as the scope here is mainly to offload job starting to side-threads, these threads should be able to do their work given just the job id as a single string (and probably a valid access token, again one string, to address auth). All the other state/objects drag in concurrency risks.
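
A rough sketch of that minimal-state direction, with purely illustrative names and endpoint (start_job_on_backend, the token value, the example URL): each worker thread only receives two strings and returns the job id, while all dataframe bookkeeping stays in the main thread.

from concurrent.futures import ThreadPoolExecutor

import requests

def start_job_on_backend(job_id: str, access_token: str) -> str:
    # The worker only ever sees two strings: the job id and a valid bearer token.
    # (Illustrative plain REST call; the real client would go through openeo.Connection.)
    resp = requests.post(
        f"https://openeo.example/jobs/{job_id}/results",
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=60,
    )
    resp.raise_for_status()
    return job_id

job_ids = ["j-001", "j-002", "j-003"]  # collected by the main thread
token = "example-access-token"  # obtained/refreshed by the main thread

with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(start_job_on_backend, j, token) for j in job_ids]
    started = [f.result() for f in futures]  # main thread updates the dataframe afterwards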

@HansVRP requested a review from soxofaan on February 4, 2025, 12:10
@HansVRP (Contributor Author) commented Feb 4, 2025

Made an update on how the threading and queuing work together. I believe we now no longer send out batches of jobs, but continuously add jobs whenever a thread becomes available.

I did have to add a lock to ensure the unit tests would pass.
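
For reference, a bare-bones version of such a queue-plus-worker-threads setup, with invented names (start_single_job, MAX_WORKERS) rather than the PR's actual code: job indices are fed continuously and each one is picked up as soon as a worker thread is free.

import queue
import threading

MAX_WORKERS = 5  # size of the worker pool
job_queue = queue.Queue()

def start_single_job(i: int) -> None:
    # placeholder for the actual per-job start logic
    print(f"starting job {i}")

def worker() -> None:
    while True:
        i = job_queue.get()
        if i is None:  # sentinel: shut this worker down
            job_queue.task_done()
            break
        try:
            start_single_job(i)
        finally:
            job_queue.task_done()

threads = [threading.Thread(target=worker, daemon=True) for _ in range(MAX_WORKERS)]
for t in threads:
    t.start()

# Feed job indices as they become eligible, instead of submitting fixed-size batches.
for i in range(10):
    job_queue.put(i)

job_queue.join()  # block until every queued job has been handled
for _ in threads:
    job_queue.put(None)  # stop the workers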

@HansVRP (Contributor Author) commented Feb 5, 2025

The lock indeed causes a bottleneck in the queue implementation:

(screenshot)

Will need to discuss how best to proceed.

soxofaan added a commit that referenced this pull request Feb 10, 2025
@soxofaan changed the title from Issue717 paralleljobthreading to Issue #719 paralleljobthreading on Feb 10, 2025
@soxofaan linked an issue on Feb 10, 2025 that may be closed by this pull request
Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues: JobManager: create & start in parallel
2 participants