feat: Update concurrency formula. (#1907)
## The Problem

There's a "bug" in the concurrency formula on the scheduler (all strategies). This is the formula:

```go
tableConcurrency := max(s.concurrency/minResourceConcurrency, minTableConcurrency)
resourceConcurrency := tableConcurrency * minResourceConcurrency
```

But these values are hardcoded:

```go
minResourceConcurrency := 100
minTableConcurrency := 1
```

So if you substitute them in:

```go
tableConcurrency := max(s.concurrency/100, 1)
resourceConcurrency := tableConcurrency * 100
```

This means that any plugin whose default concurrency is `<= 100` will have a table concurrency of `1`, even if it's the only table being synced (assuming no one changes the default).

## The Fix

I made a very subtle change in the formula. Only if concurrency is `<= 100`, I change the `minResourceConcurrency` to `concurrency/10`. This decreases the resource concurrency by up to 10x (a limit we don't seem to be hitting anyway), and increases the table concurrency by up to 10x.

## Plugins affected (on default concurrency)

- bamboo-hr
- bigquery
- ~clickhouse~ (doesn't use scheduler)
- confluence
- crowddev
- ~file~ (doesn't use scheduler)
- leanix
- oracledb
- ~s3~ (doesn't use scheduler)
- sentinelone
- servicenow
- shopify
- sonarqube
- statuspage

## The Results

I'm still working on the results (it's trickier than it seems). So far, they are very encouraging:

### BigQuery

**Before**

```
$ cli sync bigquery_to_postgresql.yaml
Loading spec(s) from bigquery_to_postgresql.yaml
Starting sync for: bigquery (cloudquery/[email protected]) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 26139, Errors: 0, Warnings: 0, Time: 2m4s
```

**After**

```
$ cli sync bigquery_to_postgresql.yaml
Loading spec(s) from bigquery_to_postgresql.yaml
Starting sync for: bigquery (cloudquery/[email protected]) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 26139, Errors: 0, Warnings: 0, Time: 1m27s
```

**Result**

1.43x the original speed (43% faster)

### Sentinelone

**Before**

```
$ cli sync .
Loading spec(s) from .
Starting sync for: sentinelone (grpc@localhost:7777) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 1231, Errors: 0, Warnings: 0, Time: 1m4s
```

**After**

```
$ cli sync .
Loading spec(s) from .
Starting sync for: sentinelone (grpc@localhost:7777) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 1231, Errors: 0, Warnings: 0, Time: 15s
```

**Result**

4.27x the original speed (327% faster)

### Sonarqube

**Before**

```
$ cli sync sonarqube_to_postgresql.yaml
Loading spec(s) from sonarqube_to_postgresql.yaml
Starting sync for: sonarqube (grpc@localhost:7777) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 4594, Errors: 0, Warnings: 0, Time: 39s
```

**After**

```
$ cli sync sonarqube_to_postgresql.yaml
Loading spec(s) from sonarqube_to_postgresql.yaml
Starting sync for: sonarqube (grpc@localhost:7777) -> [postgresql (cloudquery/[email protected])]
Sync completed successfully. Resources: 4594, Errors: 0, Warnings: 0, Time: 22s
```

**Result**

1.77x the original speed (77% faster)
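
To make the formula change from "The Fix" concrete, here is a minimal standalone sketch that compares the old and new calculations for a few concurrency values. It is not the scheduler's actual code: `oldFormula`, `newFormula`, and `maxInt` are hypothetical names, and the clamp that keeps the divisor at 1 or above for concurrency values below 10 is an assumption added so the sketch never divides by zero.

```go
package main

import "fmt"

// Hardcoded values quoted in "The Problem".
const (
	minResourceConcurrency = 100
	minTableConcurrency    = 1
)

// maxInt is a small helper so the sketch does not depend on the Go 1.21 builtin max.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// oldFormula reproduces the formula shown in "The Problem".
func oldFormula(concurrency int) (tableConcurrency, resourceConcurrency int) {
	tableConcurrency = maxInt(concurrency/minResourceConcurrency, minTableConcurrency)
	resourceConcurrency = tableConcurrency * minResourceConcurrency
	return tableConcurrency, resourceConcurrency
}

// newFormula is one possible reading of "The Fix": when concurrency is <= 100,
// the minimum resource concurrency becomes concurrency/10 instead of 100.
func newFormula(concurrency int) (tableConcurrency, resourceConcurrency int) {
	minResource := minResourceConcurrency
	if concurrency <= minResourceConcurrency {
		minResource = concurrency / 10
	}
	// Assumption (not from the commit message): clamp to 1 to avoid dividing by zero.
	minResource = maxInt(minResource, 1)

	tableConcurrency = maxInt(concurrency/minResource, minTableConcurrency)
	resourceConcurrency = tableConcurrency * minResource
	return tableConcurrency, resourceConcurrency
}

func main() {
	for _, c := range []int{10, 50, 100, 1000} {
		oldTable, oldRes := oldFormula(c)
		newTable, newRes := newFormula(c)
		fmt.Printf("concurrency=%d old=(table %d, resource %d) new=(table %d, resource %d)\n",
			c, oldTable, oldRes, newTable, newRes)
	}
}
```

Under these assumptions, a concurrency of 100 gives a table concurrency of 1 with the old formula and 10 with the new one, while values above 100 are computed exactly as before.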