Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(file-lock): Added configurable retries for file-lock #886

Closed

Conversation

GiDW
Copy link
Contributor

@GiDW GiDW commented Jan 6, 2025

When test processes with isolated testcontainers are running in parallel they become very slow to start because of the file-lock retry strategy in testcontainers.

proper-filelock uses node-retry (repo) and exposes options to configure node-retry behavior. The current option is { retries: { forever: true } }. This leaves the rest on defaults (factor 2, minTimeout 1000, maxTimeout infinity).

This PR allows the user to configure some parts of the file lock retry strategy.

Example

Multiple tests are started in parallel with jest as different processes (default behavior). They start at the same time (default jest behavior).
Each test uses a GenericContainer and tries to start it. Because the process are started at the same time and use the same exponential back-off algorithm (node-retry) most of the time the processes check again on the same moment. This causes exponentially longer tests depending on how many are running in parallel.

Tweaking the retry strategy to lower the minTimeout and factor can result in serious speed-up.

Copy link

netlify bot commented Jan 6, 2025

Deploy Preview for testcontainers-node ready!

Name Link
🔨 Latest commit 510969c
🔍 Latest deploy log https://app.netlify.com/sites/testcontainers-node/deploys/677be1c9458b4100088644ff
😎 Deploy Preview https://deploy-preview-886--testcontainers-node.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@cristianrgreco
Copy link
Collaborator

cristianrgreco commented Jan 8, 2025

Thanks for the PR @GiDW.

Each test uses a GenericContainer and tries to start it. Because the process are started at the same time and use the same exponential back-off algorithm (node-retry) most of the time the processes check again on the same moment. This causes exponentially longer tests depending on how many are running in parallel.

Tweaking the retry strategy to lower the minTimeout and factor can result in serious speed-up.

I wonder instead of implementing the ability for users to configure the retry mechanism, if it would be better to just improve the defaults?

@cristianrgreco cristianrgreco added enhancement New feature or request minor Backward compatible functionality labels Jan 8, 2025
@GiDW
Copy link
Contributor Author

GiDW commented Jan 8, 2025

Thanks for your response @cristianrgreco.

Based on arbitrary testing. In my case, the operation where the lock is held took around 1 to 2 seconds. My recommendation would be something like this:

{
  factor: 1.1,
  minTimeout: 500,
  maxTimeout: 2500
}

However I have no idea of the impact on other platforms / setups.

My current setup is macOs with podman, so I have the TESTCONTAINERS_RYUK_DISABLED=true environment variable set.

@cristianrgreco
Copy link
Collaborator

The CICD test suites are pretty comprehensive, they run podman/docker and 2 node versions. It may be worth raising a PR with the improved defaults and we can assess. WDYT?

@cristianrgreco
Copy link
Collaborator

Replaced by #890

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request minor Backward compatible functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants