pool-resource needs an atomic move operation #30

Open
chendrix opened this issue Jun 6, 2017 · 1 comment

chendrix commented Jun 6, 2017

Moved from concourse/concourse#196

cc @xtreme-gavin-enns


Currently, moving an environment between pools (to model state changes) requires separate add and remove operations. This has two drawbacks:

  1. One operation may fail, leaving us with duplicated or deleted environments.
  2. One operation may take a long time, temporarily leaving us in a state similar to the above.

This could be resolved by implementing a move operation that performs both parts of the move in a single commit/push.
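
As a rough sketch of what "both parts in a single commit/push" could look like (this is not the resource's actual implementation), the move can be expressed as one `git mv`, one commit, and one push. It assumes both pools live in the same git repository and branch, with locks stored as `<pool>/claimed/<lock>` and `<pool>/unclaimed/<lock>`. The function names `move_commands` and `run_move`, the `origin` remote, and the `master` default branch are all hypothetical choices made for illustration.

```python
import subprocess


def move_commands(lock, src_pool, dst_pool, branch="master"):
    """Build the git commands that move a claimed lock into another pool.

    Assumed layout (not confirmed by this issue): <pool>/claimed/<lock>
    and <pool>/unclaimed/<lock> inside a single git repository.
    """
    src = f"{src_pool}/claimed/{lock}"
    dst = f"{dst_pool}/unclaimed/{lock}"
    return [
        # The rename is staged as one change, so the add and the remove
        # end up in the same commit.
        ["git", "mv", src, dst],
        ["git", "commit", "-m", f"move {lock} from {src_pool} to {dst_pool}"],
        # A single push publishes the whole move or none of it.
        ["git", "push", "origin", branch],
    ]


def run_move(repo_dir, lock, src_pool, dst_pool, branch="master"):
    """Run the move inside an existing clone; raises if any command fails."""
    for cmd in move_commands(lock, src_pool, dst_pool, branch):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

If the push is rejected, the remote still holds the old state, so a caller could reset the clone and retry without ever exposing a duplicated or missing environment. This only works when the source and destination pools share a repository and branch; pools kept in separate repositories would still need two pushes.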

sc68cal commented May 18, 2020

I have a ConcourseCI pipeline that approximates this as closely as it can without atomic/guaranteed operations:

```yaml
---
resources:
  - name: every-24h
    type: time
    source: {interval: 24h}
    check_every: 12h

  - name: dirty-hardware
    type: pool

  - name: clean-hardware
    type: pool

jobs:

  - name: Pick up a dirty cluster and clean it up
    plan:
      - get: every-24h
        trigger: true

      - put: dirty-hardware
        params: {acquire: true}

      - task: Do work
          # Work here, but deleted for brevity

        on_failure:
          # Work attempt failed, release the lock and try again next time
          put: dirty-hardware
          params: {release: dirty-hardware}

      # Work succeeded, change the hardware state
      - put: clean-hardware
        params: {add: dirty-hardware}

      # Delete the old lock
      - put: dirty-hardware
        params: {remove: dirty-hardware}
```

I have had instances where ConcourseCI itself, under load, has had steps fail due to resource exhaustion, but they have been very rare.

I know it's not perfect, but this is what I ran into and how I attempted to solve it.
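
To make the failure window concrete, here is a small in-memory model of the add-then-remove sequence. It is only an illustration: the pools are plain Python sets rather than Concourse resources, and `add_then_remove`, its `fail_after_add` flag, and the lock name `env-1` are hypothetical.

```python
def add_then_remove(pools, lock, src, dst, fail_after_add=False):
    """Model the two separate puts: add to dst first, then remove from src."""
    pools[dst].add(lock)  # corresponds to the `add` put on the target pool
    if fail_after_add:
        # Stand-in for a step dying between the two puts.
        raise RuntimeError("step failed before the remove ran")
    pools[src].remove(lock)  # corresponds to the `remove` put on the source pool


pools = {"dirty-hardware": {"env-1"}, "clean-hardware": set()}

try:
    add_then_remove(pools, "env-1", "dirty-hardware", "clean-hardware",
                    fail_after_add=True)
except RuntimeError:
    pass

# The lock now exists in both pools until someone cleans it up by hand.
duplicated = pools["dirty-hardware"] & pools["clean-hardware"]
print(duplicated)  # {'env-1'}
```

Reversing the order (remove first, then add) would swap the duplicate for a lost environment, which is the other half of the problem described in the issue.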
