Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider removing global state from the API #30

Open
t0yv0 opened this issue Feb 1, 2022 · 9 comments
Open

Consider removing global state from the API #30

t0yv0 opened this issue Feb 1, 2022 · 9 comments
Labels
area/sdks SDKs kind/enhancement Improvements or new features

Comments

@t0yv0
Copy link
Member

t0yv0 commented Feb 1, 2022

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Global state is inherited from the C# translation. It primarily is a Deployment instance object and, indirectly, Stack. If this can be avoided by explicit param passing or other means, we get the ability for several Pulumi deployments coexisting peacefully in the same JVM process. While we do not know use cases for this yet, it may become desirable someday - just a general design sense.

Notes from Pawel (from #1):


- global state of the deployment instance is essentially a god object,
  but there might be ways to avoid that, e.g. by using a state machine
  to separate uninitialized deployment and initialized deployment as
  separate types, here is the code:
  https://github.com/pulumi/pulumi-java/blob/d302f9c049ac7a9653d56e9073ca3f091be0ace9/sdk/jvm/pulumi/src/main/java/io/pulumi/deployment/internal/DeploymentInstanceHolder.java#L16

- related to the above, Stack has this in the constructor:
https://github.com/pulumi/pulumi-java/blob/d302f9c049ac7a9653d56e9073ca3f091be0ace9/sdk/jvm/pulumi/src/main/java/io/pulumi/Stack.java#L80

- C# has AsyncLocal, as an alternative, we could look into one of those:
    - https://github.com/eclipse/microprofile-context-propagation
    - https://github.com/tunnelvisionlabs/java-threading/blob/c4fe47cfb25f724f897aa2cf6ea63d9c95aecee4/src/com/tunnelvisionlabs/util/concurrent/AsyncLocal.java

Affected area/feature

@t0yv0 t0yv0 added the kind/enhancement Improvements or new features label Feb 1, 2022
@pawelprazak
Copy link
Contributor

pawelprazak commented Feb 1, 2022

Yes, this would at least make unit testing much easier, right now complex mocks and spies are needed in many places, because of hidden dependencies on the static global deployment instance, hidden in constructors and various other places.

The simplest solution I can think of is "manual dependency injection" a.k.a. just passing dependencies in constructor. It would be more "best practices" and less "tricky" Java.

As for the use case, one that I had in mind is using Automation API to provide support for complex integration tests and e2e tests, that depend on infrastructure, directly from java code, using either docker and kind, or "real" infrastructure.
I've had to implement this in the past many times on big projects and it was fragile and expensive. IMO this would be a killer feature, a thin library for setting up infra as code in your tests. A common practice I see is to have a "dev infra" using something like Testcontainers or docker compose and an ops infra for cloud environments, this can lead to duplication and consistency issues and slow down team velocity.
This feature would obviously benefit from running those tests in parallel and therefore getting rid of the global state.

@t0yv0
Copy link
Member Author

t0yv0 commented Feb 1, 2022

Right. If we follow the way other languages do Automation API right now the script written against Automation API would get its own separate JVM, and every up/destroy/etc command would get a separate JVM as well. This way global state is isolated, but JVM startup cost and overhead is paid multiple times.

@pawelprazak
Copy link
Contributor

pawelprazak commented Feb 1, 2022 via email

@t0yv0
Copy link
Member Author

t0yv0 commented Apr 8, 2022

Decided on an approach with ThreadLocal+manual propagation in the WIP PR #323

@cowwoc
Copy link

cowwoc commented Oct 24, 2024

FYI, the aforementioned PR was closed on May 2022. There doesn't seem to be any progress on this issue.

@t0yv0
Copy link
Member Author

t0yv0 commented Oct 29, 2024

You can help us prioritize this by upvoting and explaining your use case! Thank you!

@mikhailshilkov
Copy link
Member

This issue is relevant again, so I will try to provide some context.

Pulumi SDKs have a notion of Deployment, that helps coordinate communications with the engine and tracks deployment progression. As two examples: every resource constructor invokes a registerResource RPC call, and every Output needs to be tracked until it resolves. Both of those flows will use a Deployment to "register" itself into a single coherent program execution flow.

Therefore, an instance of Deployment has to be available to SDK classes like Resource and Output constructors. The problem is that those SDK objects are instantiated by a user in their code, and we don't provide a mechanism to pass an explicit Deployment-related context (as e.g. in Go). In our current implementation of the Java SDK, we use DeploymentInstanceHolder to resolve a Deployment, which is a singleton object.

That works ok for Pulumi programs, because there is always a single instance of a Deployment that needs to exist in a given process. However, this is not the case for other scenarios:

  • Integration tests that we want to run in parallel
  • Automation API where users may define multiple inline Pulumi programs in the same Java program
  • Provider SDK where a Construct method may be called multiple times from the same process

This problem is not unique to Java, and for other languages we found language-specific solutions. For example, the closest "relative" of our Java SDK - .NET SDK - uses AsyncLocal<T> which holds a deployment for the entire call tree that is spawn from a single async root. Java 11 doesn't seem to have a similar concept that would work across CompletableFuture's. We likely need to use ThreadLocal and propagate context across thread boundaries that we control.

A prototype of this has been created in #323. The PR history isn't visible anymore but the code still exists in https://github.com/pulumi/pulumi-java/blob/1cfe305e348c672c7d0c896a4c4c3bf7057f2635, e.g. CurrentDeployment and its usage. We may need to revive that prototype and see if we can land it reliably.

@t0yv0
Copy link
Member Author

t0yv0 commented Feb 4, 2025

The code in #323 uses a ThreadLocal combined with wrapping all known callbacks received from the user to run in context where the ThreadLocal is set correctly. This works as you would expect for applyValue and other Output combinators, but it is not entirely foolproof and it would break down in advanced scenarios.

For example, in scenarios where users obtain a raw CompletableFuture, for example from an invoke, and then attempt to do Pulumi operations in a callback where the ThreadLocal is not set correctly. This code would fail to look up the Deployment.

future.thenAccept(result -> new Bucket()); 

More advanced scenarios where users manage their own threadpools could also be affected.

At the time of writing #323 we thought it might be viable to accept this behavior and compensate with documentation pointing users to APIs they need to call to obtain a Deployment handle and pass it around explicitly for these advanced use cases.

@davidemassarenti-optio3
Copy link

For invoke, we can wrap the result in a Future that resets the deployment. For a custom thread pool, no good solution short of providing a PulumiThreadPool helper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sdks SDKs kind/enhancement Improvements or new features
Projects
None yet
Development

No branches or pull requests

5 participants