feat(api): Idempotent Resource Creation #298

Closed

@justenwalker justenwalker commented Jan 1, 2024

What

Based on [API] Idempotent Resource Creation Discussion.

This change adds an idempotencyKey to the API and database tables. It enables idempotent resource creation, preventing duplicates of the same resource when create requests are retried or sent concurrently.

Why

An idempotency key of some kind is necessary to prevent duplicates from being created by concurrent or retried requests.

For example, I often run into this scenario, especially when the Airbyte Server is under load and starts returning 504/502 errors:

  • Try to create a source/connection/destination/etc.
  • Get a 504/502 response back, indicating a timeout waiting for a server response.
  • Retry after some time.
  • Initial request succeeds and creates a source/connection/destination/etc.
  • Retried request also succeeds and creates a source/connection/destination/etc. with the same name.

The more retries, the more duplicates.

Since no resource has any unique constraint besides its UUID (which is generated on creation) -- the server will gladly accept multiple create requests for an identical resource.
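The failure mode above can be reproduced with a minimal in-memory sketch. `NaiveSourceStore` and its methods are hypothetical stand-ins for a create endpoint, not Airbyte code; the only assumption carried over from the description is that the UUID is generated on creation and nothing else is unique.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a create endpoint that has no idempotency key.
final class NaiveSourceStore {
    // Row id -> resource name. The id is generated server-side on every create.
    private final Map<UUID, String> namesById = new LinkedHashMap<>();

    // Always inserts: nothing ties a retried request back to the original one.
    UUID create(String name) {
        UUID id = UUID.randomUUID();
        namesById.put(id, name);
        return id;
    }

    long countByName(String name) {
        return namesById.values().stream().filter(name::equals).count();
    }

    public static void main(String[] args) {
        NaiveSourceStore store = new NaiveSourceStore();
        store.create("my-postgres"); // first attempt: succeeds, but the client sees a 504
        store.create("my-postgres"); // client retry: succeeds again
        System.out.println(store.countByName("my-postgres")); // prints 2
    }
}
```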

How

  1. Add a nullable uuid column called idempotency_key to the database tables of organizations, users, workspaces, connections, actors, and actor_definitions.
  2. Add a unique index on idempotency_key, both to prevent duplicate rows and to allow fast lookup by key.
  3. Modify the API to accept an idempotencyKey value on create for these resources.
  4. If a create request provides an idempotencyKey, look it up in the DB first and return the existing resource instead of performing another write.
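The lookup-then-create flow in the steps above might be sketched as follows. This is an in-memory illustration only, not the PR's implementation: `IdempotentWorkspaceStore` is a hypothetical name, and a `ConcurrentHashMap` plays the role that the unique index plays in the database.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model; the PR itself uses a nullable column plus a unique index.
final class IdempotentWorkspaceStore {
    private final Map<UUID, String> namesById = new ConcurrentHashMap<>();
    // Stands in for the unique index on idempotency_key.
    private final Map<UUID, UUID> idByIdempotencyKey = new ConcurrentHashMap<>();

    // idempotencyKey is optional; when null, every call creates a new row as before.
    UUID create(String name, UUID idempotencyKey) {
        if (idempotencyKey == null) {
            return insert(name);
        }
        // computeIfAbsent applies the insert at most once per key, even under concurrent calls.
        return idByIdempotencyKey.computeIfAbsent(idempotencyKey, key -> insert(name));
    }

    private UUID insert(String name) {
        UUID id = UUID.randomUUID();
        namesById.put(id, name);
        return id;
    }

    int size() {
        return namesById.size();
    }

    public static void main(String[] args) {
        IdempotentWorkspaceStore store = new IdempotentWorkspaceStore();
        UUID key = UUID.randomUUID(); // generated once by the client, reused on retry
        UUID first = store.create("analytics", key);
        UUID retry = store.create("analytics", key);
        System.out.println(first.equals(retry)); // prints true
        System.out.println(store.size());        // prints 1
    }
}
```

The client gets the same response whether or not the first attempt already succeeded, so the retry path needs no special handling.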

Recommended reading order

Database Migrations

  1. airbyte-db/db-lib/src/main/java/io/airbyte/db/instance/configs/migrations/V0_50_33_016__AddIdempotencyKeys.java
  2. airbyte-db/db-lib/src/main/resources/configs_database/schema_dump.txt
  3. airbyte-bootloader/src/test/java/io/airbyte/bootloader/BootloaderTest.java

API Changes

  1. airbyte-api/src/main/openapi/config.yaml
  2. airbyte-api/src/main/openapi/api.yaml

API Server Mappings

  1. airbyte-api-server/src/main/kotlin/io/airbyte/api/server/mappers/ConnectionCreateMapper.kt
  2. airbyte-api-server/src/main/kotlin/io/airbyte/api/server/services/DestinationService.kt
  3. airbyte-api-server/src/main/kotlin/io/airbyte/api/server/services/SourceService.kt
  4. airbyte-api-server/src/main/kotlin/io/airbyte/api/server/services/WorkspaceService.kt

Server Handlers

  1. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/ConnectionsHandler.java
  2. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/DestinationDefinitionsHandler.java
  3. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/DestinationHandler.java
  4. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/OrganizationsHandler.java
  5. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/SourceDefinitionsHandler.java
  6. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/SourceHandler.java
  7. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/UserHandler.java
  8. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/WebBackendConnectionsHandler.java
  9. airbyte-commons-server/src/main/java/io/airbyte/commons/server/handlers/WorkspacesHandler.java
  10. airbyte-commons-server/src/test/java/io/airbyte/commons/server/handlers/DestinationDefinitionsHandlerTest.java
  11. airbyte-commons-server/src/test/java/io/airbyte/commons/server/handlers/SourceDefinitionsHandlerTest.java
  12. airbyte-commons-server/src/test/java/io/airbyte/commons/server/handlers/WebBackendConnectionsHandlerTest.java

Config Model Updates

  1. airbyte-config/config-models/src/main/resources/types/DestinationConnection.yaml
  2. airbyte-config/config-models/src/main/resources/types/IdempotencyKey.yaml
  3. airbyte-config/config-models/src/main/resources/types/Organization.yaml
  4. airbyte-config/config-models/src/main/resources/types/SourceConnection.yaml
  5. airbyte-config/config-models/src/main/resources/types/StandardDestinationDefinition.yaml
  6. airbyte-config/config-models/src/main/resources/types/StandardSourceDefinition.yaml
  7. airbyte-config/config-models/src/main/resources/types/StandardSync.yaml
  8. airbyte-config/config-models/src/main/resources/types/StandardWorkspace.yaml
  9. airbyte-config/config-models/src/main/resources/types/User.yaml

Persistence Changes

  1. airbyte-config/config-persistence/src/main/java/io/airbyte/config/persistence/OrganizationPersistence.java
  2. airbyte-config/config-persistence/src/main/java/io/airbyte/config/persistence/UserPersistence.java
  3. airbyte-data/src/main/java/io/airbyte/data/services/ConnectionService.java
  4. airbyte-data/src/main/java/io/airbyte/data/services/DestinationService.java
  5. airbyte-data/src/main/java/io/airbyte/data/services/SourceService.java
  6. airbyte-data/src/main/java/io/airbyte/data/services/WorkspaceService.java
  7. airbyte-data/src/main/java/io/airbyte/data/services/impls/jooq/ConnectionServiceJooqImpl.java
  8. airbyte-data/src/main/java/io/airbyte/data/services/impls/jooq/DestinationServiceJooqImpl.java
  9. airbyte-data/src/main/java/io/airbyte/data/services/impls/jooq/OrganizationServiceJooqImpl.java
  10. airbyte-data/src/main/java/io/airbyte/data/services/impls/jooq/SourceServiceJooqImpl.java
  11. airbyte-data/src/main/java/io/airbyte/data/services/impls/jooq/WorkspaceServiceJooqImpl.java

Can this PR be safely reverted / rolled back?

  • YES 💚
  • NO ❌

Contains API changes and DB migrations.

🚨 User Impact 🚨

This shouldn't be a breaking change: the new idempotencyKey is an optional parameter, and behavior should be unchanged when it is omitted from API calls.


sonarqubecloud bot commented Jan 1, 2024

Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

5 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@JonsSpaghetti
Contributor

Hi @justenwalker, thank you for your contribution! We like what this change allows and discussed internally - our one question is: was there a reason a new column was created instead of using the already existing id column? That serves as our PK in our database currently and it makes sense to add that to the data models and use it as our idempotency key for this feature.

@justenwalker
Contributor Author

@JonsSpaghetti

    Hi @justenwalker, thank you for your contribution! We like what this change allows and discussed internally - our one question is: was there a reason a new column was created instead of using the already existing id column? That serves as our PK in our database currently and it makes sense to add that to the data models and use it as our idempotency key for this feature.

If you check the discussion, that was an option I presented. However, my concerns are:

  1. Allowing the user to supply a PK might have security implications that I cannot foresee.
  2. A separate key could also be used to make some update operations idempotent: the ID cannot change, but the idempotency key is free to change.
  3. The existence of https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/workspaces/create_if_not_exist seems to imply that a user-supplied ID is only for testing and is not a production-intended use case:

    Creates a workspace with an explicit workspace ID. This should be use in acceptance tests only.

  4. The idempotency key is more ergonomic for the user: the response is the same whether or not there is a collision, which makes the client's job a lot easier. Supplying the PK means that the ID-conflict error case must be handled separately.
  5. There is some nuance around actors and actor_definitions: their idempotency key is compound with their actor_type, not just another PK. This is needed to distinguish between the two types, source and destination. Otherwise, a source create could conflict with a destination create and vice versa. If there were separate tables for sources/destinations and their definitions, this would be less of an issue.
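To illustrate point 5, here is a small in-memory sketch of a compound key. `ActorStore`, `ActorKey`, and `ActorType` are hypothetical names for illustration; the map stands in for a unique index over (actor_type, idempotency_key), and the sketch assumes a non-null key.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of a unique index on (actor_type, idempotency_key).
final class ActorStore {
    enum ActorType { SOURCE, DESTINATION }

    // Compound key: the same UUID under a different actor type is a different key.
    record ActorKey(ActorType actorType, UUID idempotencyKey) {}

    private final Map<ActorKey, UUID> idByKey = new ConcurrentHashMap<>();

    // Returns the existing actor id for this (type, key) pair, or creates a new one.
    UUID create(ActorType type, UUID idempotencyKey) {
        return idByKey.computeIfAbsent(new ActorKey(type, idempotencyKey), k -> UUID.randomUUID());
    }

    public static void main(String[] args) {
        ActorStore store = new ActorStore();
        UUID key = UUID.randomUUID();
        UUID source = store.create(ActorType.SOURCE, key);
        UUID destination = store.create(ActorType.DESTINATION, key);
        // Same idempotency key, different actor type: two distinct actors.
        System.out.println(source.equals(destination)); // prints false
        // Same key and same type: the existing actor is returned.
        System.out.println(source.equals(store.create(ActorType.SOURCE, key))); // prints true
    }
}
```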

final String indexName = String.format("%s_actor_type_idempotency_key", table);
final Field<UUID> idempotencyKey = DSL.field(columnName, SQLDataType.UUID.nullable(true));
ctx.alterTable(table).addColumnIfNotExists(idempotencyKey).execute();
ctx.createUniqueIndex(indexName).on(table, columnName, "actor_type").execute();
Contributor Author

Perhaps actor_type should come first in the index, ahead of idempotency_key, since actor_type acts as a kind of namespace for the idempotency_key.

Comment on lines +250 to +252
} catch (ConfigNotFoundException e) {
// not found, so must create
}
Contributor Author

I'm not too happy with this. I suspect this situation cannot actually occur, since the existence of the sourcedef implies it has at least one version (I would think). But I have to catch the exception because buildSourceDefinitionRead throws it.

Comment on lines +250 to +252
} catch (ConfigNotFoundException e) {
// not found, so must create
}
Contributor Author

I'm not too happy with this. I suspect this situation cannot actually occur, since the existence of the destdef implies it has at least one version (I would think). But I have to catch the exception because buildDestinationDefinitionRead throws it.

4 participants