Destination Databricks: Add generation_id+sync_id #40689

Merged
edgao merged 2 commits into master from edgao/databricks_generation_id on Jul 15, 2024

Conversation

edgao
Contributor

@edgao edgao commented Jul 2, 2024

closes https://github.com/airbytehq/airbyte-internal-issues/issues/8533. We'll release this as a breaking change, so no migration necessary.

stacked on #40567 to pull in a fix for one of the sqlgenerator test cases.


vercel bot commented Jul 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 12, 2024 9:53pm

@edgao edgao force-pushed the edgao/databricks_generation_id branch from 3c92fd1 to 5ea3ddb Compare July 2, 2024 23:18
@edgao edgao force-pushed the edgao/databricks_generation_id branch 2 times, most recently from 4aa4151 to d7589bf Compare July 3, 2024 19:58
@edgao edgao changed the base branch from edgao/redshift_refreshes to edgao/databricks_password_auth July 3, 2024 19:58
@edgao edgao force-pushed the edgao/databricks_password_auth branch from 6ceee39 to a7384fd Compare July 3, 2024 21:08
@edgao edgao force-pushed the edgao/databricks_generation_id branch from d7589bf to ee4d4d4 Compare July 3, 2024 21:08
@edgao edgao force-pushed the edgao/databricks_password_auth branch from a7384fd to aa5f4d5 Compare July 3, 2024 21:10
@edgao edgao force-pushed the edgao/databricks_generation_id branch from ee4d4d4 to 0c533f1 Compare July 3, 2024 21:10
@edgao edgao force-pushed the edgao/databricks_password_auth branch from aa5f4d5 to 3212d3b Compare July 3, 2024 21:13
@edgao edgao force-pushed the edgao/databricks_generation_id branch from 0c533f1 to 100d7ad Compare July 3, 2024 21:13
@edgao edgao force-pushed the edgao/databricks_password_auth branch from 3212d3b to 035032d Compare July 3, 2024 21:51
@edgao edgao force-pushed the edgao/databricks_generation_id branch from 100d7ad to 8d8804d Compare July 3, 2024 21:51
@edgao edgao force-pushed the edgao/databricks_password_auth branch 2 times, most recently from f8228c3 to 06ee74e Compare July 5, 2024 22:54
@edgao edgao force-pushed the edgao/databricks_generation_id branch 2 times, most recently from a08885a to 859842c Compare July 9, 2024 16:58
@edgao edgao changed the base branch from edgao/databricks_password_auth to edgao/redshift_refreshes July 9, 2024 16:58
@edgao edgao force-pushed the edgao/redshift_refreshes branch from 68d8bb2 to 3314670 Compare July 9, 2024 17:11
@edgao edgao force-pushed the edgao/databricks_generation_id branch from 859842c to 76745ef Compare July 9, 2024 17:11
@edgao edgao changed the title Destination Databricks: Add generation_id Destination Databricks: Add generation_id+sync_id Jul 9, 2024
@edgao edgao force-pushed the edgao/databricks_generation_id branch 2 times, most recently from b4d4784 to 6dfd22e Compare July 9, 2024 17:54
@edgao edgao force-pushed the edgao/databricks_generation_id branch from 43e5322 to cfa19a5 Compare July 10, 2024 17:32
const val AB_EXTRACTED_AT = constants.COLUMN_NAME_AB_EXTRACTED_AT
const val AB_LOADED_AT = constants.COLUMN_NAME_AB_LOADED_AT
const val AB_DATA = constants.COLUMN_NAME_DATA
const val AB_META = constants.COLUMN_NAME_AB_META
Contributor Author

@edgao edgao Jul 10, 2024

I could be convinced to undo this, or even just use the actual constant names everywhere, but defining consts to alias other constants feels weird, so I replaced those with import aliases.

(also, fyi constants was previously an import alias for JavaBaseConstants)
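
For readers unfamiliar with the two styles being compared, here is a minimal sketch of the difference. It is not the connector's actual code: `kotlin.math.PI` stands in for `JavaBaseConstants.COLUMN_NAME_AB_META` so the snippet compiles without the Airbyte CDK, and the names `PI_ALIAS`, `OldStyleConstants`, and `PI_COPY` are hypothetical.

```kotlin
// New style: an import alias. No new declaration is created; PI_ALIAS is
// just another name for kotlin.math.PI within this file.
import kotlin.math.PI as PI_ALIAS

// Old style (hypothetical stand-in): a const val whose only job is to
// re-export another constant under a shorter name.
object OldStyleConstants {
    const val PI_COPY = kotlin.math.PI
}

fun main() {
    // Both names resolve to the same value; only the import alias avoids
    // declaring a second constant.
    println(PI_ALIAS == OldStyleConstants.PI_COPY) // prints "true"
}
```

One trade-off: an import alias is visible only in the file that declares it, so other files cannot reference it the way they could reference a `const val` on a companion object.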

@edgao edgao force-pushed the edgao/databricks_generation_id branch 3 times, most recently from 92f605e to eb7e421 Compare July 10, 2024 20:45
private val databaseName: String,
private val jdbcDatabase: JdbcDatabase,
) : DestinationHandler<MinimumDestinationState.Impl> {

private val log = KotlinLogging.logger {}
private val abRawId = DatabricksSqlGenerator.AB_RAW_ID
private val abExtractedAt = DatabricksSqlGenerator.AB_EXTRACTED_AT
private val abMeta = DatabricksSqlGenerator.AB_META
Contributor Author

see DatabricksSqlGenerator for more explanation of what these constants were / why I'm removing them

@edgao edgao force-pushed the edgao/databricks_generation_id branch 2 times, most recently from 49c5c79 to f7c724a Compare July 11, 2024 17:54
# increased timeout required during parallel workload on our small warehouse
JunitMethodExecutionTimeout=10 m
Contributor Author

saw some timeouts in CI. I'm pretty sure that's a consequence of having two stacked PRs (i.e. our cluster just isn't beefy enough to run multiple PRs' tests at the same time); if I push a single branch at a time, we don't hit the timeout.

(not opinionated on whether we should just increase our cluster size, I don't know how databricks bills us)

@@ -1,8 +0,0 @@
{"id1": 1, "id2": 100, "updated_at": "2023-01-01T01:00:00.000000Z", "array": ["foo"], "struct": {"foo": "bar"}, "string": "foo", "number": 42.1, "integer": 42, "boolean": true, "timestamp_with_timezone": "2023-01-23T12:34:56.000000Z", "timestamp_without_timezone": "2023-01-23T12:34:56", "time_with_timezone": "12:34:56Z", "time_without_timezone": "12:34:56", "date": "2023-01-23", "unknown": {}, "_airbyte_extracted_at": "2023-01-01T00:00:00.000000Z", "_airbyte_meta": {"changes": []}}
Contributor Author

there's no v1->v2 migration test (since we never implemented the v1->v2 migration to begin with), so I'm deleting these files

@edgao edgao marked this pull request as ready for review July 11, 2024 19:54
@edgao edgao requested a review from a team as a code owner July 11, 2024 19:54
@johnny-schmidt johnny-schmidt self-assigned this Jul 12, 2024
@edgao edgao force-pushed the edgao/redshift_refreshes branch 2 times, most recently from ed23537 to 4b5fff5 Compare July 12, 2024 17:59
Contributor

@johnny-schmidt johnny-schmidt left a comment

LGTM, one question

Base automatically changed from edgao/redshift_refreshes to master July 12, 2024 18:23
@edgao edgao force-pushed the edgao/databricks_generation_id branch 2 times, most recently from 230bd8f to c5d7f91 Compare July 12, 2024 18:43
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Jul 12, 2024
@edgao
Contributor Author

edgao commented Jul 12, 2024

@airbytehq/destinations I forgot to write the breaking change text 🤦 can I get a review on the upgrade guide stuff?

@edgao edgao force-pushed the edgao/databricks_generation_id branch from c5d7f91 to c019d19 Compare July 12, 2024 21:49
@edgao edgao merged commit 4ad05cd into master Jul 15, 2024
33 checks passed
@edgao edgao deleted the edgao/databricks_generation_id branch July 15, 2024 15:21
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/destination/databricks
3 participants