Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recover partitions for Spark #136

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jarno-r
Copy link

@jarno-r jarno-r commented Mar 1, 2022

Description & motivation

resolves: #126
Fix for issue Running RECOVER PARTITIONS without defining partitions #126

Add 'recover_partitions' option for Spark to run ALTER TABLE RECOVER PARTITIONS even if partitions are not explicitly specified. This makes using inferred partitions possible.

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

@jarno-r jarno-r requested a review from jtcohen6 as a code owner March 1, 2022 06:50
@jarno-r jarno-r force-pushed the recover_partitions branch from 41a84ea to fe7a599 Compare March 1, 2022 07:07
Add 'recover_partitions' option for Spark to run ALTER TABLE RECOVER PARTITIONS even if partitions are not explicitly specified. This makes using inferred partitions possible.
@jarno-r jarno-r force-pushed the recover_partitions branch from fe7a599 to fbc766f Compare March 1, 2022 07:09
@jarno-r
Copy link
Author

jarno-r commented Mar 1, 2022

I'm not sure the integration test is meaningful, since the test seed does not contain the partition column. That also applies to the previous people_csv_partitioned_using test. If 'section' was added to the people.csv, the two tests could be updated to include the partitioning.

@jtcohen6
Copy link
Collaborator

jtcohen6 commented Mar 2, 2022

Thanks @jarno-r!

If 'section' was added to the people.csv, the two tests could be updated to include the partitioning.

Could you say slightly more about that? Would section need to be included in the contents of the CSVs themselves? It's already included in the file paths, using Hive-style formatting: https://dbt-external-tables-testing.s3.us-east-2.amazonaws.com/

@jarno-r
Copy link
Author

jarno-r commented Mar 4, 2022

If people.csv also had 'section' column, the two tests could be changed to include 'section' in 'compare_columns' . Then both tests would show that the partitions are read correctly.

@jarno-r
Copy link
Author

jarno-r commented Aug 1, 2022

@jtcohen6 Could this be merged?
Adding the section to people.csv would somewhat improve the test coverage, but it is not necessary.

@jarno-r jarno-r requested a review from jeremyyeo as a code owner April 13, 2023 08:58
@github-actions
Copy link

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 11, 2023
@jarno-r
Copy link
Author

jarno-r commented Oct 11, 2023

This is still relevant. FYI, the integration-snowflake test is failing for reasons that have nothing to do with this PR. I would probably succeed if retriggered.

@github-actions github-actions bot removed the Stale label Oct 12, 2023
@jarno-r
Copy link
Author

jarno-r commented Jan 2, 2024

@jeremyyeo Could this be merged?

Copy link

github-actions bot commented Oct 9, 2024

This PR has been marked as Stale because it has been open with no activity as of late. If you would like the PR to remain open, please comment on the PR or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 9, 2024
@jarno-r
Copy link
Author

jarno-r commented Oct 9, 2024

This is still relevant.

@github-actions github-actions bot removed the Stale label Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running RECOVER PARTITIONS without defining partitions
3 participants