Spark 3.5: Implement RewriteTablePath #11555

Merged
merged 8 commits into from
Jan 8, 2025
Changes from 7 commits
25 changes: 19 additions & 6 deletions api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java
@@ -38,9 +38,10 @@
* <ol>
* <li>The name of the latest metadata.json rewritten to staging location. After the files are
* copied, this will be the root of the copied table.
- * <li>A list of all files added to the table between startVersion and endVersion, including their
- * original and target paths under the target prefix. This list covers both original and
- * rewritten files, allowing for copying to the target paths to form the copied table.
+ * <li>A 'copy-plan'. This is a list of all files added to the table between startVersion and
+ * endVersion, including their original and target paths under the target prefix. This list
+ * covers both original and rewritten files, allowing for copying a functioning version of the
+ * source table to the target prefix.
* </ol>
*/
public interface RewriteTablePath extends Action<RewriteTablePath, RewriteTablePath.Result> {
@@ -91,9 +92,21 @@ interface Result {
String stagingLocation();

/**
- * Path to a comma-separated list of source and target paths for all files added to the table
- * between startVersion and endVersion, including original data files and metadata files
- * rewritten to staging.
+ * Result file list location. This file contains a listing of all files added to the table
+ * between startVersion and endVersion, comma-separated. <br>
+ * For each file, it will include the source path (either the original path in the table, or in
+ * the staging location if rewritten), and the target path (under the new prefix).
+ *
+ * <p>Example file content:
+ *
+ * <pre><code>
+ * sourcepath/datafile1.parquet,targetpath/datafile1.parquet
+ * sourcepath/datafile2.parquet,targetpath/datafile2.parquet
+ * stagingpath/manifest.avro,targetpath/manifest.avro
+ * </code></pre>
+ *
+ * <br>
+ * This allows for copying a functioning version of the table to the target prefix.
*/
String fileListLocation();
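
For readers wondering how the file behind `fileListLocation()` might be consumed, here is a minimal sketch that is not part of this PR. It assumes the format shown in the Javadoc example (one `source,target` pair per line), that paths contain no commas, and that both sides live on a local filesystem. The names `CopyPlanExample`, `CopyEntry`, `parse`, and `copyAll` are hypothetical; a real deployment would more likely hand the list to a distributed copy tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Iceberg: reads a copy-plan and copies files locally.
public class CopyPlanExample {

  // One line of the copy-plan: where the file is now, and where it should end up.
  public record CopyEntry(String source, String target) {}

  // Parses "source,target" lines. Assumes paths contain no commas; blank lines are skipped.
  public static List<CopyEntry> parse(List<String> lines) {
    List<CopyEntry> plan = new ArrayList<>();
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      int comma = trimmed.indexOf(',');
      if (comma < 0) {
        throw new IllegalArgumentException("Expected 'source,target' but got: " + trimmed);
      }
      plan.add(new CopyEntry(trimmed.substring(0, comma), trimmed.substring(comma + 1)));
    }
    return plan;
  }

  // Copies every entry, creating target directories as needed (local filesystem only).
  public static void copyAll(List<CopyEntry> plan) throws IOException {
    for (CopyEntry entry : plan) {
      Path target = Path.of(entry.target());
      if (target.getParent() != null) {
        Files.createDirectories(target.getParent());
      }
      Files.copy(Path.of(entry.source()), target, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

Because the plan already points rewritten metadata at its staging path and untouched data files at their original path, a consumer like this does not need to know which files were rewritten.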
