How to overwrite batch transform output in S3 #68
Comments
Hi @BaoshengHeTR, are you using the Python SDK? If so, and you use the same path (https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/transformer.py#L59) across multiple runs, the results will be stored in the same location in S3.
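For concreteness, here is a minimal sketch of reusing one output_path across runs with the SageMaker Python SDK's Transformer; the model name, bucket, instance type, and input prefix below are hypothetical:

```python
from sagemaker.transformer import Transformer

# Reusing the same output_path across runs writes all results under the
# same S3 prefix; new runs do NOT clear objects left by previous runs.
transformer = Transformer(
    model_name="my-model",                       # hypothetical model name
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/batch-output/",  # hypothetical bucket/prefix
)
transformer.transform(data="s3://my-bucket/batch-input/")  # hypothetical input
```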
Yes. Doing it that way appends new results to the old ones, right? So can we set up an overwrite option? Like in Spark, where we have write.mode("overwrite").
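For reference, the Spark behavior being asked for looks like this; a minimal PySpark sketch, where the output path is hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("overwrite-demo").getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# "overwrite" replaces whatever already exists at the path,
# instead of accumulating output from earlier runs.
df.write.mode("overwrite").parquet("s3://my-bucket/batch-output/")  # hypothetical path
```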
Any update on this? I also need an overwrite mode, especially when the input S3 path is the output of a Spark job.
Same issue here. It would be ideal to be able to overwrite previous results from batch inference instead of appending to them, and to have the same feature for processing jobs.
Throwing in another vote for this functionality. We had to modify our Airflow task to clean the directory before starting the prediction task, but it'd be nicer to be able to use .mode("overwrite") instead. |
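That workaround can be as simple as deleting everything under the output prefix before launching the job. A minimal boto3 sketch, where the bucket and prefix are hypothetical:

```python
import boto3

def clear_s3_prefix(bucket: str, prefix: str) -> None:
    """Delete every object under `prefix` so the next batch
    transform run starts from an empty output location."""
    s3 = boto3.resource("s3")
    s3.Bucket(bucket).objects.filter(Prefix=prefix).delete()

# Call this before transformer.transform(...), e.g. from an Airflow task:
clear_s3_prefix("my-bucket", "batch-output/")  # hypothetical bucket/prefix
```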
I did not find documentation on overwriting batch transform output. If I run the same batch transform job multiple times over time, how should I set the transformer to overwrite the output results (i.e., without changing the output_path)?