gsutil_cp errors with large vector #43

Open
LiNk-NY opened this issue Dec 29, 2021 · 7 comments

Comments

@LiNk-NY
Contributor

LiNk-NY commented Dec 29, 2021

Hi Martin, @mtmorgan

I am unable to see the error message from gsutil -m cp; it is somehow cut off:

    gsutil_cp(assayfiles, rpath)
Error: 'gsutil -m cp gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-10A-01D-A36Z-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-01A-11D-A36W-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-10A-01D-A36Z-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-01A-11D-A36W-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__geno

The vector assayfiles is of length 918. It works when I break the vector up into three components:

splits <- cut(seq_along(assayfiles), 3, labels = letters[1:3])
lapply(split(assayfiles, splits), gsutil_cp, destination = rpath)

I suspect it has something to do with the length?
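
If the total length of the command is the cause, splitting by size rather than into a fixed number of pieces might be more robust. Below is a minimal sketch of that idea; `chunk_by_bytes()` and its default `max_bytes` are hypothetical and not part of the package, and the threshold is only a guess at a safe value.

```r
# Hypothetical helper (not part of any package): split a character vector
# into consecutive chunks whose combined size, counted in bytes plus one
# separating space per element, stays at or below `max_bytes`.
chunk_by_bytes <- function(x, max_bytes = 100000L) {
    stopifnot(is.character(x), max_bytes > 0)
    sizes <- nchar(x, type = "bytes") + 1L   # +1 for the separating space
    group <- integer(length(x))
    current <- 1L
    running <- 0
    for (i in seq_along(x)) {
        # Start a new chunk when adding this element would exceed the budget;
        # a single oversized element still gets a chunk of its own.
        if (running > 0 && running + sizes[i] > max_bytes) {
            current <- current + 1L
            running <- 0
        }
        running <- running + sizes[i]
        group[i] <- current
    }
    unname(split(x, group))
}

# Usage, assuming `assayfiles` and `rpath` as in the report above:
# for (chunk in chunk_by_bytes(assayfiles)) gsutil_cp(chunk, rpath)
```

Each chunk then becomes its own gsutil_cp() call, so no single command line grows with the length of the full vector.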

Here is a text file, if you'd like to test with the vector:
assayfiles.txt

@mtmorgan
Collaborator

mtmorgan commented Dec 29, 2021

This uses system2() and then R's condition handling. There appears to be a limit on the length of the displayed condition message (described in ?stop, 1000 characters by default), and apparently also on the length of the message itself:

> assays = readLines("~/Downloads/assayfiles.txt")
> stop(assays)
Error: gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-10A-01D-A36Z-01.seg.txtgs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-01A-11D-A36W-01.seg.txtgs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-10A-01D-A36Z-01.seg.txtgs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-01A-11D-A36W-01.seg.txtgs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__bro
> nchar(tryCatch(stop(assays), error =identity))
message    call
   8190      50

I'm not sure if that limit is R or gsutil... what happens when you run gsutil -m cp <content of assayfiles as single test string> at the command line?

@LiNk-NY
Contributor Author

LiNk-NY commented Dec 30, 2021

I am unable to paste the command into the RStudio terminal on Terra. Some mechanism cancels the long paste with Ctrl+C:

[...]
gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.422.2009.0/TCG^CCaught CTRL-C (signal 2) - exiting

Is there a way to ssh into these instances?

@LiNk-NY
Contributor Author

LiNk-NY commented Dec 30, 2021

I created a shell script containing the command in RStudio. Running it via the Run button failed: that button seems to copy and paste the text into the terminal, which hits the same Ctrl+C issue. BUT the Run Script button, which calls system("./test.sh"), runs it successfully.

@mtmorgan
Collaborator

mtmorgan commented Jan 2, 2022

Can you try the gcloud-error-abbrev-command branch? It truncates the command when reporting the error, so there should be enough space for the text of the message to be visible.
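
For readers following along, the idea of truncating the command before it goes into the error message can be sketched roughly as follows. This is not the code on the branch; `abbreviate_command()` and `stop_command_failed()` are hypothetical names, and the width of 200 characters is an arbitrary choice.

```r
# Hypothetical helper: shorten a command string to at most `width`
# characters, marking the cut with a trailing "...".
abbreviate_command <- function(cmd, width = 200L) {
    stopifnot(is.character(cmd), length(cmd) == 1L, width > 3L)
    if (nchar(cmd) <= width)
        return(cmd)
    paste0(substr(cmd, 1L, width - 3L), "...")
}

# Hypothetical wrapper: put the shortened command first and the reason
# after it, so the reason is not pushed past the message length limit.
stop_command_failed <- function(cmd, reason) {
    stop(
        "'", abbreviate_command(cmd), "' failed:\n  ", reason,
        call. = FALSE
    )
}
```

With a command of several thousand characters, the message produced by stop_command_failed() stays short enough that the reason remains visible at the end.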

@LiNk-NY
Contributor Author

LiNk-NY commented Jan 3, 2022

Thanks Martin! This does not work for me.
It seems the code that truncates the command should either be inside the tryCatch function, or the stop within the error handler should be removed, so that the output gets formatted.

@mtmorgan
Collaborator

mtmorgan commented Jan 3, 2022

Yes, thanks, making that change now leads to

Error: 'gsutil -m cp 'gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3....' failed:

  cannot popen ''/usr/lib/google-cloud-sdk/bin/gsutil' -m cp 'gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-10A-01D-A36Z-01.seg.txt' 'gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-01A-11D-A36W-01.seg.txt' 'gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-10A-01D-A36Z-01.seg.txt' 'gs://firecloud-tcga-

where this appears to come from https://github.com/wch/r-source/blob/02e4ee85542c87153eab3bed860fdd6ffd9920be/src/unix/sys-unix.c#L733 which suffers from the same problem -- a long command hides the 'probable reason'.

Reported here: https://bugs.r-project.org/show_bug.cgi?id=18274. It seems very difficult to find out what the underlying reason is (file or memory limits on the user?) without getting the 'probable reason'.

@mtmorgan
Collaborator

mtmorgan commented Jan 3, 2022

Here's a simpler example with the system2() call like that in gsutil_cp()

> system2("echo", c("-m", "cp", assay), stderr = TRUE, stdout = TRUE)
Error in system2("echo", c("-m", "cp", assay), stderr = TRUE, stdout = TRUE) :
  cannot popen ''echo' -m cp gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-10A-01D-A36Z-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-3L-AA1B-01A-11D-A36W-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-10A-01D-A36Z-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc/coad/snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg18__seg/broad.mit.edu_COAD.Genome_Wide_SNP_6.Level_3.385.2008.0/TCGA-4N-A93T-01A-11D-A36W-01.seg.txt gs://firecloud-tcga-open-access/tcga/dcc
>

But running without internalizing stderr / stdout gives

> system2("echo", c("-m", "cp", assay))
Warning messages:
1: In system2("echo", c("-m", "cp", assay)) :
  system call failed: Argument list too long
2: In system2("echo", c("-m", "cp", assay)) : error in running command
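
"Argument list too long" is the operating system's message for a command that exceeds its argument size limits. A rough pre-flight check could look like the sketch below. It rests on two assumptions that are not confirmed in this thread: that the whole command reaches the shell as one string, and that Linux caps a single argument string at 131072 bytes. `command_bytes()` and `exceeds_arg_limit()` are hypothetical helpers, and the size is an estimate because it ignores any quoting that system2() adds.

```r
# Hypothetical helper: approximate size, in bytes, of the command string
# built from `command` and `args` joined by single spaces.
command_bytes <- function(command, args) {
    nchar(paste(c(command, args), collapse = " "), type = "bytes")
}

# Hypothetical check: compare the estimate with an assumed per-string
# limit (131072 bytes is an assumption about Linux, not a measured value).
exceeds_arg_limit <- function(command, args, limit = 131072L) {
    command_bytes(command, args) > limit
}

# Usage, assuming `assay` holds the vector of gs:// paths:
# exceeds_arg_limit("gsutil", c("-m", "cp", assay))
```

If the check returns TRUE for the full vector but FALSE for each third of it, that would be consistent with the split-into-three workaround succeeding.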
