Large export is not re-importable (ImportError: You tried to import too many files, max. is 1000) #48

Open
moi90 opened this issue Aug 17, 2022 · 2 comments

Comments

@moi90
Contributor

moi90 commented Aug 17, 2022

This project consists of many samples (4060 to be exact):
https://ecotaxa.obs-vlfr.fr/prj/6433

When exporting with exp_type=BAK and split_by=S, the archive contains one TSV file per sample, i.e. 4060 individual TSV files.
(Judging from the UI, split_by should be ignored for a BAK export, but I consider this behavior a feature.)

However, when re-importing the same data, I get an import error:

    You tried to import too many files, max. is 1000

    File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
      self._bootstrap_inner()
    File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
      self.run()
    File "/app/BG_operations/JobScheduler.py", line 40, in run
      sce.run_in_background()
    File "/app/API_operations/helpers/JobService.py", line 73, in run_in_background
      self.do_background()
    File "/app/API_operations/imports/Import.py", line 76, in do_background
      self.do_validate()
    File "/app/API_operations/imports/Import.py", line 115, in do_validate
      how, diag, nb_rows = self._collect_existing_and_validate(source_dir_or_zip, loaded_files)
    File "/app/API_operations/imports/Import.py", line 142, in _collect_existing_and_validate
      source_bundle = InBundle(source_dir_or_zip, bundle_temp_dir)
    File "/app/BO/Bundle.py", line 54, in __init__
      one_more()
    File "/app/BO/Bundle.py", line 49, in one_more
      raise ImportError("You tried to import too many files, max. is %d" % self.MAX_FILES)
    ImportError: You tried to import too many files, max. is 1000

This limitation seems somewhat arbitrary, and I think EcoTaxa should be able to read data that it has itself emitted.
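To check an export before attempting a re-import, something like the following sketch could be used. It is not EcoTaxa code: `count_tsv_members` and `check_importable` are hypothetical helper names, and it assumes that the limit applies to the TSV files in the archive, which the error message does not state explicitly. Only the limit of 1000 and the message text are taken from the traceback above.

```python
import zipfile

MAX_FILES = 1000  # value quoted in the error message above


def count_tsv_members(archive) -> int:
    """Return the number of .tsv entries in a ZIP archive (path or file object)."""
    with zipfile.ZipFile(archive) as zf:
        return sum(1 for name in zf.namelist() if name.lower().endswith(".tsv"))


def check_importable(archive, max_files: int = MAX_FILES) -> None:
    """Raise ImportError if the archive holds more TSV files than max_files.

    Assumption: the server-side limit counts TSV files; it may count other
    files in the bundle as well.
    """
    if count_tsv_members(archive) > max_files:
        raise ImportError("You tried to import too many files, max. is %d" % max_files)
```

Calling `check_importable("export.zip")` (the file name is a placeholder) on the archive described above would be expected to raise, since it contains 4060 TSV files.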

@grololo06
Member

grololo06 commented Feb 1, 2023

Hello, the commit for this limit is linked to ecotaxa/ecotaxa_front#675, which is a legitimate attempt to protect the system from some kinds of errors. But indeed, a BAK export should be readable.

@moi90
Contributor Author

moi90 commented Feb 6, 2023

Maybe this can be resolved on the export side, then? Split files if containing more than 1000 objects? But this might break things on the user's side...
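One possible reading of this suggestion is that the exporter emits several archives, each holding at most 1000 TSV files. A minimal sketch of that batching step follows; `batches` is a hypothetical helper, the file names are made up, and nothing here reflects how the EcoTaxa exporter is actually structured.

```python
from typing import Iterator, List, Sequence

MAX_FILES = 1000  # limit quoted in the import error


def batches(names: Sequence[str], size: int = MAX_FILES) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` file names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


# 4060 per-sample TSV files, as in the project above (names are invented).
tsv_names = ["sample_%04d.tsv" % i for i in range(4060)]
parts = list(batches(tsv_names))
print([len(p) for p in parts])  # [1000, 1000, 1000, 1000, 60]
```

With this project, the export would become five archives instead of one, which is exactly the kind of change that could break existing workflows on the user's side.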

moi90 changed the title from "ImportError: You tried to import too many files, max. is 1000" to "Large export is not re-importable (ImportError: You tried to import too many files, max. is 1000)" on Feb 20, 2023