Implementation of instance-level download #113
DanielaSchacherer
commented
Aug 13, 2024
- Added support for instance-level download, as described in "Add support for download of single instances instead of whole series" #97 (comment).
- Open problem: we cannot estimate the download size or report it to the user, because we only know the sizes of whole series.
- Open problem: IDCClient.download_from_selection() in particular now contains many if-statements and performs many tasks, which makes the code hard to read. This could also be tackled later.
- Added printing configurations for the codespell hook.
We do have instance size in |
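If a per-instance size is available, as the reply above indicates, the download size for an instance-level selection could be estimated by summing only the selected instances instead of whole series. The following is a minimal sketch under that assumption; the mapping `instance_sizes_mb` and the function `estimate_download_size_mb` are hypothetical names used for illustration and are not part of idc-index.

```python
def estimate_download_size_mb(instance_sizes_mb, selected_uids):
    """Sum the sizes (in MB) of the selected instances.

    instance_sizes_mb: hypothetical dict mapping SOPInstanceUID -> size in MB.
    selected_uids: iterable of SOPInstanceUID strings requested by the user.
    Raises KeyError listing every requested UID that has no size entry.
    """
    unique_uids = set(selected_uids)  # avoid double counting repeated UIDs
    missing = sorted(uid for uid in unique_uids if uid not in instance_sizes_mb)
    if missing:
        raise KeyError(f"no size recorded for: {missing}")
    return sum(instance_sizes_mb[uid] for uid in unique_uids)


# Illustrative values only; real UIDs and sizes would come from the index.
sizes = {"1.2.3.1": 1.5, "1.2.3.2": 2.25, "1.2.3.3": 10.0}
total = estimate_download_size_mb(sizes, ["1.2.3.1", "1.2.3.2", "1.2.3.1"])
print(total)
```

Deduplicating the requested UIDs keeps the estimate from growing when the same instance is listed twice.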
idc_index/index.py
Outdated
@@ -1314,6 +1343,17 @@ def _format_size(size_MB):
return f"{round(size_GB, 2)} GB"
return f"{round(size_MB, 2)} MB"

@staticmethod
Let's combine this with the previous function!
Did that.
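One way to combine two size formatters, as suggested above, is a single helper with a flag for the input unit. The sketch below is hypothetical: the name `format_size`, the `size_in_bytes` parameter, and the decimal conversion factors (1 MB = 10^6 bytes, 1 GB = 1000 MB) are assumptions, not the code that was merged.

```python
def format_size(size, size_in_bytes=False):
    """Return a human-readable size string in MB or GB.

    size: a size in MB, or in bytes when size_in_bytes is True.
    Decimal units are assumed here (1 MB = 10**6 bytes, 1 GB = 1000 MB).
    """
    size_mb = size / 10**6 if size_in_bytes else size
    size_gb = size_mb / 1000
    if size_gb >= 1:
        return f"{round(size_gb, 2)} GB"
    return f"{round(size_mb, 2)} MB"


print(format_size(250.5))                          # size given in MB
print(format_size(1536))                           # large enough to show as GB
print(format_size(2_500_000, size_in_bytes=True))  # size given in bytes
```

Keeping the unit conversion in one place means a progress bar and a size summary can share the same formatting rule.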
eda4f82
to
6b7b41c
Compare
In the future, we can add a convenience conversion utility function that could serve the same purpose without being attached to a specific class variable.
parent aa021d5 author ds_93 <daniela.schacherer@web.de> 1723195062 +0200 committer ds_93 <daniela.schacherer@web.de> 1726760294 +0200
parent aa021d5 author ds_93 <daniela.schacherer@web.de> 1723195062 +0200 committer ds_93 <daniela.schacherer@web.de> 1726760288 +0200
- BUG: fixed codespell complaint
- BUG: fixed pylint errors
- BUG: fixed pylint errors
- BUG: fixed test
- ENH: added printing configurations for codespell added download size calculation for single instance download. Changed tqdm bar data to be displayed in GB/MB instead of bytes
- ENH: enable downloading data in manifests from previous idc versions
- ENH: add description for the previous versions index
- ENH: fix error messages for items not identified in the current version
- ENH: add clinical_index also added checks for existence of the URLs containing remote indices
- BUG: use trim to remove any extraneous spaces while parsing s3 url in manifest
- ENH: simplify s3_url extraction update to simplify to use clinical_index from idc-index clarify wording of a section
- BUG: remove notebook comitted by accident from Colab
- BUG: fix viewer series selection parameter SeriesInstanceUID changed to SeriesInstanceUIDs, see https://docs.ohif.org/migration-guide/from-3p7-to-3p8#studyinstanceuid-in-the-url-param
- ENH: simplify download and create destination directory if needed Automatic creation of the destination directory mimics the behavior of s5cmd and simplifies usage
- DOC: fix explanation of the dirTemplate parameter fixed bug fixed bug second try wip
- ENH: update to IDC v19
- ENH: upgrade idc-index-data and use query-based clinical_index removed hardcoded idc-index-data version
This reverts commit eda4f82.
This reverts commit 111c01c01ac37ed9943ea5ecf762ebbcffb96a74.
This reverts commit 585c859.
I have just a couple of questions. Let's discuss when you are back next week!
idc_index/index.py
Outdated
with open(filepath, mode="wb") as file:
    file.write(response.content)
setattr(self.__class__, index, pd.read_parquet(filepath))

# Join new index with main index
I would think that here we just need the series_aws_url column from index, no?
We would also need all the attributes necessary for building the hierarchy, but I fixed this in the respective SQL query and do not merge anything when adding a new index.
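The approach described above, pulling the hierarchy attributes through the query instead of merging data frames, might look roughly like the following. This is only a sketch: the table names `series_index` and `instance_index`, the sample rows, and the function `select_instances` are hypothetical, and the standard-library `sqlite3` module is used purely as a stand-in for whatever SQL engine the library runs its queries with.

```python
import sqlite3

# In-memory stand-in tables; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE series_index (
        SeriesInstanceUID TEXT PRIMARY KEY,
        collection_id TEXT,
        PatientID TEXT,
        StudyInstanceUID TEXT,
        Modality TEXT,
        series_aws_url TEXT
    );
    CREATE TABLE instance_index (
        SOPInstanceUID TEXT PRIMARY KEY,
        SeriesInstanceUID TEXT
    );
    INSERT INTO series_index VALUES
        ('1.2.1', 'coll_a', 'pat_1', '1.1', 'SM', 's3://bucket/series-1/*'),
        ('1.2.2', 'coll_a', 'pat_2', '1.2', 'SM', 's3://bucket/series-2/*');
    INSERT INTO instance_index VALUES
        ('1.2.1.1', '1.2.1'),
        ('1.2.1.2', '1.2.1'),
        ('1.2.2.1', '1.2.2');
    """
)


def select_instances(conn, sop_instance_uids):
    """Return one row per requested instance with the series-level
    attributes needed to build a directory hierarchy and locate the files."""
    uids = list(sop_instance_uids)
    if not uids:
        return []
    placeholders = ", ".join("?" for _ in uids)
    query = f"""
        SELECT i.SOPInstanceUID, s.collection_id, s.PatientID,
               s.StudyInstanceUID, s.SeriesInstanceUID, s.Modality,
               s.series_aws_url
        FROM instance_index AS i
        JOIN series_index AS s
          ON i.SeriesInstanceUID = s.SeriesInstanceUID
        WHERE i.SOPInstanceUID IN ({placeholders})
        ORDER BY i.SOPInstanceUID
    """
    return conn.execute(query, uids).fetchall()


rows = select_instances(conn, ["1.2.1.2", "1.2.2.1"])
for row in rows:
    print(row)
```

Doing the join inside the query leaves the stored indices untouched, so adding a new index does not require merging it into the main one.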
I am wondering why the checks passed for commit bbef810. In my opinion they should not have, and they also failed when I ran them manually.
…nload to be able to catch attributes necessary for hierarchy template
b4b4e34
to
e93bad2
Compare
@fedorov It's working again now.
Hello @vkt1414 :)
Hi @DanielaSchacherer!
Hi @vkt1414,
I'll take a look at how you are implementing this feature, and get back to you!
Hi @vkt1414, right now, it only supports what is in the sm_instance_index (which is only the current version).
24380b1
to
f19ff92
Compare
f19ff92
to
ef0101d
Compare
We do not currently support instance-level manifests as input to the command-line download tool, and we do not provide any mechanism to generate instance-level manifests from the portal. The sync operation uses series-level size to estimate progress. Supporting sync for instance-level manifests would take work that is not justified for now: I think it is safe to assume that instance-level download will only be invoked by passing SOPInstanceUID to the functions or the download tool, and with just a few instances/files, syncing them instead of copying them won't bring much benefit.
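The choice described above, copying for instance-level selections and syncing for series-level ones, can be sketched as a small helper that emits one s5cmd command line per URL. The function name `build_s5cmd_lines`, its parameters, and the example URLs are hypothetical; how idc-index actually builds its commands may differ.

```python
def build_s5cmd_lines(urls, destination, instance_level):
    """Build one s5cmd command line per source URL.

    instance_level=True  -> plain copy ("cp"), suited to a handful of files.
    instance_level=False -> "sync", which skips files already present.
    """
    verb = "cp" if instance_level else "sync"
    dest = destination.rstrip("/") + "/"  # s5cmd treats a trailing slash as a directory
    return [f"{verb} {url} {dest}" for url in urls]


# Illustrative URLs only; real ones would come from the index.
instance_lines = build_s5cmd_lines(
    ["s3://bucket/series-1/instance-a.dcm"], "./downloads", instance_level=True
)
series_lines = build_s5cmd_lines(
    ["s3://bucket/series-1/*"], "./downloads/", instance_level=False
)
print(instance_lines)
print(series_lines)
```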
9cafab9
to
5af1112
Compare