RE2022-209: workspace uploader (working script) #381

Merged on Aug 17, 2023 (97 commits)
Commits
24ce4ad
test uploader
Xiangs18 Jun 30, 2023
366376e
update params
Xiangs18 Jun 30, 2023
534d176
update assembly name
Xiangs18 Jun 30, 2023
fa9ee86
get_assembly_as_fasta before upload
Xiangs18 Jun 30, 2023
ec3be6e
testing
Xiangs18 Jun 30, 2023
b822ef8
restructure
Xiangs18 Jun 30, 2023
201c0fb
update workspace_name
Xiangs18 Jun 30, 2023
b42ba13
add .fasta
Xiangs18 Jun 30, 2023
75c3caf
test again
Xiangs18 Jun 30, 2023
a87d171
assert path exist
Xiangs18 Jul 6, 2023
b68792c
workspace name
Xiangs18 Jul 6, 2023
06af0e6
modify file path
Xiangs18 Jul 6, 2023
b57806c
update path
Xiangs18 Jul 6, 2023
aa4a22a
switch to .fna file extension
Xiangs18 Jul 6, 2023
54583d9
remove underscore
Xiangs18 Jul 6, 2023
838530d
use the latest AssemblyUtilClient.py file
Xiangs18 Jul 6, 2023
2def748
use the bulk method
Xiangs18 Jul 6, 2023
dc2e39b
add {}
Xiangs18 Jul 6, 2023
f3961eb
remove dict for path
Xiangs18 Jul 6, 2023
0b67670
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Jul 13, 2023
a6fa6e9
finish majority of script; need add vol and test
Xiangs18 Jul 14, 2023
9302c50
refactor for easier in parallel setup
Xiangs18 Jul 16, 2023
5c2bc13
initial test for workspace_uploader
Xiangs18 Jul 16, 2023
b7a7b75
update image name
Xiangs18 Jul 17, 2023
6769b67
fix incorrect env for ncbi
Xiangs18 Jul 17, 2023
1b4ea90
use Tuple[]
Xiangs18 Jul 17, 2023
32e9bae
mount work_dir
Xiangs18 Jul 17, 2023
1996a05
make message more descrptive
Xiangs18 Jul 17, 2023
5e84b3b
give it the path from the mount point inside the container
Xiangs18 Jul 17, 2023
514a609
hard link files of interest and update script
Xiangs18 Jul 19, 2023
bab7d86
add timing
Xiangs18 Jul 19, 2023
17b6dcd
rename var
Xiangs18 Jul 19, 2023
a09939c
more testing and softlink
Xiangs18 Jul 19, 2023
3e3e492
remove duplicate logic
Xiangs18 Jul 19, 2023
08a698c
add help message
Xiangs18 Jul 19, 2023
6ccdfeb
clean up
Xiangs18 Jul 19, 2023
031fe4f
keep 2 decimal places for speed
Xiangs18 Jul 19, 2023
7c4ef50
update print message
Xiangs18 Jul 19, 2023
dbeb3bd
diplay failed_paths
Xiangs18 Jul 19, 2023
f57c84b
change upa to genome_id
Xiangs18 Jul 19, 2023
074fb3e
add assemby id
Xiangs18 Jul 19, 2023
256bf2c
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Jul 20, 2023
202c3c2
add uploaded.yaml file to track uploaded assemblies
Xiangs18 Jul 21, 2023
94281b2
update logic
Xiangs18 Jul 21, 2023
9278a69
update _update_yaml_file logic with overwrite params
Xiangs18 Jul 21, 2023
4e39672
testing
Xiangs18 Jul 21, 2023
35e841f
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Jul 21, 2023
b4fa459
fix bug
Xiangs18 Jul 21, 2023
61c7bb7
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Jul 21, 2023
02cfbb9
clean up & finish
Xiangs18 Jul 21, 2023
8bbb3d0
add type hint for data
Xiangs18 Jul 21, 2023
4b663ad
1. add upa; 2. add ymal in each genome_dir; 3. address all comments
Xiangs18 Jul 26, 2023
fd41f60
remove duplicate code
Xiangs18 Jul 26, 2023
45c28f1
clean up
Xiangs18 Jul 26, 2023
35d6309
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Jul 26, 2023
2cfcfcf
add the string to the original exception
Xiangs18 Jul 27, 2023
c6891e8
decouple methods
Xiangs18 Jul 27, 2023
0959438
update comments
Xiangs18 Jul 28, 2023
500d7f4
update usage, function names, and _get_yaml_file_path
Xiangs18 Jul 31, 2023
cc46717
get upa through _upload_assembly_to_workspace fun
Xiangs18 Aug 1, 2023
bcacd14
add parallel processing
Xiangs18 Aug 1, 2023
4bf9c66
use queue for parallized uploading
Xiangs18 Aug 2, 2023
00eae81
more descriptive var names
Xiangs18 Aug 2, 2023
b5d6945
update comments
Xiangs18 Aug 2, 2023
291b6ff
move Queue and Pool inside Conf
Xiangs18 Aug 2, 2023
0863868
clean up
Xiangs18 Aug 3, 2023
54aee87
use conf.workers
Xiangs18 Aug 3, 2023
3615f82
remove unused module
Xiangs18 Aug 3, 2023
e04a6d3
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 3, 2023
500e288
1. add a temp directory that's unique per instance
Xiangs18 Aug 7, 2023
d4cb2c5
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 7, 2023
382e040
update root_dir globally
Xiangs18 Aug 7, 2023
dabbd72
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 8, 2023
dd39b7b
move Conf into a common module
Xiangs18 Aug 8, 2023
a70592f
update type hint
Xiangs18 Aug 8, 2023
2329cb9
add % of how many assemblies have been processed
Xiangs18 Aug 8, 2023
684f47c
clean up and remove extra output_dir
Xiangs18 Aug 8, 2023
1c48e38
add print message
Xiangs18 Aug 8, 2023
65490da
fix poor wording
Xiangs18 Aug 8, 2023
b251f04
use UPA.fna.gz rather than assembly name
Xiangs18 Aug 8, 2023
ebbd0a6
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 8, 2023
7fc90cc
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 8, 2023
1346dbd
include uuid as part of job_dir and testing
Xiangs18 Aug 11, 2023
90f93bb
add default for conf
Xiangs18 Aug 12, 2023
959fbe5
move workdir/tmp into conf
Xiangs18 Aug 14, 2023
c04c9b3
pass in conf to get data_dir
Xiangs18 Aug 14, 2023
db68011
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 14, 2023
18f8c67
add pydoc
Xiangs18 Aug 14, 2023
897c6b5
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 15, 2023
14ba71d
fix docs for conf class
Xiangs18 Aug 16, 2023
5e06173
update make_job_data_dir function with gavin's comment
Xiangs18 Aug 16, 2023
851d3e5
update to <job_dir>/workdir
Xiangs18 Aug 16, 2023
fa76837
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 16, 2023
1f282ef
update output type hint
Xiangs18 Aug 16, 2023
b767bb1
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 16, 2023
118a8a2
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 17, 2023
ad1e29d
Merge branch 'main' into dev-worksapce_uploader
Xiangs18 Aug 17, 2023
271 changes: 177 additions & 94 deletions src/clients/AssemblyUtilClient.py

Large diffs are not rendered by default.

13 changes: 12 additions & 1 deletion src/loaders/common/loader_common_names.py
@@ -5,6 +5,11 @@

# Arguments Descriptions

# Name for root directory argument
ROOT_DIR_ARG_NAME = "root_dir"
# Description of the --root_dir argument in various loader programs.
ROOT_DIR_DESCR = "Root directory for the collections project."

# Name for load version argument
LOAD_VER_ARG_NAME = "load_ver"
# Description of the --load_ver argument in various loader programs.
@@ -33,6 +38,7 @@
"""
File structure at NERSC for loader programs
"""
WS = "WS" # workspace

ROOT_DIR = (
"/global/cfs/cdirs/kbase/collections" # root directory for the collections project
@@ -71,7 +77,7 @@
# JSON keys in the download metadata file in a download directory
SOURCE_METADATA_FILE_KEYS = ["upa", "name", "type", "timestamp"]
# callback server docker image name
CALLBACK_IMAGE_NAME = "scanon/callback"
CALLBACK_IMAGE_NAME = "kbase/callback:test"  # TODO switch to kbase/callback:latest

# a list of IDs provided to the computation script
DATA_ID_COLUMN_HEADER = "genome_id" # TODO DATA_ID change to data ID for generality
@@ -120,3 +126,8 @@
# TODO DOWNLOAD if we settle on a standard file name scheme for downloaders we can get
# rid of this
STANDARD_FILE_EXCLUDE_SUBSTRINGS = ['cds_from', 'rna_from', 'ERR']

KB_BASE_URL_MAP = {'CI': 'https://ci.kbase.us/services/',
'NEXT': 'https://next.kbase.us/services/',
'APPDEV': 'https://appdev.kbase.us/services/',
'PROD': 'https://kbase.us/services/'}
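As a rough illustration of how a map like `KB_BASE_URL_MAP` might be consumed, the sketch below joins an environment's base URL with a service path. The helper name `service_url` and the `ws` path segment are assumptions made for this example and are not part of the PR.

```python
# Copy of the map added in this PR.
KB_BASE_URL_MAP = {
    'CI': 'https://ci.kbase.us/services/',
    'NEXT': 'https://next.kbase.us/services/',
    'APPDEV': 'https://appdev.kbase.us/services/',
    'PROD': 'https://kbase.us/services/',
}


def service_url(env: str, service: str) -> str:
    """Hypothetical helper: build a service URL for a KBase environment."""
    try:
        base = KB_BASE_URL_MAP[env]
    except KeyError:
        raise ValueError(
            f"Unknown environment {env!r}; expected one of {sorted(KB_BASE_URL_MAP)}"
        )
    # The base URLs end with '/', so strip any leading '/' from the service path.
    return base + service.lstrip('/')


print(service_url('CI', 'ws'))
```

Keeping the trailing slash on every base URL means callers only ever append a relative path, which is presumably why the map is written that way.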
155 changes: 121 additions & 34 deletions src/loaders/common/loader_helper.py
@@ -1,9 +1,12 @@
import argparse
import itertools
import json
import os
import socket
import stat
import subprocess
import time
import uuid
from collections import defaultdict
from contextlib import closing
from pathlib import Path
@@ -13,15 +16,18 @@

import src.common.storage.collection_and_field_names as names
from src.common.storage.db_doc_conversions import collection_data_id_key
from src.loaders.common import loader_common_names
from src.loaders.common.loader_common_names import (
COLLECTION_SOURCE_DIR,
DOCKER_HOST,
FATAL_ERROR,
FATAL_STACKTRACE,
FATAL_TOOL,
IMPORT_DIR,
KB_AUTH_TOKEN,
SDK_JOB_DIR,
SOURCE_DATA_DIR,
SOURCE_METADATA_FILE_KEYS,
WS,
)

"""
@@ -144,7 +150,7 @@

def start_podman_service(uid: int):
"""
Start podman service. Used by workspace_downloader.py script.
Start podman service. Used by workspace_downloader.py and workspace_uploader.py scripts.

uid - the integer unix user ID of the user running the service.
"""
@@ -154,7 +160,8 @@
time.sleep(1)
return_code = proc.poll()
if return_code:
raise ValueError(f"The command {command} failed with return code {return_code}")
raise ValueError(f"The command {command} failed with return code {return_code}. "
f"Podman service failed to start")
os.environ["DOCKER_HOST"] = DOCKER_HOST.format(uid)
return proc
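The start-then-poll pattern visible in this hunk can be sketched in isolation as follows. This is a minimal sketch, not the PR's function: `start_service` and its `wait_seconds` parameter are hypothetical names, and the command is supplied by the caller rather than being a podman invocation.

```python
import subprocess
import sys
import time


def start_service(command: list[str], wait_seconds: float = 1.0) -> subprocess.Popen:
    """Start a long-running process and fail fast if it exits with an error."""
    proc = subprocess.Popen(command)
    time.sleep(wait_seconds)   # give the process a moment to fail
    return_code = proc.poll()  # None while still running, otherwise the exit status
    if return_code:
        raise ValueError(f"The command {command} failed with return code {return_code}")
    return proc


# Usage: a child that sleeps is still running after the wait, so no error is raised.
proc = start_service([sys.executable, "-c", "import time; time.sleep(30)"], wait_seconds=0.5)
proc.terminate()
proc.wait()
```

Note that `if return_code:` is false for both `None` (still running) and `0` (exited cleanly), so a service that exits successfully within the wait window is not reported as a failure.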

Expand All @@ -179,69 +186,149 @@
return True


def make_collection_source_dir(
root_dir: str,
env: str,
collection: str,
source_ver: str
) -> str:
def make_job_dir(root_dir, username):
"""Helper function that creates a job_dir for a user under the root directory."""
job_dir = os.path.join(root_dir, SDK_JOB_DIR, username, uuid.uuid4().hex)
os.makedirs(job_dir, exist_ok=True)
# only user can read, write, or execute
os.chmod(job_dir, stat.S_IRWXU)
return job_dir
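To make the effect of `os.chmod(job_dir, stat.S_IRWXU)` concrete, here is a small sketch that creates a uniquely named directory and reads back its permission bits. `make_private_dir` is a hypothetical stand-in that omits the `SDK_JOB_DIR` and username path components, and a temporary directory replaces the real root.

```python
import os
import stat
import tempfile
import uuid


def make_private_dir(parent: str) -> str:
    """Create a uniquely named directory that only its owner can access."""
    path = os.path.join(parent, uuid.uuid4().hex)
    os.makedirs(path, exist_ok=True)
    os.chmod(path, stat.S_IRWXU)  # owner may read, write, and execute; nobody else
    return path


with tempfile.TemporaryDirectory() as parent:
    job_dir = make_private_dir(parent)
    mode = stat.S_IMODE(os.stat(job_dir).st_mode)
    print(job_dir, oct(mode))
```

On a POSIX filesystem the reported mode should be `0o700`. Because the last path component is a fresh `uuid4` hex string, two runs for the same user should not collide.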



def make_job_data_dir(job_dir):
"""
Helper function that creates a temporary directory for sharing files between the host, callback server, and container.

SDK modules (like AssemblyUtil) have the shared directory mounted in the container at `/kb/module/work`. The
scratch directory provided to the SDK module `*Impl.py` code is `/kb/module/work/tmp`. The SDK code is expected
to read and write shared files there.

The callback server mounts `<job_dir>/workdir` as the host shared directory into the SDK module.

`<job_dir>` is also mounted into the callback server, which writes job information (e.g. the token and job
configuration) into `<job_dir>/workdir`.
"""
data_dir = os.path.join(job_dir, "workdir/tmp")
os.makedirs(data_dir)
return data_dir
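The docstring describes two views of the same files: `<job_dir>/workdir` on the host and `/kb/module/work` inside the SDK module container. A minimal sketch of translating between them is shown below; `to_container_path` is a hypothetical helper for illustration and does not appear in the PR.

```python
import os
import posixpath

# Mount point of the shared directory inside an SDK module container,
# as stated in the make_job_data_dir docstring.
CONTAINER_WORK_DIR = "/kb/module/work"


def to_container_path(job_dir: str, host_path: str) -> str:
    """Hypothetical helper: map a host path under <job_dir>/workdir to its in-container path."""
    host_work_dir = os.path.join(job_dir, "workdir")
    rel = os.path.relpath(host_path, host_work_dir)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError(f"{host_path} is not under {host_work_dir}")
    # Container paths are POSIX paths regardless of the host platform.
    return posixpath.normpath(posixpath.join(CONTAINER_WORK_DIR, *rel.split(os.sep)))


print(to_container_path("/jobs/abc", "/jobs/abc/workdir/tmp/assembly.fna"))
```

Under this mapping, a file written to the directory returned by `make_job_data_dir` would be visible to the module under `/kb/module/work/tmp`, which matches the scratch directory named in the docstring.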



def make_sourcedata_ws_dir(root_dir, env, workspace_id):
"""Helper function that creates an output directory for a specific workspace ID under the root directory."""
output_dir = os.path.join(root_dir, SOURCE_DATA_DIR, WS, env, str(workspace_id))
os.makedirs(output_dir, exist_ok=True)
return output_dir


def make_collection_source_dir(root_dir: str, env: str, collection: str, source_ver: str) -> str:
"""
Helper function that creates a collection and source version directory, to which
data for that collection is linked from the overall source data dir.
"""
csd = os.path.join(root_dir, loader_common_names.COLLECTION_SOURCE_DIR, env, collection, source_ver)
os.makedirs(csd, exist_ok=True)
return csd
collection_source_dir = os.path.join(root_dir, COLLECTION_SOURCE_DIR, env, collection, source_ver)
os.makedirs(collection_source_dir, exist_ok=True)
return collection_source_dir


def create_softlinks_in_csd(csd: str, work_dir: str, genome_ids: list[str], taxonomy_files: list[str] = None) -> None:
def create_softlinks_in_collection_source_dir(
collection_source_dir: str,
work_dir: str,
genome_ids: list[str],
taxonomy_files: list[str] = None
) -> None:
"""
Create softlinks in the collection source dir to the genome files in the work dir.
"""
if not taxonomy_files:
taxonomy_files = []

for genome_id in genome_ids:
genome_dir = os.path.join(work_dir, genome_id)
csd_genome_dir = os.path.join(csd, genome_id)
create_softlink_between_dirs(csd_genome_dir, genome_dir)
target_dir = os.path.join(work_dir, genome_id)
new_dir = os.path.join(collection_source_dir, genome_id)
create_softlink_between_dirs(new_dir, target_dir)

for taxonomy_file in taxonomy_files:
csd_file = os.path.join(csd, taxonomy_file)
sd_file = os.path.join(work_dir, taxonomy_file)
create_softlink_between_files(csd_file, sd_file)
new_file = os.path.join(collection_source_dir, taxonomy_file)
target_file = os.path.join(work_dir, taxonomy_file)
create_softlink_between_files(new_file, target_file)

print(f"Genome files in {csd} \nnow link to {work_dir}")
print(f"Genome files in {collection_source_dir} \nnow link to {work_dir}")


def create_softlink_between_dirs(csd_dir, sd_dir):
def create_softlink_between_dirs(new_dir, target_dir):
"""
Creates a softlink between two directories.
Creates a softlink from new_dir to the contents of target_dir.
"""
if os.path.exists(csd_dir):
if os.path.exists(new_dir):
if (
os.path.isdir(csd_dir)
and os.path.islink(csd_dir)
and os.readlink(csd_dir) == sd_dir
os.path.isdir(new_dir)
and os.path.islink(new_dir)
and os.readlink(new_dir) == target_dir
):
return
raise ValueError(
f"{csd_dir} already exists and does not link to {sd_dir} as expected"
f"{new_dir} already exists and does not link to {target_dir} as expected"
)
os.symlink(target_dir, new_dir, target_is_directory=True)
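The idempotent behaviour of the directory softlink helper can be exercised with a short, self-contained sketch. `link_dir` below mirrors the logic shown in the diff under a different, hypothetical name, and the `genome_1` / `genome_2` directories are made up for the demonstration.

```python
import os
import tempfile


def link_dir(new_dir: str, target_dir: str) -> None:
    """Create new_dir as a symlink to target_dir; do nothing if it already is one."""
    if os.path.exists(new_dir):
        if (
            os.path.isdir(new_dir)
            and os.path.islink(new_dir)
            and os.readlink(new_dir) == target_dir
        ):
            return
        raise ValueError(f"{new_dir} already exists and does not link to {target_dir} as expected")
    os.symlink(target_dir, new_dir, target_is_directory=True)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "work", "genome_1")
    other = os.path.join(root, "work", "genome_2")
    os.makedirs(target)
    os.makedirs(other)
    link = os.path.join(root, "genome_1")

    link_dir(link, target)
    link_dir(link, target)     # repeat call with the same target is a no-op
    try:
        link_dir(link, other)  # same link name, different target
    except ValueError as err:
        print(err)
```

One caveat worth keeping in mind: `os.path.exists` follows symlinks, so a dangling link at `new_dir` is reported as missing and the subsequent `os.symlink` call raises `FileExistsError` rather than the `ValueError` above.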



def create_softlink_between_files(new_file, target_file):
"""
Creates a softlink from new_file to the contents of target_file.
"""
if os.path.exists(new_file):
if (os.path.islink(new_file) and os.readlink(new_file) == target_file):
return
raise ValueError(
f"{new_file} already exists and does not link to {target_file} as expected"
)
os.symlink(sd_dir, csd_dir, target_is_directory=True)
os.symlink(target_file, new_file)


def create_softlink_between_files(csd_file, sd_file):
def create_hardlink_between_files(new_file, target_file):
"""
Creates a softlink between two files.
Creates a hardlink from new_file to the contents of target_file.
"""
if os.path.exists(csd_file):
if (os.path.islink(csd_file) and os.readlink(csd_file) == sd_file):
if os.path.exists(new_file):
if os.path.samefile(target_file, new_file):
return
raise ValueError(
f"{csd_file} already exists and does not link to {sd_file} as expected"
f"{new_file} already exists and does not link to {target_file} as expected"
)
os.symlink(sd_file, csd_file)
os.link(target_file, new_file)
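The hardlink helper checks for an existing link with `os.path.samefile` instead of `os.readlink`, because a hard link is just a second directory entry for the same file and has no link target to read. The sketch below shows this on a throwaway file; the file names are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    target_file = os.path.join(root, "assembly.fna")
    new_file = os.path.join(root, "linked.fna")
    with open(target_file, "w") as f:
        f.write(">seq1\nACGT\n")

    os.link(target_file, new_file)  # hard link: a second name for the same file

    same = os.path.samefile(target_file, new_file)  # both names refer to one file
    is_symlink = os.path.islink(new_file)           # a hard link is not a symlink
    link_count = os.stat(target_file).st_nlink      # number of names for the file
    print(same, is_symlink, link_count)
```

On a typical POSIX filesystem this should report `True False 2`. Hard links only work within a single filesystem, which is worth remembering if the job directory and the source data live on different mounts.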



def list_objects(wsid, conf, object_type, include_metadata=False, batch_size=10000):
"""
List information for all objects in a workspace, given the workspace ID.
"""
if batch_size > 10000:
raise ValueError("Maximum value for listing workspace objects is 10000")

maxObjectID = conf.ws.get_workspace_info({"id": wsid})[4]
batch_input = [
[idx + 1, idx + batch_size] for idx in range(0, maxObjectID, batch_size)
]
objs = [
conf.ws.list_objects(
_list_objects_params(wsid, min_id, max_id, object_type, include_metadata)
)
for min_id, max_id in batch_input
]
res_objs = list(itertools.chain.from_iterable(objs))
return res_objs
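The batching arithmetic in `list_objects` is easier to follow with concrete numbers. The sketch below isolates the range computation in a hypothetical `id_batches` function and leaves out the workspace calls entirely.

```python
def id_batches(max_object_id: int, batch_size: int = 10000) -> list[list[int]]:
    """Split the object ID range 1..max_object_id into inclusive [min_id, max_id] pairs."""
    if batch_size > 10000:
        raise ValueError("Maximum value for listing workspace objects is 10000")
    return [[idx + 1, idx + batch_size] for idx in range(0, max_object_id, batch_size)]


print(id_batches(25000))
# [[1, 10000], [10001, 20000], [20001, 30000]]
```

The upper bound of the last pair can exceed the largest object ID. The sketch assumes, as the PR code appears to, that the workspace treats `maxObjectID` as an inclusive upper limit, so the overshoot is harmless. Each pair then becomes one `list_objects` call and the per-batch results are flattened with `itertools.chain.from_iterable`.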



def _list_objects_params(wsid, min_id, max_id, type_str, include_metadata):
"""Helper function that creates params needed for list_objects function."""
params = {
"ids": [wsid],
"minObjectID": min_id,
"maxObjectID": max_id,
"type": type_str,
"includeMetadata": int(include_metadata),
}
return params


def get_ip():
14 changes: 9 additions & 5 deletions src/loaders/compute_tools/tool_common.py
@@ -91,7 +91,8 @@
(default: PROD)
--load_ver LOAD_VER KBase load version (e.g. r207.kbase.1). (defaults to
the source version)
--root_dir ROOT_DIR Root directory.
--root_dir ROOT_DIR Root directory for the collections project.
(default: /global/cfs/cdirs/kbase/collections)
--threads THREADS Total number of threads used by the script. (default:
half of system cpu count)
--program_threads PROGRAM_THREADS
@@ -127,11 +128,12 @@
kbase_collection = getattr(args, loader_common_names.KBASE_COLLECTION_ARG_NAME)
source_ver = getattr(args, loader_common_names.SOURCE_VER_ARG_NAME)
load_ver = getattr(args, loader_common_names.LOAD_VER_ARG_NAME)
root_dir = getattr(args, loader_common_names.ROOT_DIR_ARG_NAME)
if not load_ver:
load_ver = source_ver

self._allow_missing_files = kbase_collection in _IGNORE_MISSING_FILES_COLLECTIONS
self._source_data_dir = Path(args.root_dir,
self._source_data_dir = Path(root_dir,
loader_common_names.COLLECTION_SOURCE_DIR,
env,
kbase_collection,
@@ -149,7 +151,7 @@
self._threads = max(1, self._threads)

self._work_dir = Path(
Path(args.root_dir),
Path(root_dir),
loader_common_names.COLLECTION_DATA_DIR,
env,
kbase_collection,
@@ -186,9 +188,11 @@
f'--{loader_common_names.LOAD_VER_ARG_NAME}', type=str,
help=loader_common_names.LOAD_VER_DESCR + ' (defaults to the source version)'
)

optional.add_argument(
'--root_dir', type=str, default=loader_common_names.ROOT_DIR, help='Root directory.'
f'--{loader_common_names.ROOT_DIR_ARG_NAME}',
type=str,
default=loader_common_names.ROOT_DIR,
help=f'{loader_common_names.ROOT_DIR_DESCR} (default: {loader_common_names.ROOT_DIR})'
)
optional.add_argument(
'--threads', type=int,
12 changes: 8 additions & 4 deletions src/loaders/genome_collection/compute_genome_taxa_count.py
@@ -31,7 +31,7 @@
optional arguments:
--env {CI,NEXT,APPDEV,PROD,NONE}
Environment containing the data to be processed. (default: PROD)
--root_dir ROOT_DIR Root directory for the collections project (default: /global/cfs/cdirs/kbase/collections)
--root_dir ROOT_DIR Root directory for the collections project. (default: /global/cfs/cdirs/kbase/collections)
--input_source {GTDB,genome_attributes}
Input file source

@@ -122,15 +122,19 @@ def main():
default='PROD',
help="Environment containing the data to be processed. (default: PROD)",
)
optional.add_argument('--root_dir', type=str, default=loader_common_names.ROOT_DIR,
help=f'Root directory for the collections project (default: {loader_common_names.ROOT_DIR})')
optional.add_argument(
f'--{loader_common_names.ROOT_DIR_ARG_NAME}',
type=str,
default=loader_common_names.ROOT_DIR,
help=f'{loader_common_names.ROOT_DIR_DESCR} (default: {loader_common_names.ROOT_DIR})'
)

optional.add_argument('--input_source', type=str, choices=VALID_SOURCE, default='GTDB',
help='Input file source')

args = parser.parse_args()
load_files = args.load_files
root_dir = args.root_dir
root_dir = getattr(args, loader_common_names.ROOT_DIR_ARG_NAME)
load_version = getattr(args, loader_common_names.LOAD_VER_ARG_NAME)
kbase_collection = getattr(args, loader_common_names.KBASE_COLLECTION_ARG_NAME)
env = getattr(args, loader_common_names.ENV_ARG_NAME)
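The recurring change across these loader scripts is to define the `--root_dir` flag from shared constants and read it back with `getattr`, so the flag name lives in one place. A stripped-down sketch of that pattern follows; the constants are copied locally so the snippet runs on its own, and the `/tmp/collections` value is only an illustrative argument.

```python
import argparse

# Local copies of the constants from loader_common_names.py.
ROOT_DIR_ARG_NAME = "root_dir"
ROOT_DIR_DESCR = "Root directory for the collections project."
ROOT_DIR = "/global/cfs/cdirs/kbase/collections"

parser = argparse.ArgumentParser()
parser.add_argument(
    f"--{ROOT_DIR_ARG_NAME}",
    type=str,
    default=ROOT_DIR,
    help=f"{ROOT_DIR_DESCR} (default: {ROOT_DIR})",
)

# Passing an explicit list keeps the sketch independent of sys.argv.
args = parser.parse_args(["--root_dir", "/tmp/collections"])
root_dir = getattr(args, ROOT_DIR_ARG_NAME)
print(root_dir)
```

Using `getattr(args, ROOT_DIR_ARG_NAME)` rather than `args.root_dir` means a future rename of the flag only requires changing the constant.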
10 changes: 7 additions & 3 deletions src/loaders/genome_collection/parse_tool_results.py
@@ -843,8 +843,12 @@
help=f'Extract results from tools. '
f'(default: retrieve all available sub-directories in the '
f'[{loader_common_names.LOAD_VER_ARG_NAME}] directory)')
optional.add_argument('--root_dir', type=str, default=loader_common_names.ROOT_DIR,
help=f'Root directory for the collections project. (default: {loader_common_names.ROOT_DIR})')
optional.add_argument(
f'--{loader_common_names.ROOT_DIR_ARG_NAME}',
type=str,
default=loader_common_names.ROOT_DIR,
help=f'{loader_common_names.ROOT_DIR_DESCR} (default: {loader_common_names.ROOT_DIR})'
)
optional.add_argument('--check_genome', action="store_true",
help='Ensure a corresponding genome exists for every assembly')
optional.add_argument(
@@ -859,9 +863,9 @@
kbase_collection = getattr(args, loader_common_names.KBASE_COLLECTION_ARG_NAME)
source_ver = getattr(args, loader_common_names.SOURCE_VER_ARG_NAME)
load_ver = getattr(args, loader_common_names.LOAD_VER_ARG_NAME)
root_dir = getattr(args, loader_common_names.ROOT_DIR_ARG_NAME)
if not load_ver:
load_ver = source_ver
root_dir = args.root_dir
check_genome = args.check_genome

if not args.skip_retrieve_sample: