Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve dmod.client and update CLI #424

Merged

Conversation

robertbartel
Copy link
Contributor

General improvements to the CLI, as well as updates to bring in line with more recent DMOD updates to serialization, communication client classes, and user config representation.

Note that this PR depends on #409 and should remain in draft status until that is merged.

@robertbartel robertbartel added enhancement New feature or request maas MaaS Workstream labels Sep 8, 2023
@robertbartel robertbartel force-pushed the f/cli_updates/client_pkg_improvements branch from 6f55b6d to 8d29ffe Compare September 8, 2023 20:34
@robertbartel robertbartel force-pushed the f/cli_updates/client_pkg_improvements branch from 8d29ffe to 5caf4a3 Compare September 12, 2023 18:23
@robertbartel robertbartel marked this pull request as ready for review September 12, 2023 18:23
Copy link
Member

@aaraney aaraney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @robertbartel! I have a few fairly minor comments that more or less boil down to API parameter choices. There is one bug that will need to be addressed but other than that, this should be pretty easy to get through. Thanks!

python/lib/client/dmod/client/dmod_client.py Show resolved Hide resolved
python/lib/client/dmod/client/dmod_client.py Show resolved Hide resolved
python/lib/client/dmod/client/dmod_client.py Show resolved Hide resolved
async def upload_to_dataset(self, dataset_name: str, paths: List[Path]) -> bool:
chunk_size = 1024

with source.open('r') as file:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to open this in reading bytes, right? Since we dont know what kind of item is being uploaded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ... yeah, I believe you are correct, and as such we should do similarly for writes as you suggested above. But this actually becomes a larger undertaking. All the links in the chain will need to be modified to accommodate, from DataTransmitMessage.data to the declaration and implementations of _DatasetManager.add_data()` (and associated helper methods).

It probably makes sense to do this as part of #432. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping again here also. This probably relates to the earlier item and my old follow-up questions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree. This is a larger undertaking and not directly related so should be in its own PR.

I opened #486 to track changing DataTransmitMessage's data field to a more appropriate type. Im not sure if there are other models off the top of my head that will also need to be changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't occur to me back in September, but (unless I'm missing something) we can't actually do this because it will break JSON serialization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeahhhhh, you are right. I forgot about that. Does that mean you can only upload and download unicode encoded data at this moment? I don't recall seeing any logic to base64 / base85 encode binary data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a sanity check, if we decode and then encode binary data using the "wrong" encoding, but it's the same encoding on both ends, won't we get the same data we started with?

Assuming more discussion and work is needed, I suggest that we defer this problem. A solution will probably be complex, and we can find workarounds as needed to get datasets into the system (worst case: temporarily opening up MinIO externally and uploading directly).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a sanity check, if we decode and then encode binary data using the "wrong" encoding, but it's the same encoding on both ends, won't we get the same data we started with?

Yeah! You are right. I may be blowing this out of proportion and it might not be a problem after all. I thought to transmit any data to and from the request service (and therefore the data service) it must be in json and therefore in some unicode flavor (e.g. utf-8). So, for example if I wanted to upload a netCDF forcing file to a dataset, I think I would first need to encode and chunk the data using base64 or base85, package that up in json, and send it along, right? I agree that we should move this discussion elsewhere and just focus on getting the PR through lol. Sorry to trying to derail us 😅!

python/lib/client/dmod/client/request_clients.py Outdated Show resolved Hide resolved
python/lib/client/dmod/client/request_clients.py Outdated Show resolved Hide resolved
"""

def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
#super().__init__(*args, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: could be removed.

@robertbartel
Copy link
Contributor Author

Trying to get this one moving again after it has sat for a while. I think there are some outstanding follow-up questions, and possible tasks resulting from those discussions, but everything else is settled. @aaraney, when you are able, can you confirm all the other conversations can be marked resolved?

@aaraney
Copy link
Member

aaraney commented Jan 23, 2024

@robertbartel, thanks for revitalizing this! I'd forgotten about it and let it get stale. I've resolved relevant comments and contributed to existing conversation to move this forward. We are getting really close, just a few minor things.

aaraney
aaraney previously approved these changes Jan 24, 2024
Replacing ngen-job-specific, deprecated client classes with more general
JobClient.
robertbartel and others added 24 commits January 24, 2024 14:55
Adding more help detail for various subcommands, and renaming some
subcommands to be more intuitive.
Replacing pyyaml with Pydantic for ClientConfig changes, and bumping to
latest communication package version.
Fixing bug with misnamed initial variable usage for setting up
first message to send before look-ahead sending loop.
Having usage wrap raised exceptions and raise new ones that include the
message of the former, rather than crudely pass the name of the user
function to DataServiceClient._process_request().
Rename get_dataset_items method in DataServiceClient to
get_dataset_item_names.
Fixing mismatched strings between CLI prepared args and supported
action strings in DmodClient data_service_action() func.
Commenting out seemingly unneeded SchedulerClient init in RequestService.
Ensuring ConnectionContextClient async_recv() uses an async context
manager.
Adding support to package __main__ routine for turning on remote
debugging when properly configured.
Re-add DEFAULT_CLIENT_CONFIG_BASENAME definition (that was still being
used) and refactor an error message.
Updating Dockerfile for app_server image to create a client config file
using the new JSON syntax, complete with optional debugging config if
env is set appropriately.
Fixing methods involved in making requests and processing responses to
support collections of response types, rather than only a single, to
account for scenarios like data transfer.
Updating classes to properly use async context management in transport
client when appropriate.
@robertbartel
Copy link
Contributor Author

The only thing different was manually resolving a conflict for a version bump to the communication package, and the result of that ended up the same. As such, I'm going to bypass waiting for another approval this time.

@robertbartel robertbartel merged commit 406a9e5 into NOAA-OWP:master Jan 24, 2024
0 of 4 checks passed
@robertbartel robertbartel deleted the f/cli_updates/client_pkg_improvements branch January 24, 2024 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request maas MaaS Workstream
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants