-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comand-Line Interface for Retrieving Data #1366
base: main
Are you sure you want to change the base?
Comand-Line Interface for Retrieving Data #1366
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @virio-andreyana! As discussed, that is a really crucial part and addressing it would very much improve usability. Personally, I like the concept you suggest. Added some comments on the implementation.
Given the fact that the PR is addressing the major usability issue, it would be great to ensure that we don't miss anything important. @energyLS @hazemakhalek @davide-f may you have any comments on the solution proposed in this PR?
# Command Line Interface | ||
- rich | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure we understand everything properly, a couple of technical questions:
- by any chance, are you aware if
rich
his major magic capabilities under all popular operation system, right? [Also, would be great to if it could work also in VSCode terminal] - it can be installed also with
conda
, not onlypip
which seem to be a recommended installation approach but which can lead to problem when creating the virtual environment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I've tested in on Linux, VSCode, Windows terminal and cmd and even the coloring scheme is still there.
- I don't really know how much difference does using conda or pip make. You can change that based on your preference. What's important is that it has minimal dependencies, so it won't impact other packages.
# -*- coding: utf-8 -*- | ||
# SPDX-FileCopyrightText: PyPSA-Earth and PyPSA-Eur Authors | ||
# | ||
# SPDX-License-Identifier: AGPL-3.0-or-later | ||
import textwrap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely support a modular approach: we have a single helpers
file to store technical functions but it's definitely worth to use a more differentiated approach. I'd suggest though to keep helpers
as a prefix for consistency. What do you think?
scripts/_cli.py
Outdated
configfile = ["config.default.yaml", "configs/bundle_config.yaml", "config.yaml"] | ||
|
||
config = {} | ||
for c in configfile: | ||
if os.path.isfile(c): | ||
with open(c) as file: | ||
config_append = yaml.safe_load(file) | ||
|
||
config.update(config_append) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this fragment we are duplicating configuration management done in Snakefile: those lines create a global variable config
consequently merging the dictionaries from yaml files. Can we use this variable directly also here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workaround is using the _helper
function mock_snakemake("retrieve_databundle_light")
. That way you can get the config from Snakefile. But will there be any consequences from this?
def console_markdown(markdown, lvl=1): | ||
|
||
console = Console() | ||
for i in range(lvl): | ||
markdown = textwrap.dedent(markdown) | ||
md = Markdown(markdown) | ||
|
||
return console.print(md) | ||
|
||
|
||
def console_table(dataframe, table_kw={}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the whole this part (up to the line 146) is planned to be moved into _cli
file, right? [I recognise that it's a draft, so apologies for a potentially pre-mature comment!]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've intentionally placed it inside _cli
because its command line related things, or should I place this somewhere else like in _helper
?
scripts/_cli.py
Outdated
Options: | ||
|
||
- **check**: update the Databundle Checklist to see if the file is included | ||
- **all**: retrieve all missing databundles, namely **{", ".join(missing_bundles)}** | ||
- **rerun**: retrieve all databundles again, namely **{", ".join(bundles_to_download)}** | ||
- **bundle_...**: retrieve the selected databundles, can be more than one | ||
- Press **ENTER** to end this loop | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A love the idea of having such a report! May it be a good idea to try make the message more human-readable? The intended audience of this script are people who are starting to use the model and are not aware of our jargon 🙂 Though, that is just a detail which will be probably more relevant when polishing the implementation
When a cutout is being used, we are also checking that it covers the whole requested area with check_cutout_match. Btw, you have made me recognise that it would be good to add this check also in others scripts where a cutout is being used. Though, it's another task.
Not sure there are any substantial issues with retrieval OSM and costs data. Also, the toolset we are using for that is quite different, so working of these parts is definitely worth dedicated PRs. On my side, I'd say that it would be great to improve further the way how the console output is being managed. We still have a bit too much of irrelevant infos and warnings in the console which should be cleaned, while it may be also a good idea to add some colors to the meaningful console output (e.g. solution status, the objective values). But that is also a bit different story. |
Hmm, On the other hand |
Closes #1361
Changes proposed in this Pull Request
Users have an option alternative to
retrieve_databundle_light
to download the databundle through a command-line interface (CLI). To do that, write:For example. Here we are missing some data in
bundle_data_earth
andbundle_hydrobasins
.Once the option is selected, the bundle list will pas through the
retrieve_databundle_light
to retrieve that specific file. No need to interact with snakemake for this.Using the python package
rich
, you can make good looking command line visualization, suitable for computer clusters. You have to install python packagerich
first for this to work.Notes:
download_osm_data
andretrieve_cost_data
?Checklist
envs/environment.yaml
anddoc/requirements.txt
.config.default.yaml
andconfig.tutorial.yaml
.test/
(note tests are changing the config.tutorial.yaml)doc/configtables/*.csv
and line references are adjusted indoc/configuration.rst
anddoc/tutorial.rst
.doc/release_notes.rst
is amended in the format of previous release notes, including reference to the requested PR.