We follow the PEP8 style guide for Python. Docstrings follow PEP257 and use Sphinx style.
We use the following tools in the continuous integration to ensure compliance with PEP8, consistent formatting, and readable and error-free code:
- Flake8 tool to ensure compliance with PEP8
- Black code formatter to ensure consistent code formatting
- Pylint static code analyser to find bad code practices or errors
Note that while the entire GRASS code base is Black formatted, full compliance with PEP8, Flake8, and Pylint practices is still a work in progress.
See the pre-commit instructions for setup and usage, which simplify running these checks.
Use Black to format files:
black {source_file_or_directory}
Black is configured via pyproject.toml. The line length is set to 88 characters.
Use Flake8 to check formatting and basic issues in all files:
flake8 python_file.py
The root directory contains a .flake8 configuration file with a less strict configuration for legacy code. Flake8 uses it by default when run within the GRASS source code. For new files, you can use the default configuration:
flake8 --isolated --max-line-length=88 {path_to_python_file}
For specific, temporary issues, you can explicitly specify which errors or warnings to ignore:
flake8 --ignore=E203,E266,E501 --max-line-length=88 python_file.py
C and C++ code is formatted with ClangFormat. Contributions are expected to be formatted with clang-format (currently with version 18+). The most convenient method to install clang-format and format files is using pre-commit.
Alternatively, run a separately installed clang-format on modified files:
clang-format -i <new_or_modified_file.c>
The ClangFormat settings for the repo are defined in .clang-format.
If using pre-commit is not an option, for whatever reason, there is a helper script grass_clang_format.sh, which simplifies bulk reformatting.
In general, headers should be included in the order:
- Core system headers (stdio.h, ctype.h, ...)
- Headers for non-core system components (X11, libraries).
- GRASS headers (grass/gis.h, grass/glocale.h, ...)
- Headers for the specific library/program (geodesic.h, ...)
Each class of headers has an obligation to be compatible with those above it in the list, but not those below it. The header groups should be alphabetically sorted and separated by a newline.
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <grass/gis.h>
#include <grass/glocale.h>
#include <grass/raster.h>

#include "local_proto.h"
#include "mask.h"
Use function names which fulfill the official GNU naming convention. Instead of naming a function like MyNewFunction(), use snake case: my_new_function().
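When renaming existing functions, the conversion can be scripted. The following is a minimal sketch of a hypothetical helper, to_snake_case (not part of GRASS), which assumes plain CamelCase input; names with acronyms such as HTTPServer would need extra handling.

```python
import re


def to_snake_case(name):
    """Convert a CamelCase identifier to snake_case."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(to_snake_case("MyNewFunction"))
```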
It is highly recommended to install and use pre-commit before submitting any new or modified code or any other content. Pre-commit uses Git hooks to check validity and run automated formatting for a range of file formats, including C/C++ and Python. Pre-commit installs all necessary tools in a virtual environment upon first use.
If you have never used pre-commit before, you must start by installing it on your system. You only need to do this once:
python -m pip install pre-commit
Pre-commit must then be activated in the code repository. Change the directory to the root folder and use the install command:
cd <grass_source_dir>
# once per repo
pre-commit install
Pre-commit will then be automatically triggered by the git commit command. If it finds any problem, it will abort the commit and try to solve it automatically. In that case, review the changes and run git add and git commit again.
It is also possible to run pre-commit manually, e.g.:
pre-commit run clang-format --all-files
pre-commit run black --all-files
Or to target a specific set of files:
pre-commit run --files raster/r.sometool/*
The pre-commit hooks set is defined in .pre-commit-config.yaml.
It is possible to temporarily disable the pre-commit hooks in the repo, e.g., while working on older branches:
# backporting...
pre-commit uninstall
And to reactivate pre-commit again:
git switch main
pre-commit install
There are three types of documentation: C API, Python API and tool documentation.
We use doxygen and document the functions directly in the source code. See lib/gis/open.c and lib/gis/gislib.dox for examples.
Python API documentation is written in reStructuredText (reST), which is compiled with Sphinx (see grass package documentation).
def func(arg1, arg2):
    """Summary line.

    Extended description of function.

    :param int arg1: Description of arg1.
    :param str arg2: Description of arg2.
    :raise: ValueError if arg1 is equal to arg2
    :return: Description of return value
    :rtype: bool

    Example:

    >>> a=1
    >>> b=2
    >>> func(a,b)
    True
    """
    if arg1 == arg2:
        raise ValueError('arg1 must not be equal to arg2')
    return True
See Sphinx docstring formatting for more details.
Documentation of a tool should come with clear descriptions, hints on the implemented algorithm and example(s) with figures.
Each tool (also called "module") comes with its own manual page written in simple HTML. The HTML file contains neither a header nor a footer. The complete HTML file is autogenerated during the compilation process (with the --html-description parameter). To make sure that manuals build without issues, it can be useful to validate the HTML before submitting, e.g., with validator.w3.org.
Name the documentation file '<tool>.html', e.g., if the tool is named r.example, the documentation file should be named r.example.html.
The structure consists of several required and optional sections:
<h2>DESCRIPTION</h2>
<!-- required -->
<h2>NOTES</h2>
<!-- suggested -->
<h2>EXAMPLES</h2>
<!-- suggested -->
<h2>TODO</h2>
<!-- optional -->
<h2>KNOWN ISSUES</h2>
<!-- optional -->
<h2>REFERENCES</h2>
<!-- optional -->
<h2>SEE ALSO</h2>
<!-- required -->
<h2>AUTHORS</h2>
<!-- required -->
Sections Notes, Examples, References, and Authors can also be in singular form (e.g., Note).
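The required sections can be checked before submitting a manual page. The sketch below is only an illustration: missing_sections is a hypothetical helper, not part of the GRASS build system, and it assumes headings are written exactly as <h2>NAME</h2> without attributes.

```python
import re

# Required headings from the list above; singular AUTHOR is also accepted
# because the guide allows the singular form for that section.
REQUIRED_SECTIONS = {
    "DESCRIPTION": ("DESCRIPTION",),
    "SEE ALSO": ("SEE ALSO",),
    "AUTHORS": ("AUTHORS", "AUTHOR"),
}


def missing_sections(html_text):
    """Return the required section names that have no <h2> heading."""
    headings = re.findall(r"<h2>(.*?)</h2>", html_text, flags=re.IGNORECASE)
    found = {heading.strip().upper() for heading in headings}
    return [
        name
        for name, accepted in REQUIRED_SECTIONS.items()
        if not found.intersection(accepted)
    ]


page = "<h2>DESCRIPTION</h2>\n<p>...</p>\n<h2>SEE ALSO</h2>\n<h2>AUTHOR</h2>\n"
print(missing_sections(page))
```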
Note that HTML is converted to man pages by utils/g.html2man/. Since the man conversion is limited, please use no other HTML tags than:
<a> <b> <body> <br> <code> <dd> <dl> <dt> <em>
<h2> <h3> <h4> <head> <hr> <i> <img> <li> <ol> <p>
<pre> <sup> <table> <td> <th> <title> <tr> <ul>
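A quick way to spot tags outside this list is to parse the page with the Python standard library. This is a rough sketch under assumptions: unlisted_tags is a hypothetical helper, and the allowed set is copied from the list above. Note that the examples in this guide also wrap code and figures in <div>, so a real check would have to decide how to treat that tag.

```python
from html.parser import HTMLParser

# Tags listed in the guide as safe for the man page conversion.
ALLOWED_TAGS = set(
    "a b body br code dd dl dt em h2 h3 h4 head hr i img li ol p pre sup "
    "table td th title tr ul".split()
)


class TagCollector(HTMLParser):
    """Collect the names of all start tags seen in a document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        self.tags.add(tag)


def unlisted_tags(html_text):
    """Return tags used in html_text that are not in ALLOWED_TAGS."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags - ALLOWED_TAGS


print(unlisted_tags("<p>Use <span>this</span> and <em>that</em>.</p>"))
```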
More notes on markup:
- Tool names (e.g., v.category) should be emphasized with <em>v.category</em>.
- Flags and parameter names should be written in boldface like <b>-f</b> and <b>input</b>.
- Shell commands, names, values, etc. should use <code>42</code>.
- Emphasized phrases <i>should use italics</i>.
- In order to minimize potential git merge conflicts, please break a line at approximately 70-80 chars.
Examples should be coded like this:
<div class="code">
<pre>
v.to.db map=soils type=area option=area column=area_size unit=h
</pre>
</div>
The <h2>SEE ALSO</h2> section of each page should be alphabetized:
<em>
<a href="d.shade.html">d.shade</a>, <a href="r.shade.html">r.shade</a>
</em>
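For pages linking many tools, the alphabetized list can be generated instead of sorted by hand. A minimal sketch, assuming a hypothetical helper see_also and that every linked page is named after its tool:

```python
def see_also(tool_names):
    """Return an alphabetized SEE ALSO fragment for the given tool names."""
    links = [
        f'<a href="{name}.html">{name}</a>' for name in sorted(tool_names)
    ]
    return "<em>\n" + ", ".join(links) + "\n</em>"


print(see_also(["r.shade", "d.shade"]))
```

For longer lists, the generated line may need to be broken manually to stay within the recommended line length.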
Alternatively, the section can provide details on how each of the linked tools or pages is relevant:
<em>
<a href="r.shade.html">r.shade</a> for computing shaded relief,
<a href="d.shade.html">d.shade</a> for displaying shaded relief with other data,
<a href="g.region.html">g.region</a> for managing the resolution.
</em>
In this case, the list can be ordered thematically rather than alphabetically. Either all tools should have the description or none (do not mix the styles).
Naming convention: tool_name.png or tool_name_keyword.png (in both cases, dots in the tool name are replaced by underscores).
Examples:
d_geodesic.png
r_resamp_stats_6m_20m.png
g_gui_rlisetup_8.png
v_clean_rmsa.png
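The naming rule can be expressed in a few lines. The helper image_name below is hypothetical and only illustrates the convention:

```python
def image_name(tool, keyword=None):
    """Build an image file name for the manual page of the given tool."""
    # Dots in the tool name are replaced by underscores.
    base = tool.replace(".", "_")
    if keyword:
        base = f"{base}_{keyword}"
    return f"{base}.png"


print(image_name("d.geodesic"))
print(image_name("r.resamp.stats", "6m_20m"))
```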
Image size: ideally 600 pixels wide (the height follows from the aspect ratio); use, e.g., ImageMagick:
mogrify -resize 600x file.png
Smaller images are also possible when appropriate, e.g. when a lot of images are included or they are something special, e.g. equations, icons or simple diagrams. Larger images are supported, too, see below for an optimal inclusion into the HTML page.
Please compress PNG images with:
# color quantization
# optional, but usually worth it
# note: may change colors
pngnq -n 128 -s 3 file.png
# shuffle original and quantized image names
mv file.png file_ORIG.png
mv file-nq8.png file.png
# compress better (lossless)
optipng -o5 file.png
Format: Images should ideally be in PNG (JPG and GIF are allowed as well when appropriate, but usually they are not). Vector graphics should be included in pages as raster images (i.e., PNGs) for portability, but the original format (preferably SVG) should be committed to the repository as well.
Adding the image to the HTML page (r.viewshed example; the screenshot is shown with a width of 600 pixels, but it is clickable in the manual page). If a larger image is displayed as shrunk, both width and height HTML parameters (values must be calculated according to the picture size!) should be set:
<div align="center" style="margin: 10px">
<a href="r_viewshed.png">
<img
src="r_viewshed.png"
width="600"
height="600"
alt="r.viewshed example"
border="0"
/></a><br/>
<i>Figure: Viewshed shown on shaded terrain (observer position in the
north-east quadrant with white dot; 5m above ground)</i>
</div>
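The width and height values for a shrunk image follow from the original picture size. A small sketch of the arithmetic, using a hypothetical helper displayed_size and assuming the target width of 600 pixels mentioned above:

```python
def displayed_size(width, height, target_width=600):
    """Return (width, height) attributes for an image shrunk to target_width."""
    if width <= target_width:
        # Small images are shown at their native size.
        return width, height
    scale = target_width / width
    return target_width, round(height * scale)


print(displayed_size(1200, 900))
```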
Tools typically do not change the computational region based on the input data. Raster processing tools should respect the current computational region.
Why? Users should be able to re-run a command or workflow with different computational regions to, e.g., test processing in a small area and then move to a larger one. Also, changing the current region globally can break parallel processing.
Exceptions: Tools importing data typically import the entire dataset; respecting the region may be implemented as an optional feature (e.g., r.in.gdal). This is to avoid, e.g., importing data at a finer resolution than their native resolution. Another exception is raster processing where alignment of the cells plays a crucial role and there is a clear answer to how the alignment should be done. In that case, the tool may change the resolution. Some tools, such as r.mapcalc, opt for providing additional computational region handling policies. Finally, some operations are meant to use all the data, e.g., creating metadata; these operations should not use the current computational region.
If you need to change the computational region, there are ways to change it only within your script, not affecting the current region.
Output data should always be written to the current mapset. This is ensured by built-in GRASS mechanisms, so nothing needs to be done in the tool. If a tool modifies inputs, the input must be in the current mapset.
The tool should accept inputs from any mapset in the current project. The user-provided name may or may not include mapset name and the tool needs to respect that.
An analytical tool should read and write geospatial data as GRASS raster or vector maps. Importing from and exporting to other data formats should be left to dedicated import and export tools, e.g., v.import. The exceptions are import and export of data, e.g., r.in.xyz.
Processing and analytical tools can then use simple names to refer to the data within GRASS projects instead of file paths. This follows the separation of concerns principle: format conversion and CRS transformations are separate from analysis.
A tool should not overwrite existing data unless specified by the user using the --overwrite flag. The GRASS command line parser automatically checks for the existence of output data (raster, vector maps) and ends the tool execution with a proper error message in case the output already exists. If the flag is set by the user (--overwrite in command line, overwrite=True in Python), the parser enables overwriting for the tool.
The --overwrite flag can be globally enabled by setting the environment variable GRASS_OVERWRITE to 1. Notably, the GRASS session from grass.jupyter sets GRASS_OVERWRITE to 1 to enable re-running of cells and notebooks.
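To enable overwriting only for selected subprocesses rather than for the whole session, the variable can be set in a copy of the environment. This is a sketch with a hypothetical helper, overwrite_env; the resulting dictionary would be passed as the environment of the subprocess.

```python
import os


def overwrite_env(base_env=None):
    """Return a copy of the environment with overwriting enabled."""
    # Copy so that the environment of the current process stays unchanged.
    env = dict(os.environ if base_env is None else base_env)
    env["GRASS_OVERWRITE"] = "1"
    return env


env = overwrite_env({"PATH": "/usr/bin"})
print(env["GRASS_OVERWRITE"])
```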
GRASS GIS has a global mask managed by the r.mask tool and represented by a raster called MASK. Raster tools called as a subprocess will automatically respect the globally set mask when reading the data. For outputs, respecting of the mask is optional.
Tools should not set or remove the global mask. If the tool cannot avoid setting the mask internally, it should check for presence of the mask and fail if the mask is present. The tools should not remove and later restore the original mask because that creates confusing behavior for interactive use and breaks parallel processing.
Generally, any mask behavior should be documented unless it is the standard case where masked cells do not participate in the computation and are represented as NULL cells (no data) in the output.
Put raster, vector maps, imagery groups etc. in brackets:
Raster map <elevation> not found.
Put file paths, SQL queries into single quotes:
File '/path/to/file.txt' not found.
First letter should be capitalized.
Avoid contractions (cannot instead of can't).
Be consistent with periods. Complete sentences or all parts of a message with multiple sentences should end with periods. Short phrases should not. Punctuated events, such as errors, deserve a period, e.g., "Operation complete." Phrases which imply ongoing action should have an ellipsis, e.g., "Reading raster map...".
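The rules above can be combined as in the following sketch. The function _ is defined here only as a stand-in for the translation function so that the snippet runs on its own; the map name and file path are made-up values.

```python
def _(text):
    """Stand-in for the gettext translation function (assumption)."""
    return text


map_name = "elevation"
file_path = "/path/to/file.txt"

# Map names go in angle brackets, file paths in single quotes.
not_found = _("Raster map <{}> not found.").format(map_name)
no_file = _("File '{}' not found.").format(file_path)
# Ongoing actions end with an ellipsis instead of a period.
reading = _("Reading raster map <{}>...").format(map_name)

print(not_found)
print(no_file)
print(reading)
```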
import grass.script as gs
gs.run_command(...)
User messages should be translatable. For formatting, use str.format(), not f-strings:
gs.warning(_("Raster map <{}> not found.").format(input_map))
For strings that are not translatable, use f-strings:
r_mapcalc_expression = f"{output_map} = {input_map} * 3"
To create a temporary file, use NamedTemporaryFile with a context manager. In this example, we open a temporary file for writing, write something, and then use it in another tool. Once we do not need it anymore, we need to delete it ourselves.
import tempfile

with tempfile.NamedTemporaryFile(mode="w", delete=False) as tmp_file:
    file_path = tmp_file.name
    tmp_file.write(...)

# Use file_path in another tool here, then remove the file.
gs.try_remove(file_path)
If a tool needs to change the computational region for part of the computation, temporary region in Python API is the simplest way to do it:
gs.use_temp_region() # From now on, use a separate region in the script.
# Set the computational region with g.region as needed.
gs.run_command("g.region", raster="input")
gs.del_temp_region()
# Original region applies now.
This makes any changes done in the tool local for the tool without influencing other tools running in the same session.
If you need even more control, use the GRASS_REGION environment variable which is passed to subprocesses. Python API has functions which help with the setup:
os.environ["GRASS_REGION"] = gs.region_env(raster=input_raster)
If different subprocesses need different regions, use different environments:
env = os.environ.copy()
env["GRASS_REGION"] = gs.region_env(raster=input_raster)
gs.run_command("r.slope.aspect", elevation=input_raster, slope=slope, env=env)
This approach makes the computational region completely safe for parallel processes as no region-related files are modified.
Using temporary maps is preferred over using temporary mapsets. This follows the rule that writing should be done only to the current mapset. Some users may have write permissions only for their mapsets, but not for creating other mapsets.
The following script creates a temporary name using gs.append_node_pid, which uses the node (computer) name and process identifier to create a unique but identifiable name. The temporary maps are removed when the script ends.
import atexit

import grass.script as gs


def remove(name):
    gs.run_command(
        "g.remove",
        type="raster",
        name=name,
        flags="f",
        quiet=True,
        errors="ignore",
    )


def main():
    temporary = gs.append_node_pid("tmp_mapcalc")
    atexit.register(remove, temporary)

    gs.mapcalc(f"{temporary} = rand(1, 10)")


if __name__ == "__main__":
    main()
Use gs.find_file() when there is a need to test if a map exists.
# test for input raster map
result = gs.find_file(map_name, element='raster')
if not result['file']:
    gs.fatal(_("Raster map <{}> not found.").format(map_name))

# test for input vector map
result = gs.find_file(map_name, element='vector')
if not result['file']:
    gs.fatal(_("Vector map <{}> not found.").format(map_name))
Tools need to accommodate input map names with (elevation@PERMANENT) and without mapset (elevation). If you need only the map name without mapset, you can do:
map_name = map_name.split("@")[0]
If you need the full name or the mapset only, use gs.find_file:
file_info = gs.find_file(map_name, element="raster")
full_name = file_info["fullname"]
name = file_info["name"]
mapset = file_info["mapset"]
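When only the string needs to be taken apart, without looking anything up in the project, both parts can be obtained at once. The helper split_map_name below is hypothetical; unlike gs.find_file, it cannot tell which mapset an unqualified name belongs to, so it returns None in that case.

```python
def split_map_name(map_name):
    """Split 'name@mapset' into (name, mapset); mapset is None if absent."""
    name, separator, mapset = map_name.partition("@")
    return name, (mapset if separator else None)


print(split_map_name("elevation@PERMANENT"))
print(split_map_name("elevation"))
```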
For any informational output, use the gs.message function or gs.verbose. For error messages, use gs.fatal (ends execution) or gs.error (just prints error, so additional code needs to handle next steps and communicate them to the user). For warnings, use gs.warning. For debugging purposes use gs.debug.
# normal message:
gs.message(_("Done."))
# verbose message:
gs.verbose(_("Computation finished successfully."))
# warning:
gs.warning(_("No input values found, using default values."))
# error:
gs.error(_("No map found."))
# fatal error:
# prints error and exits or raises exception (use set_raise_on_error to set the behavior)
gs.fatal(_("No map found, exiting."))
# debug output (users can use g.gisenv to enable/disable)
# debug level is 1 to 5 (5 is most detailed)
# debug message should not be translated
gs.debug(f"Our calculated value is: {value}.", 3)
Do not use the print function for informational output. This is reserved for standard tool output if it has one.
To streamline the development of a GRASS addon in Python, you can use this template powered by Cookiecutter.
Use the following header in your source code.
##############################################################################
# MODULE: r.foo
#
# AUTHOR(S): John Doe <jdoe at somewhere org>
#
# PURPOSE: Provide short description of module here...
#
# COPYRIGHT: (C) 2024 by John Doe and the GRASS Development Team
#
# This program is free software under the GNU General Public
# License (>=v2). Read the file COPYING that comes with GRASS
# for details.
##############################################################################
GRASS tools must use the GRASS parser to handle their command line parameters. To make writing parameters simpler and the interfaces more unified, use standard options. See Parser standard options. For example, use this:
# %option G_OPT_V_INPUT
# %end
# %option G_OPT_R_OUTPUT
# %end
If needed, override values which need to be different:
# %option G_OPT_V_INPUT
# % key: point_input
# % label: Name of input vector map with points
# % description: Points used for sampling the raster input
# %end
# %option G_OPT_R_INPUT
# % key: raster_input
# % label: Name of sampled raster map
# % description: Raster map which will be sampled by the points
# %end
Do not repeat the values when a standard option defines them.
Flags are boolean options that default to false. Their names are only one character. They are defined using:
# %flag
# % key: n
# % description: Consider zeros to be null values
# %end
On the command line, the flag is used with a dash as -n. In Python, the flag would be used in the flags parameter of run_command:
gs.run_command(..., flags="n", ...)
However, options are often better because they improve readability, clarify the default behavior, and allow for extension of the interface.
Example: Consider a tool which by default produces human-readable plain text output. Then you add JSON output which is enabled by a flag j. Later, you decide to add YAML output. This now needs to be flag y, which needs to be exclusive with flag j. Soon, you have several related flags, each exclusive with all the others. Using an option instead of a flag from the beginning allows the interface to accommodate more formats. In this example, an option named format can have the default value plain and the value json for JSON output. When you later add YAML, you simply add yaml to the possible values without a need for additional options or flags. The interface definition for the example would look like:
# %option
# % key: format
# % type: string
# % required: yes
# % options: plain,json,yaml
# % label: Output format
# % descriptions: plain;Plain text output;json;JSON output;yaml;YAML output
# % answer: plain
# %end
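On the Python side, a single option value keeps the dispatch simple. The sketch below is an assumption about how a tool might use the value: format_output is a hypothetical function, the options dictionary is written by hand to stand in for what the parser returns, and YAML is left out because it would need a package outside the standard library.

```python
import json


def format_output(data, output_format="plain"):
    """Serialize a flat dictionary in the requested format."""
    if output_format == "plain":
        return "\n".join(f"{key}={value}" for key, value in data.items())
    if output_format == "json":
        return json.dumps(data)
    # Other values, e.g., yaml, would be handled by additional branches.
    raise ValueError(f"Unsupported format: {output_format}")


# In a tool, the value would come from the parser, e.g., options["format"].
options = {"format": "json"}
print(format_output({"min": 1, "max": 10}, options["format"]))
```

Adding a new format then means adding one value to the option definition and one branch here, with no new flags and no exclusivity rules.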
Each tool needs to have a description and at least 3 keywords:
# %module
# % label: Generates a raster map using gaussian random number generator.
# % description: Mean and standard deviation of gaussian deviates can be expressed by the user.
# % keyword: raster
# % keyword: surface
# % keyword: random
# %end
Notes:
- the first keyword is the tool family which goes to the tool family index in the manual and should correspond to the first part of the tool name (e.g., r is for raster).
- the second keyword is the overall topic which goes to the topic index in the manual
- the third (and more) keyword goes to the keyword index in the manual
These index manual pages are autogenerated during the build process of GRASS GIS.
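The keyword requirement can be checked from the script source. A minimal sketch, assuming a hypothetical helper module_keywords and the comment-based interface syntax shown above; the header text is a shortened version of the example.

```python
import re


def module_keywords(script_text):
    """Return keywords declared in '# % keyword:' lines, in order."""
    pattern = r"^# % keyword:\s*(\S.*?)\s*$"
    return re.findall(pattern, script_text, flags=re.MULTILINE)


header = """\
# %module
# % description: Generates a raster map.
# % keyword: raster
# % keyword: surface
# % keyword: random
# %end
"""
keywords = module_keywords(header)
print(keywords)
if len(keywords) < 3:
    print("At least 3 keywords are needed")
```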
A tool may use a package that is not required by GRASS GIS and may not be available on a user's system. In these cases, import only after the gs.parser call. In that way the tool can be safely compiled even if the dependency is not installed.
def main():
    options, flags = gs.parser()
    try:
        import pandas as pd  # noqa: E402
    except ModuleNotFoundError:
        gs.fatal(_("Pandas library not installed"))
Try to use names which briefly describe the intended purpose of the tool.
The first letters for the tool name should be:
d. - display tools
db. - database tools
g. - general GIS management tools
i. - imagery tools
m. - miscellaneous tools
ps. - postscript tools
r. - raster tools
r3. - raster3D tools
v. - vector tools
t. - temporal tools
g.gui - GUI tools
Some additional naming conventions
- specialized export tools: (type).out.(format), e.g., r.out.arc, v.out.ascii
- specialized import tools: (type).in.(format), e.g., r.in.arc, v.in.ascii
- conversion tools: (type).to.(type), e.g., r.to.vect, v.to.rast, r3.to.rast
Avoid tool names with more than two dots in the name. Example: instead of r.to.rast3.elev use r.to.rast3elev.
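The prefix and dot rules can be checked mechanically. The following is a sketch with a hypothetical helper, check_tool_name, whose prefix list is copied from the list above; it does not check whether the name describes the purpose of the tool.

```python
# Prefixes taken from the list above (g.gui tools start with "g").
PREFIXES = ("d", "db", "g", "i", "m", "ps", "r", "r3", "v", "t")


def check_tool_name(name):
    """Return a list of naming problems found in a tool name."""
    problems = []
    if name.split(".")[0] not in PREFIXES:
        problems.append("unknown prefix")
    if name.count(".") > 2:
        problems.append("more than two dots")
    return problems


print(check_tool_name("r.to.rast3elev"))
print(check_tool_name("r.to.rast3.elev"))
```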
Tools should record processing history to the output data. For vectors:
gs.vector_history(output)
For rasters:
gs.raster_history(output, overwrite=True)
Follow wxPython style guide.
Please use the following docstring template:
"""!
@package dir.example
@brief Short example package description
Classes:
- example::ExampleClass
(C) 2024 by the GRASS Development Team
This program is free software under the GNU General Public License
(>=v2). Read the file COPYING that comes with GRASS for details.
@author First Author <first somewhere.com>
@author Second Author <second somewhere.com>
@author Some Other <third somewhere.com> (some particular change)
"""
To enable translating of messages to other languages, use full strings, e.g. (good example):
if ...:
    win.SetLabel(_("Name for new 3D raster map to create"))
else:
    win.SetLabel(_("Name for new raster map to create"))
instead of constructing string from several parts (bad example):
# don't do this
if ...:
    maplabel = 'raster map'
else:
    maplabel = '3D raster map'
win.SetLabel(_("Name for new {} to create").format(maplabel))
Sometimes the string might have a different translation depending on the context (is it a verb or a noun? matching ending of a word for particular gender; etc.). To help translators, it is suggested to add a comment explaining the context of the string. The comment must start with the GTC keyword and must be on the line before the string:
self.bwizard = wx.Button(...,
                         # GTC New location
                         label=_("N&ew"))

# GTC {} will be replaced with name of current shell
gs.message(_("Running through {}").format(shellname))
Refer to the online GRASS Programmer's Manual or generate it with make htmldocs or make pdfdocs.
Use the GRASS library functions, when available, instead of the standard C functions. The reason for this is that the following functions ensure good programming practice (e.g., always checking if memory was allocated) and/or improve portability.
- Memory management: G_malloc(), G_calloc(), G_realloc(), G_free()
- Environmental variables: G_getenv(), G_setenv(), G_unsetenv()
- File seek: G_fseek(), G_ftell()
- Printing: G_asprintf(), G_vsaprintf(), G_vfaprintf(), ...
Please refer to the programmers manual for the proper use (e.g., determining if any casts are needed for arguments or return values) of these library functions. They may perform a task slightly different from their corresponding C library function, and thus, their use may not be the same.
Tool exit status is defined as EXIT_SUCCESS or EXIT_FAILURE (declared in stdlib.h), e.g.
{
    ...
    if (G_parser(argc, argv))
        exit(EXIT_FAILURE);

    ...
    exit(EXIT_SUCCESS);
}
See rules for messages in Python scripts for proper usage of G_fatal_error(), G_warning(), etc. Message output is not expected to be sent to pipe or file.
For data output redirected to pipe or file, please use fprintf() and specify the stdout stream as follows:
fprintf(stdout, ...);
fflush(stdout); /* always required when using fprintf(stdout, ...). */
Add a header section to file main.c of your tool and make sure you include the copyright. If you are modifying an existing file you may under no circumstances remove prior copyright or licensing text that is not your own, even for a major rewrite. If any original code or code that is in part derived from another's original work remains, it must be properly cited.
/****************************************************************************
*
* MODULE: g.foo
* AUTHOR(S): John Doe <jdoe at somewhere org>
* PURPOSE: Provide short description of module here...
* COPYRIGHT: (C) 2010 by John Doe, and the GRASS Development Team
*
* This program is free software under the GNU General Public
* License (>=v2). Read the COPYING file that comes with GRASS
* for details.
*
*****************************************************************************/