From the dx-toolkit root directory:
make python
Set the _DX_DEBUG environment variable to a positive integer before running a dxpy-based program (such as dx) to display the input and output of each API call. Supported values are 1, 2, and 3, with increasing numbers producing successively more verbose output.
Example:
$ _DX_DEBUG=1 dx ls
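When dx is launched from Python (for example, in a test harness), the same variable can be set on the child process environment. The following is a minimal sketch; the helper name dx_debug_env is hypothetical and is not part of dxpy.

```python
import os


def dx_debug_env(level, base_env=None):
    """Return a copy of the environment with _DX_DEBUG set to ``level``.

    ``dx_debug_env`` is a hypothetical helper, not part of dxpy.
    """
    if level not in (1, 2, 3):
        raise ValueError("_DX_DEBUG supports the values 1, 2, and 3")
    # Copy so that the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["_DX_DEBUG"] = str(level)
    return env


# Usage (requires dx on the PATH, so it is left commented out here):
# import subprocess
# subprocess.check_call(["dx", "ls"], env=dx_debug_env(1))
```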
- Conform to PEP-8.
- Relax the line length requirement to 120 characters per line, where you judge readability not to be compromised.
- Relax other PEP-8 requirements at your discretion if it simplifies code or is needed to follow conventions established elsewhere at DNAnexus.
- Document your code in a format usable by Sphinx Autodoc.
- Run pylint -E on your code before checking it in.
- Do not introduce module import-time side effects.
- Do not add module-level attributes into the API unless you are absolutely certain they will remain constants. For example, do not declare an attribute dxpy.foo (dxpy._foo is OK), or any other non-private variable in the global scope of any module. This is because unless the value is a constant, it may need to be updated by an initialization method, which may need to run lazily to avoid side effects at module load time. Instead, use accessor methods that can perform the updates at call time:

      _foo = None

      def get_foo():
          initialize()
          return _foo
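
A slightly fuller sketch of the accessor pattern, with a docstring written in the field-list style that Sphinx Autodoc can render. The _initialized flag and the placeholder value are assumptions made for illustration; a real initialize() would perform whatever setup the module actually needs.

```python
_foo = None            # private module-level state; callers use get_foo()
_initialized = False   # assumption: a simple flag guards one-time setup


def initialize():
    """Populate module state on first use instead of at import time."""
    global _foo, _initialized
    if not _initialized:
        _foo = {"loaded": True}  # placeholder for the real setup work
        _initialized = True


def get_foo():
    """Return the current value of ``foo``, initializing it if needed.

    :returns: The lazily initialized value.
    :rtype: dict
    """
    initialize()
    return _foo
```

Importing a module written this way has no side effects; the setup work runs on the first call to get_foo().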
Other useful resources:
Code going into the Python codebase should be written in Python 3.3 style, and should be compatible with Python 3.3, 3.4, and 2.7. To facilitate Python 2 compatibility, use the compat module at https://github.com/dnanexus/dx-toolkit/blob/master/src/python/dxpy/compat.py. Also, the following boilerplate should be inserted into all Python source files:
from __future__ import absolute_import, division, print_function, unicode_literals
dxpy.compat has some simple shims that mirror Python 3.3 builtins and redirect them to Python 2.7 equivalents when on 2.7. Most critically, from dxpy.compat import str will import the unicode builtin on 2.7 and the str builtin on 3.3. Use str wherever you would have used unicode. To convert unicode strings to bytes, use .encode('utf-8').
- Use from __future__ import print_function and use print as a function. Instead of print >>sys.stderr, write print(..., file=sys.stderr).
- The next most troublesome gotcha after the bytes/unicode conversions is that many builtins that returned lists in Python 2 return lazy iterators in Python 3. For example, map() returns an iterator. This breaks places that expect a list, and requires either explicit conversion with list(), or the use of list comprehensions (usually preferred).
- Instead of raw_input, use from dxpy.compat import input.
- Instead of .iteritems(), use .items(). If this is a performance concern on 2.7, introduce a shim in compat.py.
- Instead of StringIO.StringIO, use from dxpy.compat import BytesIO (which is StringIO on 2.7).
- Instead of <iterator>.next(), use next(<iterator>).
- Instead of x.has_key(y), use y in x.
- Instead of x.sort(cmp=lambda a, b: ...), use x = sorted(x, key=lambda item: ...).
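
The substitutions above can be sketched together in one short example. It is written to run under Python 3 and uses only builtins and the standard library so that it stays self-contained; in dx-toolkit code the str, input, and BytesIO names would come from dxpy.compat as described above. The summarize function and the sample data are hypothetical.

```python
import sys
from io import BytesIO


def summarize(records):
    """Return (name, size) pairs sorted by size, then by name."""
    # .items() instead of .iteritems()
    pairs = [(name, size) for name, size in records.items()]
    # sorted(..., key=...) instead of a cmp= comparison function
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


records = {"b.txt": 20, "a.txt": 20, "c.txt": 5}

# print as a function, with file= instead of `print >>sys.stderr`
print("processing", len(records), "records", file=sys.stderr)

# map() is lazy on Python 3; a list comprehension yields a real list
sizes = [size * 2 for size in records.values()]

# next(<iterator>) instead of <iterator>.next()
first = next(iter(summarize(records)))

# `y in x` instead of x.has_key(y)
has_a = "a.txt" in records

# Text is converted to bytes with an explicit encoding before being
# written to a bytes buffer.
buf = BytesIO()
buf.write("caf\u00e9".encode("utf-8"))
payload = buf.getvalue()
```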
Other useful resources:
Some scripts, such as format converters, are useful both as standalone executables and as importable modules.
We have the following convention for these scripts:
- Install the script into src/python/dxpy/scripts with a name like dx_useful_script.py. This will allow importing with import dxpy.scripts.dx_useful_script.
- Include in the script a top-level function called main(), which should be the entry point processor, and conclude the script with the following stanza:

      if __name__ == '__main__':
          main()

- The dxpy installation process (invoked through setup.py or with make -C src python at the top level) will find the script and install a launcher for it into the executable path automatically. This is done using the entry_points facility of setuptools/distribute.
  - Note: the install script will replace underscores in the name of your module with dashes in the name of the launcher script.
- Typically, when called on the command line, main() will first parse the command line arguments (sys.argv). However, when imported as a module, the arguments need to instead be passed as inputs to a function. The following is a suggestion for how to accommodate both styles simultaneously with just one entry point (main):

      def main(**kwargs):
          if len(kwargs) == 0:
              kwargs = vars(arg_parser.parse_args(sys.argv[1:]))
          ...

      if __name__ == '__main__':
          main()
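
To make the single-entry-point suggestion concrete, here is a minimal sketch of a complete script. The --name and --shout options and the greeting logic are hypothetical placeholders, not part of any real dx-toolkit script.

```python
import argparse
import sys

# Hypothetical options; a real script would define its own arguments.
arg_parser = argparse.ArgumentParser(description="Hypothetical example script")
arg_parser.add_argument("--name", default="world", help="Who to greet")
arg_parser.add_argument("--shout", action="store_true", help="Upper-case the greeting")


def main(**kwargs):
    """Entry point for both command-line and programmatic use."""
    if len(kwargs) == 0:
        # No keyword arguments: assume a command-line invocation.
        kwargs = vars(arg_parser.parse_args(sys.argv[1:]))
    # .get() supplies defaults, because keyword arguments passed by an
    # importing caller bypass the parser and its defaults.
    greeting = "hello, {}".format(kwargs.get("name", "world"))
    if kwargs.get("shout", False):
        greeting = greeting.upper()
    return greeting


if __name__ == '__main__':
    main()
```

An importing caller would write main(name="dx", shout=True), while the installed launcher calls main() with no arguments and lets the parser read sys.argv. One design consequence is that parser defaults apply only on the command-line path, so the function body should tolerate missing keys.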