Running ORAC
Three Python scripts are provided to simplify the process of running ORAC:
- orac.py fully processes a single file and is the main script you should use;
- single_process.py runs one step of the ORAC processor on a single file;
- regression.py runs a suite of regression tests.
A full list of arguments for each script can be found by calling any script with the --help
argument. This page introduces the most common arguments.
orac.py has a single mandatory argument: the name of a satellite imagery file to process. If not an absolute path, specify the directory with --in_dir. Please do not rename satellite files, as the formatting is used to determine much of the file's metadata. These formats are specified in the FileName class, which you may edit to accommodate new sensors.
The most commonly used arguments are:
- --out_dir specifies the output directory;
- --preset_settings indicates which predefined settings (from your local defaults) should be used;
- --limit X0 X1 Y0 Y1 limits processing to the specified rectangle (in 1-indexed satellite pixel coordinates);
- --l1_land_mask uses the land mask in the satellite data and is highly recommended for polar-orbiting satellites;
- --skip_ecmwf_hr is recommended for new users, as this feature isn't very important when using recent meteorological data;
- --use_oc uses Ocean Colour CCI data in the sea-surface reflectance calculation and is highly recommended for aerosol retrievals;
- --revision sets the revision number, which is written into the file names. It must be set if not using a Git repository;
- --procs N uses N cores during processing;
- --clobber X sets the clobber level: 3 overrides all existing files, 2 only overrides final results, 1 overrides everything except pre-processed files, and 0 leaves all existing files in place.
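For instance, a minimal sketch of a call using these arguments (the file name, paths and revision number below are placeholders, not real data):

# SATELLITE_L1_FILE and the paths are placeholders for your own data
orac.py SATELLITE_L1_FILE --in_dir /path/to/level1 --out_dir /path/to/output \
    --limit 1 512 1 512 --revision 1000 --procs 4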
ORAC can be run through a batch queuing system rather than on your local machine with the --batch argument. The batch system is specified at the bottom of your local_defaults.py. Controls for that are:
- --label sets the name of the job;
- --dur X Y Z specifies the maximum allowed duration (in HH:MM) for the pre-, main and post-processors (X, Y and Z, respectively);
- --ram X Y Z specifies the maximum allowed memory (in MB) for the pre-, main and post-processors.
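For example, a sketch of a batch submission (the job label, durations and memory figures are illustrative only):

# durations and memory requests here are illustrative, not recommendations
orac.py SATELLITE_L1_FILE --out_dir /path/to/output --batch \
    --label my_orac_job --dur 01:00 03:00 00:30 --ram 8000 12000 4000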
The following arguments are useful for debugging:
- --script_verbose prints the driver files to the screen;
- --verbose activates full verbosity, printing all progress within the program;
- --dry_run is a dry run, which prints the driver files to the screen but does not call any executables;
- --keep_driver keeps the driver files after processing;
- --timing prints the duration of each process;
- --available_channels specifies the channels to read from the satellite data (though they may not necessarily be used);
- --settings allows direct specification of settings (without --preset_settings). It may be given multiple times to specify multiple processings;
- --settings_file FILE works like --settings, but each line of FILE specifies a processing.
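For instance, a sketch of a dry run drawing several processings from a settings file (my_settings.txt is a hypothetical file; its lines take the same arguments described for single_process.py below):

# ~/my_settings.txt is a hypothetical file, one processing per line
orac.py SATELLITE_L1_FILE --out_dir /path/to/output --dry_run --keep_driver \
    --settings_file ~/my_settings.txt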
If you wish to run with a different ORAC executable (e.g. because you have made some changes and wish to compare the results with and without them), use --orac_dir
to specify the root of the altered source code directory and/or --orac_lib
to specify the new library dependencies.
For users familiar with the driver file format, it is possible to write lines into it directly by two means. In these, SECTION can take the values pre, main or post to indicate which part of the processor to affect.
- --additional SECTION KEY VALUE will set variable KEY to equal VALUE in the driver file when running SECTION;
- --extra_lines SECTION FILE will copy the contents of FILE into the driver file of SECTION.
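As an illustration only (KEY and VALUE stand in for a real driver-file variable and its value, and extra_pre.txt is a hypothetical file):

# KEY/VALUE and ~/extra_pre.txt are placeholders for real driver-file content
orac.py SATELLITE_L1_FILE --additional main KEY VALUE --extra_lines pre ~/extra_pre.txt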
The main job of the Python scripts is, given a satellite file, to locate the appropriate auxiliary files to pass to ORAC. They do so by searching various paths for expected filenames. To save typing them in each call, these paths are consolidated in a single file: local_defaults.py. You will need to prepare one to describe your local environment. A description of each variable is provided in our general example, while a specific example is available for processing on JASMIN.
If you installed ORAC using Anaconda, your local defaults file should be stored at ${CONDA_PREFIX}/lib/python3.7/site-packages/pyorac (adjusting the Python version as appropriate). Otherwise, leave it in tools/pyorac.
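For example, assuming you have prepared the file in tools/pyorac and use a Python 3.7 environment as above, copying it into place might look like:

# adjust python3.7 to match your environment's Python version
cp tools/pyorac/local_defaults.py ${CONDA_PREFIX}/lib/python3.7/site-packages/pyorac/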
The values defined in this file are only defaults. All can be overridden for a call using the arguments --aux (for paths), --global_att (for NCDF attributes) or --batch_settings (for batch processing settings). These all use the syntax -x KEY VALUE, where KEY is the name of the variable you wish to set and VALUE is its new value.
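For example (ecmwf_dir, project and queue all appear in the example calls further down this page; treat them as illustrations of the KEY VALUE pattern rather than a complete list of variables):

# the KEY names below are examples, not an exhaustive list
orac.py SATELLITE_L1_FILE --aux ecmwf_dir /path/to/ecmwf \
    --global_att project MY_PROJECT --batch_settings queue legacy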
single_process.py takes a single argument, as above, and
- if it is a satellite image, runs the pre-processor;
- if it is any output of the pre-processor, runs the main processor once;
- if it is any output of the main processor, runs the post-processor.
Each of these has associated arguments to control the operation of ORAC. The same arguments are used by --settings or the retrieval_settings in your local defaults (a sketch of a settings string is shown after the list below).
- --day_flag N specifies whether only day (1) or night (2) pixels should be processed. The default behaviour (0) is to process everything. (Twilight is neither day nor night.)
- --dellat and --dellon set the reciprocal of the resolution of the pre-processing grid. RTTOV is only run over that grid and then interpolated for each satellite pixel. These should be less than or equal to the equivalent for the meteorological data used.
- --ir_only skips all visible channels, saving time for cloud-top height retrievals.
- --camel_emis uses the CAMEL surface emissivity library rather than the RTTOV atlas.
- --ecmwf_nlevels [60, 91, 137] specifies the number of levels in the meteorological data given.
- --use_ecmwf_snow uses the snow/ice fields in the meteorological data rather than those from NISE.
- --no_snow_corr skips the snow/ice correction, saving time for geostationary imagery.
- --approach gives the forward model to be used. These are:
  - AppCld1l for single-layer cloud;
  - AppCld2l for two-layer cloud;
  - AppAerOx for aerosol over sea (using a BRDF surface model);
  - AppAerSw for aerosol over land and multiple-view imagery (using the Swansea surface model);
  - AppAerO1 for aerosol over land and single-view imagery (using a BRDF surface model).
- --ret_class allows alteration of the retrieval class used by the approach and only needs to be set if you wish to experiment with the forward model. Options are:
  - ClsCldWat for water cloud;
  - ClsCldIce for ice cloud;
  - ClsAerOx for aerosol over sea (using a BRDF surface model);
  - ClsAerSw for multiple-view aerosol (using the Swansea surface model);
  - ClsAerBR for multiple-view aerosol (using a BRDF-resolving Swansea surface model);
  - ClsAshEyj for ash.
- --phase gives the type of particle to evaluate. These are:
  - WAT for water cloud;
  - ICE for ice cloud;
  - A70 for dust;
  - A71 for polluted dust;
  - A72 for light polluted dust;
  - A73 for light dust;
  - A74 for light clean dust;
  - A75 for Northern Hemisphere background;
  - A76 for clean maritime;
  - A77 for dirty maritime;
  - A78 for polluted maritime;
  - A79 for smoke;
  - EYJ for ash.
- --multilayer PHS CLS sets the --phase and --ret_class for the lower layer in a two-layer retrieval.
- --use_channels sets which channels should be used. Requested channels that weren't made available with -c are quietly ignored.
- --types limits processing to the pixel types listed. Pixels are flagged by type in pre-processing as one of CLEAR, SWITCHED_TO_WATER, FOG, WATER, SUPERCOOLED, SWITCHED_TO_ICE, OPAQUE_ICE, CIRRUS, OVERLAP, PROB_OPAQUE_ICE, or PROB_CLEAR. By default, all are processed.
- --no_land skips all land pixels;
- --no_sea skips all ocean pixels;
- --cloud_only skips all clear-sky pixels;
- --aerosol_only skips all cloudy pixels.
- --phases specifies which phases should be combined into this file. The code will not automatically work out which ones you want during single processing (but will work fine during normal running).
- --chunking splits the satellite orbit into 4096-line chunks, which is useful for machines with limited memory.
- --compress compresses the data in the final output. This can significantly reduce the size of aerosol files, which contain many fill values.
- --no_night_opt suppresses the output of cloud optical properties at night.
- --switch_phase is a correction for cloud-only processing, whereby water pixels with a cloud-top temperature (CTT) below the freezing point are forced to ice (and vice versa).
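As a hedged sketch of what a settings string might contain (the particular combination of options, and the quoting of the string, are illustrative assumptions rather than a recommended configuration):

# the quoted settings string below is an illustrative combination, not a recommendation
orac.py SATELLITE_L1_FILE --out_dir /path/to/output \
    --settings "--approach AppCld1l --phase WAT --ret_class ClsCldWat --use_channels 1 2 3 4"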
regression.py runs the ORAC regression tests, a sampling of orbits over Australia on 20 June 2008. If you intend to commit code to this repository, make certain that it compiles and can run these tests without unexpected changes.
The script accepts all of the arguments from the scripts above but ignores any --settings in favour of built-in tests. Additional arguments are:
- --tests specifies which tests should be run. They are:
  - the short tests (five lines containing both cloud and clear-sky): DAYMYDS, NITMYDS, DAYAATSRS, NITAATSRS, DAYAVHRRS, NITAVHRRS. These are sufficient in most circumstances and are run by default;
  - the long tests (processing the entire image): DAYMYD, NITMYD, AATSR, AVHRR. All of these can be called by --long.
- --test_type specifies which manner of test should be run (specifically, which suffix to use when setting --preset_settings): C for cloud, A for aerosol, or J for joint.
- --benchmark suppresses comparison. By default, the script will increment the revision number of the repository by 1 and compare the new outputs to the previous ones.
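For instance, a sketch of running the long aerosol tests without comparison against a previous revision (the output path is a placeholder):

# /path/to/test_output is a placeholder
regression.py -o /path/to/test_output --long --test_type A --benchmark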
orac.py /network/group/aopp/eodg/atsr/aatsr/v3/2008/06/03/ATS_TOA_1PUUPA20080603_160329_000065272069_00111_32730_5967.N1 \
--out_dir /data/MEXICO --available_channels 1 2 3 4 5 6 7 8 9 10 11 12 13 14 \
--limit 1 512 17200 18200 -S AATSR_J --l1_land_mask --use_oc --procs 7
This will process an AATSR orbit from 3 June 2008 stored in Oxford, saving the results to the folder /data/MEXICO. All 14 channels in the data will be used (--available_channels) over a 512x1001 pixel block of the orbit (--limit). The orbit will be evaluated using the preset settings for a joint retrieval (-S AATSR_J; 23 runs covering two single-layer clouds, one multilayer cloud, ten sea-only BRDF-surface aerosol retrievals, and ten land-only Swansea-surface aerosol retrievals) but using the satellite data's own land/sea mask (--l1_land_mask) and input from Ocean Colour CCI data (--use_oc). Seven cores will be used for this processing (--procs 7).
orac.py -i /network/group/aopp/eodg/atsr/aatsr/v3/2008/01/19 \
-o /network/aopp/apres/users/povey/settings_eval/default_land \
--day_flag 1 --dellon 1.5 --dellat 1.5 --ecmwf_flag 4 \
-x ecmwf_dir /network/aopp/matin/eodg/ecmwf/Analysis/REZ_0750 --skip_ecmwf_hr \
--settings_file ~/new_retrieval --use_oc --l1_land_mask --keep_driver \
--batch --ram 4000 4000 4000 -b queue legacy \
-g project AERONETCOLLOCATION -g product_name N0183-L2 \
ATS_TOA_1PUUPA20080119_103548_000065272065_00165_30780_4013.N1
This will process the day-time segment (--day_flag) of an AATSR file from 19 January 2008, saving the result to a folder called default_land (-o). The pre-processing grid will have a resolution of 0.75° (--dellon --dellat) and draw from operational ECMWF forecasts (--ecmwf_flag -x ecmwf_dir) only (--skip_ecmwf_hr). The settings are drawn from the file new_retrieval in my home directory (--settings_file), though Ocean Colour CCI data and the L1 land mask are added. The driver files will be retained after processing (--keep_driver). The job will be submitted to the batch queue legacy (--batch -b queue), allocating 4 GB of RAM to each stage (--ram). The project name will be AERONETCOLLOCATION and the product name will be N0183-L2 (-g project -g product_name).
regression.py -o /data/testoutput --l1_land_mask --procs 8 -r 1870 -C1
This runs the six short, cloud regression tests (the default), saving the results to /data/testoutput. The L1 land/sea mask is used. Eight cores will be used and the result labelled as revision 1870. Any existing pre-processor files of that revision will be kept (-C1). (This call was made during debugging, where the pre-processor worked fine but the main processor failed, so we had no need to repeat the successful steps.)
The Python scripts in orac/tools are fairly simple wrappers for the code in orac/tools/pyorac. The files there are:
- arguments.py defines all of the command-line arguments for the various scripts and the functions that check the inputs are valid;
- batch.py defines the interface to call a batch queuing system;
- definitions.py defines some classes used throughout the code. All satellite instruments and particle types need to be defined in here;
- drivers.py contains functions that create driver files for each part of ORAC. This is where most of the work actually happens;
- local_defaults.py defines the default locations that the scripts search for input files, so you don't have to type the full paths every time;
- mappable.py contains a class used for plotting satellite swaths on maps (it's sort of like Mappoints from IDL);
- regression_tests.py defines which files and pixels to run during testing;
- run.py contains the functions that call everything else (process_all contains everything needed to run ORAC);
- swath.py contains a class used for loading and filtering ORAC data;
- util.py collects various routine functions used throughout the code.
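To confirm which copy of pyorac your environment is actually importing (worth checking before you edit local_defaults.py), a quick test from the shell is:

# prints the path of the pyorac package Python is importing
python -c "import pyorac; print(pyorac.__file__)"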
The most common error is an environment error. To run, the scripts require a number of external libraries to be installed and to be able to find the orac/tools/pyorac
folder. The conda installation should do all of that. When something goes wrong, try cd $ORACDIR/tools
(if that helps, it means PYTHONPATH
hasn't been updated correctly). If you want to quickly check the scripts compile, call orac.py -h
for the help prompt.
The next most common error is from the local_defaults.py
file pointing to folders that don't exist. That probably comes up as some sort of "File not found" error. Then there are file name errors. The script makes certain assumptions about the format for the input files and when those change, the script fails with unhelpful error messages from drivers.py
.
Other common errors:
- If you see multiple syntax errors, Python can't compile the code. Check your ORAC environment is activated (as you might otherwise be using the default Python version, which is rather old).
- If you get error code -11, try ulimit -s 500000 to increase your stack size. If that works, consider adding that line to your .bashrc file or the ORAC activation script.