Getting started
This tutorial page covers the basics of setting up Docker on a local computer and connecting to an eddy4R Docker container in order to use the NEON eddy4R R package (see sections 2.1 Install Docker and 2.2 Access eddy4R). Alternative methods to use eddy4R without Docker are provided in section 2.3. The eddy4R team currently supports only the use of eddy4R in Dockerized Rstudio, not the alternative methods.
The directions for accessing eddy4R come from:
Metzger, S., D. Durden, C. Sturtevant, H. Luo, N. Pingintha-durden, and T. Sachs (2017). eddy4R 0.2.0: a DevOps model for community-extensible processing and analysis of eddy-covariance data based on R, Git, Docker, and HDF5. Geoscientific Model Development 10:3189–3206. doi: 10.5194/gmd-10-3189-2017.
To work with the eddy4R–Docker image, you first need to sign up for an account at DockerHub.
For more information on using Docker, consider reading through the content from CyVerse's Container Camp's Intro to Docker.
This section provides information on installing Docker on different operating systems:
- Linux (RHEL / CentOS):
  - Docker requires RHEL / CentOS kernel version 3.10.0-229 or newer
  - follow the Docker installation instructions:
    - installation via yum
    - add Docker group
    - add Docker to autostart
- Windows:
  - follow the Docker installation instructions
  - prior to "Step 3: Verify your installation", open `cmd.exe`, type `docker-machine upgrade`, hit enter and wait until the upgrade is complete
  - continue with "Step 3: Verify your installation"
There are different ways to access the eddy4R package. This section covers the following:
- Run Rstudio in a Docker container
- Run Rstudio with access to the host file system
- Manage Docker Containers
- sign up at DockerHub
- navigate to the "eddy4r" DockerHub repository
- open a shell (in Linux) or the Docker Quickstart Terminal (in Windows)
- log into your DockerHub account with your DockerHub username (all lowercase, not your email address!) and password:
    docker login
- start an interactive Rstudio Server session with access to eddy4R functions and workflows:
    docker run -d -p 8787:8787 -e PASSWORD=YOURPASSWORD stefanmet/eddy4r:1.0.0
- in Linux, the IP address of the Docker host, `MY-IP-ADDRESS`, is simply `localhost`; in Windows, `MY-IP-ADDRESS` is determined by opening `cmd.exe`, typing `docker-machine ip default`, and hitting enter
- open a web browser and log into the Rstudio Server session, with `MY-IP-ADDRESS` replaced by the IP address of the Docker host:
    address: http://MY-IP-ADDRESS:8787
    username: rstudio
    password: YOURPASSWORD
- rocker-org/rocker provides additional information on the use of Rstudio in Docker
- rocker-org/rocker provides overviews on sharing files with a host machine, saving data and managing users in Docker
- in this Linux example, `/FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1` is the folder location in the host file system (origin), and `/home/$USER/eddy` is the corresponding folder location in the Docker container (target)
- these folder locations can be modified by the user as needed
- Docker runs as superuser, and in order to ensure consistent file ownership and permissions, it is important to keep track of the Linux `$USER` and `$UID`
- start the Rstudio session with access to the host file system and user information:
    docker run -d -p 8787:8787 -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID -e PASSWORD=YOURPASSWORD stefanmet/eddy4r:1.0.0
- open a web browser and log into the Rstudio Server session, with `MY-IP-ADDRESS` replaced by the IP address of the Docker host, and `MY-USER-NAME` replaced by your user name on the host machine (e.g., smetzger):
    address: http://MY-IP-ADDRESS:8787
    username: MY-USER-NAME
    password: YOURPASSWORD
- workflows located on the host file system, such as the examples `flow.turb.tow.neon.dock.r` and `flow.erf.dock.r`, can now be executed, where the home directory in the Docker container `~` refers to `/home/$USER`:
    source("~/eddy/flow/flow.turb.tow.neon.dock.r")
    source("~/eddy/flow/flow.erf.dock.r")
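The Linux command above relies on `$USER` and `$UID` being set in the calling shell. As a minimal sketch, assuming a POSIX shell, the same values can be obtained with `id`, and the command can be printed for inspection before it is run. The names `HOST_DIR`, `IMAGE`, `HOST_USER`, `HOST_UID` and `build_run_cmd` are illustrative only and not part of eddy4R or Docker.

```shell
#!/bin/sh
# Sketch only: HOST_DIR, IMAGE and build_run_cmd are illustrative names.
HOST_DIR="/FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1"
IMAGE="stefanmet/eddy4r:1.0.0"

# id works in any POSIX shell, whereas $UID is only set by some shells (e.g. bash).
HOST_USER="$(id -un)"
HOST_UID="$(id -u)"

# Print (do not run) the docker command for the Rstudio password given as $1.
build_run_cmd() {
    printf 'docker run -d -p 8787:8787 -v %s:/home/%s/eddy -e USER=%s -e USERID=%s -e PASSWORD=%s %s\n' \
        "$HOST_DIR" "$HOST_USER" "$HOST_USER" "$HOST_UID" "$1" "$IMAGE"
}

build_run_cmd "YOURPASSWORD"
```

Once the printed command looks right, it can be pasted into the shell to start the session.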
- in this Windows example, `/c/Users/smetzger/docker-data` is the folder location in the host file system (origin), and `/home/rstudio/smetzger/eddy` is the corresponding folder location in the Docker container (target)
- these folder locations can be modified by the user as needed, but note that by default only the Windows `C:\Users` folder can be shared with Docker
- start the Rstudio session with access to the host file system:
    docker run -d -p 8787:8787 -v /c/Users/smetzger/docker-data:/home/rstudio/smetzger/eddy -e PASSWORD=YOURPASSWORD stefanmet/eddy4r:1.0.0
- open a web browser and log into the Rstudio Server session, with `MY-IP-ADDRESS` replaced by the IP address of the Docker host:
    address: http://MY-IP-ADDRESS:8787
    username: rstudio
    password: YOURPASSWORD
- open a shell (in Linux) or the Docker Quickstart Terminal (in Windows)
- determine running Docker containers:
    docker ps
- a new interactive Rstudio session can only be started after the Rstudio session currently running on port 8787 is terminated, with `NAME` being the entry in the NAMES column of the `docker ps` output:
    docker kill NAME
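Finding the right `NAME` by eye gets tedious when several containers are running. The following is a minimal sketch of one way to pick out the container that publishes port 8787. `names_on_port` is a hypothetical helper, not a Docker command, and the sample input is made up for illustration.

```shell
#!/bin/sh
# Sketch only: names_on_port is a hypothetical helper, not a Docker command.
# It reads "NAME|PORTS" lines on standard input and prints the names of the
# containers whose port list publishes the host port given as $1.
names_on_port() {
    awk -F '|' -v port="$1" 'index($2, ":" port "->") > 0 { print $1 }'
}

# Made-up sample of what `docker ps --format '{{.Names}}|{{.Ports}}'` might print.
sample='eddy_session|0.0.0.0:8787->8787/tcp
other_service|0.0.0.0:5432->5432/tcp'

printf '%s\n' "$sample" | names_on_port 8787   # prints: eddy_session

# With Docker available, the same filter could drive the clean-up step:
#   docker ps --format '{{.Names}}|{{.Ports}}' | names_on_port 8787 |
#       while read -r name; do docker kill "$name"; done
```

The helper only inspects text, so it can be tried without any container running.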
Instead of using eddy4R in Dockerized Rstudio, one can also elect not to use Docker, or not to use Rstudio. This section introduces these alternative methods.
Users who elect not to use Docker can follow the guide below to install and access the eddy4R package.
eddy4R is written in the R language for statistical computing and utilizes several third-party code packages. Consequently, R, the integrated development environment Rstudio, the eddy4R packages as well as their package dependencies need to be installed.
In the following, installation sequences for Red Hat Enterprise Linux (RHEL) / Community Enterprise Operating System (CentOS) 7.1 and Windows 7 with R version >= 3.2.3 are provided. If deviating from these versions and recommendations, results must be verified thoroughly to ensure their correctness.
Please make sure you have superuser rights in Linux or Administrator rights in Windows when performing the installation steps; otherwise, packages are installed into a user-specific rather than a global directory.
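Before starting, it may help to confirm that the installed R meets the minimum version 3.2.3 stated above. The following is a minimal sketch of a numeric version comparison in a POSIX shell. `version_ge` is a hypothetical helper name, and the version string in the example is hard-coded for illustration.

```shell
#!/bin/sh
# Sketch only: version_ge is a hypothetical helper name.
# Succeeds (exit status 0) if version $1 is greater than or equal to version $2.
# Compares up to three dot-separated numeric fields (major.minor.patch).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Hard-coded example value; with R installed it could instead be read via
#   R --version | awk 'NR == 1 { print $3 }'
r_version="3.4.1"
if version_ge "$r_version" "3.2.3"; then
    echo "R $r_version meets the minimum version 3.2.3"
else
    echo "R $r_version is older than 3.2.3"
fi
```

The fields are compared as numbers, so 3.10.0 is correctly treated as newer than 3.2.3.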
- in a shell, become super user:
    sudo su
- to install the Extra Packages for Enterprise Linux (EPEL), follow the guide here:
    wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
    rpm -ivh epel-release-7-5.noarch.rpm
- continue with updates and installations:
    yum update
    yum install libxml2-devel geos geos-devel java netcdf netcdf-devel
    yum install gtk2-devel ImageMagick-devel gtk+ gtk+-devel screen firefox proj proj-devel
    yum install R
- open R as super user:
    sudo R
- log into Github, generate an access token with the scope "repo" (Full control of private repositories) checked, and copy it to the clipboard
- for our private Github repository, the authentication token needs to be set as an environment variable by executing the following line, with MyAccessToken replaced by the actual token that you just copied to the clipboard from Github:
    Sys.setenv(GITHUB_PAT = "MyAccessToken")
- run the eddy4R installation script:
    source("https://www.dropbox.com/s/6kiiehesiuozzl8/flow.inst.R?dl=1")
- exit R; when asked whether to save your workspace, you can answer "n":
    quit()
- to install Rstudio Server, follow the guide here and type in the shell (note that `apt-get`, `gdebi` and `.deb` packages are specific to Debian-based distributions; on RHEL / CentOS, use the RPM-based commands from the guide instead):
    sudo apt-get install gdebi-core
    wget https://download2.rstudio.org/rstudio-server-0.99.896-amd64.deb
    sudo gdebi rstudio-server-0.99.896-amd64.deb
- stop being super user:
    exit
- download and install the most recent R version
- change the permissions of the folder C:\Program Files\R\R-version\library to writeable
  - follow guide here
- start Rstudio as administrator: Start -> Programs -> Rstudio -> right-click -> Run as administrator
- log into Github, generate an access token with the scope "repo" (Full control of private repositories) checked, and copy it to the clipboard
- for our private Github repository, the authentication token needs to be set as an environment variable by executing the following line in Rstudio, with MyAccessToken replaced by the actual token that you just copied to the clipboard from Github:
    Sys.setenv(GITHUB_PAT = "MyAccessToken")
- run the eddy4R installation script in Rstudio:
    source("https://www.dropbox.com/s/6kiiehesiuozzl8/flow.inst.R?dl=1")
- restart Rstudio
- now you can use the functions contained in eddy4R to create your own workflows just like with any other R-package
- in addition, eddy4R workflow templates such as a standard eddy-covariance flux processing or environmental response function processing are available
- please see our Developer's Guide for accessing these workflow templates and modifying them for your purposes
For users who want to use Docker for batch processing, the eddy4r Docker image can be used without the Rstudio graphical interface.
- open a shell (in Linux) or the Docker Quickstart Terminal (in Windows)
- the following examples are for Linux, but can be simplified for Windows following the preceding sections
- the `Rscript` command, applied to an R-instruction file `flow.turb.tow.neon.dock.r` located at `/home/$USER/eddy/flow/`, runs in the foreground and provides standard R command line output for supervision:
    docker run --rm -it -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID stefanmet/eddy4r:1.0.0 Rscript /home/$USER/eddy/flow/flow.turb.tow.neon.dock.r
- the `Rscript` command can also be used with the same R-instruction file `flow.turb.tow.neon.dock.r`, but located at a URL such as `https://www.dropbox.com/s/0yi4v3gvye1bnr2/flow.turb.tow.neon.dock.r?dl=1`:
    docker run --rm -it -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID stefanmet/eddy4r:1.0.0 Rscript -e "source('https://www.dropbox.com/s/0yi4v3gvye1bnr2/flow.turb.tow.neon.dock.r?dl=1')"
  - here is an additional example for a different R-instruction file:
      docker run --rm -it -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID stefanmet/eddy4r:1.0.0 Rscript -e "source('https://www.dropbox.com/s/94mkenkb4v93mm6/flow.erf.dock.R?dl=1')"
- the `R CMD BATCH` command applied to the same R-instruction file runs in the foreground without command line output, but occupies the shell:
    docker run --rm -it -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID stefanmet/eddy4r:1.0.0 R CMD BATCH /home/$USER/eddy/flow/flow.turb.tow.neon.dock.r
- the `R CMD BATCH` command can also be run in the background in Docker's "detached" mode, returning the shell to the user:
    docker run -d -v /FIUdata/IPT_data/dynamic/WG_SCI/docker/deve/v0.0.8_SERC_20170305-1:/home/$USER/eddy -e USER=$USER -e USERID=$UID stefanmet/eddy4r:1.0.0 R CMD BATCH /home/$USER/eddy/flow/flow.turb.tow.neon.dock.r
- attention: the instruction file needs to explicitly load and attach the R-package "methods" via library(methods)
  - the default behavior for `Rscript` (and apparently `R CMD BATCH`) omits the R-package "methods"
  - this breaks calls to any packages depending on "methods", e.g. splus2R::colMins would stop with the error "...could not find function "is"…"
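If an existing instruction file cannot easily be edited, one possible workaround is to generate a copy that starts with `library(methods)`. The following is a minimal sketch under that assumption. `with_methods` is a hypothetical helper name, and the temporary file only stands in for a real instruction file.

```shell
#!/bin/sh
# Sketch only: with_methods is a hypothetical helper name.
# Prints the R-instruction file given as $1 with library(methods) prepended.
with_methods() {
    printf 'library(methods)\n'
    cat "$1"
}

# Demonstration with a throw-away instruction file standing in for a real one.
tmp="$(mktemp)"
printf 'print(is(1))\n' > "$tmp"
with_methods "$tmp"
rm -f "$tmp"
```

In practice the output would be redirected to a new file, for example `with_methods flow.r > flow.methods.r` (both file names are placeholders), and that file would then be passed to `Rscript` or `R CMD BATCH` as in the commands above.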
To inquire with repository maintainers on questions or ideas that fall outside the Github Issue Tracker workflow, please contact us @ [email protected]
The National Ecological Observatory Network is a project solely funded by the National Science Foundation and managed under cooperative agreement by Battelle. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
GNU AFFERO GENERAL PUBLIC LICENSE Version 3, 19 November 2007
Information and documents contained within this repository are available as-is. Codes or documents, or their use, may not be supported or maintained under any program or service and may not be compatible with data currently available from the NEON Data Portal.