Merge pull request #24 from DARPA-CRITICALMAAS/quickstart
quickstart
asaxton authored Dec 5, 2024
2 parents 00c0c08 + 04f00cd commit ad9ad75
Showing 7 changed files with 366 additions and 33 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,6 +3,7 @@
venv
__pycache__
docker-compose.override.yml
secrets.sh

*.pyc
*.DS_Store
32 changes: 31 additions & 1 deletion README.md
@@ -1,3 +1,33 @@
# UIUC-CDR

This repository contains the hook that receives messages from the CDR and starts the processing. The full stack consists of containers to proxy incoming HTTP requests to the correct containers (traefik), handle the hooks (cdrhook), handle all messages (rabbitmq), show the status (monitor), download data (downloader), and upload data (uploader), as well as a single model that is executed on the data (icy-resin).

# QuickStart

To run the stack you will need a few files. To make this easier we have created a quickstart shell script; running it downloads everything needed and creates some files with default values.

```
curl -o quickstart.sh -s -L https://raw.githubusercontent.com/DARPA-CRITICALMAAS/uiuc-cdr/refs/heads/main/quickstart.sh
chmod 755 quickstart.sh
./quickstart.sh
```

The first time you run this, it will create four files:
- `secrets.sh`: edit this file to change any variables you need
- `docker-compose.override.yml`: your local changes to the docker-compose file
- `docker-compose.yml`: *DO NOT EDIT*; this file is downloaded on every run to make sure you have the latest version
- `.env`: *DO NOT EDIT*; this file is generated from `secrets.sh`

Edit `secrets.sh` and `docker-compose.override.yml` to fit your environment. At a minimum you will need to change `CDR_TOKEN`, but it is highly recommended to also change `RABBITMQ_USERNAME`, `RABBITMQ_PASSWORD` and `CDRHOOK_SECRET`. If you only want to run the cdrhook, change `PROFILE` to `cdrhook`.
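For orientation, a `secrets.sh` might look roughly like the sketch below. Only the variables named above are taken from this document; the placeholder values, comments, and exact layout are assumptions, so use the file generated by `quickstart.sh` as your actual starting point.

```
# secrets.sh - edited by you, read by quickstart.sh to generate .env
# All values below are placeholders; replace them with your own.
CDR_TOKEN="your-cdr-token-here"        # required
RABBITMQ_USERNAME="rabbituser"         # highly recommended to change
RABBITMQ_PASSWORD="rabbitpass"         # highly recommended to change
CDRHOOK_SECRET="some-random-string"    # highly recommended to change
PROFILE="allinone"                     # or: cdrhook, pipeline
```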

Once you have `secrets.sh` set up, you can use `quickstart.sh` to start the full stack. To restart it, simply run `quickstart.sh` again.

To only start the pipeline, copy all four files to the GPU machine, change `PROFILE` in `secrets.sh` to `pipeline`, and run `quickstart.sh`.
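Those steps could look like the following sketch; the host name `gpu01` and the target directory are placeholders for your own environment.

```
# copy the quickstart script and the four generated files to the GPU machine
scp quickstart.sh secrets.sh docker-compose.yml docker-compose.override.yml .env gpu01:~/uiuc-cdr/

# on gpu01: switch the profile, then start the pipeline
ssh gpu01
cd ~/uiuc-cdr
sed -i 's/^PROFILE=.*/PROFILE="pipeline"/' secrets.sh
./quickstart.sh
```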

To stop the stack you can use `docker compose --profile allinone down`; the `allinone` profile works here even if you only started the pipeline or the cdrhook.

If you use the cdrhook profile, traefik is not started by default. In that case you can start it manually with `docker compose --profile traefik up -d`.

# CDR Hook for NCSA pipeline

This repository contains the hook that receives messages from the CDR and starts the processing. The full stack consists of a few containers that work together:
@@ -6,7 +36,7 @@
- **rabbitmq**: the orchestrator of all the work; all other containers connect to this and receive work from it. If a message cannot be handled, it is added to the `<queue>.error` queue with the exception attached to the original message.
- **cdrhook**: the entry point for all work; it registers with the CDR and receives messages when new work needs to be done. When a message arrives it checks whether all necessary metadata is available and, if so, sends a message to the `download` queue.
- **downloader**: this will download the image and the metadata to a shared folder that can be used by the actual processing container. This can run on a different server than the cdrhook, but does need to have access to the same storage system that the pipeline uses. Once it is downloaded it will send a message to each of the pipelines that run a model using the `process_<model>` queue name.
-- **pipeline**: this will do the actual inference of the map, it will use the map and the metadata and find all the legends and appropriate regions in the map and write the result to the output folder ready for the CDR, and send a message to the `upload` queue.
+- **icy-resin**: this will do the actual inference of the map, it will use the map and the metadata and find all the legends and appropriate regions in the map and write the result to the output folder ready for the CDR, and send a message to the `upload` queue.
- **uploader**: this will upload the processed data from the pipeline to the CDR and move the message to the `completed` queue.
- **monitor**: this is not really part of the system, but it shows the number of messages in the different queues, making it easy to track overall progress.
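The `<queue>.error` convention described above can be sketched in a few lines. This is a simplified illustration, not the actual cdrhook code; the function name and message fields are assumptions.

```
import json

def to_error_message(queue: str, message: dict, exc: Exception):
    """Return the error-queue name and the original message with the
    exception attached, following the `<queue>.error` convention."""
    error_queue = f"{queue}.error"
    payload = dict(message)           # keep the original message intact
    payload["exception"] = repr(exc)  # attach the failure reason
    return error_queue, json.dumps(payload)

# Example: a failed download message is rerouted to download.error
queue, body = to_error_message("download", {"map_id": "abc123"}, ValueError("bad tiff"))
```

Keeping the original message intact, with the exception merely attached, makes it possible to inspect the failure and requeue the message once the underlying problem is fixed.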

2 changes: 1 addition & 1 deletion cdrhook/models.json
@@ -1,3 +1,3 @@
{
-  "golden_muscat": ["map_area", "polygon_legend_area"]
+  "icy_resin": ["map_area", "polygon_legend_area"]
}
3 changes: 2 additions & 1 deletion cdrhook/server.py
@@ -26,7 +26,7 @@


auth = HTTPBasicAuth()
-cdr_url = "https://api.cdr.land"
+cdr_url = os.getenv("CDR_URL","https://api.cdr.land")

config = { }
cdr_connector = None
@@ -422,6 +422,7 @@ def create_app():
cdr_connector = CdrConnector(
system_name=os.getenv("SYSTEM_NAME"),
system_version=os.getenv("SYSTEM_VERSION"),
cdr_url=os.getenv("CDR_URL", "https://api.cdr.land"),
token=os.getenv("CDR_TOKEN"),
callback_url=os.getenv("CALLBACK_URL")+'/hook',
callback_secret=os.getenv("CALLBACK_SECRET"),
28 changes: 28 additions & 0 deletions docker-compose.example.yml
@@ -1,4 +1,32 @@
services:
# Add SSL to traefik
traefik:
command:
- --log.level=INFO
- --api=true
- --api.dashboard=true
- --api.insecure=true
# Entrypoints
- --entrypoints.http.address=:80
- --entrypoints.http.http.redirections.entryPoint.to=https
- --entrypoints.https.address=:443
- --entrypoints.https.http.tls.certresolver=myresolver
# letsencrypt
- --certificatesresolvers.myresolver.acme.email=${TRAEFIK_ACME_EMAIL}
- --certificatesresolvers.myresolver.acme.storage=/config/acme.json
# uncomment to use testing certs
#- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
- --certificatesresolvers.myresolver.acme.httpchallenge=true
- --certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=http
# Docker setup
- --providers.docker=true
- --providers.docker.endpoint=unix:///var/run/docker.sock
- --providers.docker.exposedbydefault=false
- --providers.docker.watch=true
ports:
- "80:80"
- "443:443"

cdrhook:
environment:
SYSTEM_NAME: ${SYSTEM_NAME}
Expand Down
60 changes: 30 additions & 30 deletions docker-compose.yml
@@ -4,7 +4,7 @@ services:
# REVERSE PROXY
# ----------------------------------------------------------------------
traefik:
-    image: "traefik:v2.11"
+    image: "traefik:munster"
command:
- --log.level=INFO
- --api=true
@@ -13,26 +13,19 @@
# Entrypoints
- --entrypoints.http.address=:80
- --entrypoints.http.http.redirections.entryPoint.to=https
- --entrypoints.https.address=:443
- --entrypoints.https.http.tls.certresolver=myresolver
# letsencrypt
- --certificatesresolvers.myresolver.acme.email=${TRAEFIK_ACME_EMAIL}
- --certificatesresolvers.myresolver.acme.storage=/config/acme.json
# uncomment to use testing certs
#- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
- --certificatesresolvers.myresolver.acme.httpchallenge=true
- --certificatesresolvers.myresolver.acme.httpchallenge.entrypoint=http
# Docker setup
- --providers.docker=true
- --providers.docker.endpoint=unix:///var/run/docker.sock
- --providers.docker.exposedbydefault=false
- --providers.docker.watch=true
restart: "unless-stopped"
profiles:
- traefik
- allinone
security_opt:
- no-new-privileges:true
ports:
- "80:80"
- "443:443"
volumes:
- "traefik:/config"
- "/var/run/docker.sock:/var/run/docker.sock:ro"
@@ -44,6 +37,9 @@ services:
image: rabbitmq:3.13-management
hostname: rabbitmq
restart: unless-stopped
profiles:
- cdrhook
- allinone
environment:
RABBITMQ_DEFAULT_USER: "${RABBITMQ_USERNAME:-guest}"
RABBITMQ_DEFAULT_PASS: "${RABBITMQ_PASSWORD:-guest}"
@@ -55,16 +51,19 @@
# CDR HOOK
# ----------------------------------------------------------------------
cdrhook:
-    image: ncsa/criticalmaas-cdr:latest
+    image: ncsa/criticalmaas-cdr:${CDRHOOK_VERSION:-latest}
hostname: cdrhook
build: cdrhook
restart: unless-stopped
profiles:
- cdrhook
- allinone
depends_on:
- rabbitmq
environment:
CDR_URL: "${CDR_URL}"
CDR_TOKEN: "${CDR_TOKEN}"
CDR_KEEP_EVENT: "no"
-      CALLBACK_URL: "https://${SERVER_NAME}/cdr"
+      CALLBACK_URL: "${CALLBACK_URL}"
CALLBACK_SECRET: "${CALLBACK_SECRET}"
CALLBACK_USERNAME: "${CALLBACK_USERNAME}"
CALLBACK_PASSWORD: "${CALLBACK_PASSWORD}"
@@ -80,10 +79,13 @@
# RABBITMQ MONITOR
# ----------------------------------------------------------------------
monitor:
-    image: ncsa/criticalmaas-monitor:latest
+    image: ncsa/criticalmaas-monitor:${CDRHOOK_VERSION:-latest}
hostname: monitor
build: monitor
restart: unless-stopped
profiles:
- cdrhook
- allinone
depends_on:
- rabbitmq
environment:
@@ -98,17 +100,17 @@
# DATA PROCESSING PIPELINE
# use one, or more, per model to be executed
# ----------------------------------------------------------------------
-  golden_muscat:
-    image: ncsa/criticalmaas-pipeline:latest
-    build: ../uiuc-pipeline
+  icy-resin:
+    image: ncsa/criticalmaas-pipeline:${PIPELINE_VERSION:-latest}
runtime: nvidia
restart: "unless-stopped"
profiles:
- pipeline
-    depends_on:
-      - rabbitmq
+      - allinone
environment:
NVIDIA_VISIBLE_DEVICES: all
PREFIX: ""
ipc: host
command:
- -v
- --data
@@ -123,9 +125,10 @@
- "amqp://${RABBITMQ_USERNAME}:${RABBITMQ_PASSWORD}@rabbitmq/%2F"
- --inactive_timeout
- "86000"
- --output_types
- cdr_json
- --model
-      - golden_muscat
-    restart: "unless-stopped"
+      - icy_resin
volumes:
- "data:/data"
- "logs:/logs"
@@ -136,27 +139,24 @@
# DOWNLOADER and UPLOADER
# ----------------------------------------------------------------------
downloader:
-    image: ncsa/criticalmaas-downloader:latest
-    build: uploader
+    image: ncsa/criticalmaas-downloader:${CDRHOOK_VERSION:-latest}
restart: "unless-stopped"
profiles:
- pipeline
-    depends_on:
-      - rabbitmq
+      - allinone
environment:
RABBITMQ_URI: "amqp://${RABBITMQ_USERNAME}:${RABBITMQ_PASSWORD}@rabbitmq/%2F"
volumes:
- "data:/data"

uploader:
-    image: ncsa/criticalmaas-uploader:latest
-    build: uploader
+    image: ncsa/criticalmaas-uploader:${CDRHOOK_VERSION:-latest}
restart: "unless-stopped"
profiles:
- pipeline
-    depends_on:
-      - rabbitmq
+      - allinone
environment:
CDR_URL: "${CDR_URL}"
CDR_TOKEN: "${CDR_TOKEN}"
RABBITMQ_URI: "amqp://${RABBITMQ_USERNAME}:${RABBITMQ_PASSWORD}@rabbitmq/%2F"
PREFIX: ""
