Skip to content

Latest commit

 

History

History
51 lines (44 loc) · 2.15 KB

ROADMAP.md

File metadata and controls

51 lines (44 loc) · 2.15 KB

Waterwheel Roadmap

High level overview of things to be done. Not necessarily in order, or committed.

Features

  • show task logs in the web UI
    • logs are stored in Redis streams and sent to the UI over a websocket
    • (currently only implemented for Docker workers)
    • display logs after job completion
    • tail logs of a running job
  • overview interface to show recent job runs (like the box view of Airflow but less ugly)
    • plenty more improvements to be made here
  • APIs for activating tasks based on query criteria (eg. past/future)
  • keep history of jobs runs, record task attempts
    • need to consider the data model to get this right
  • built in task retries - you can do this currently with cyclic graphs ;)
  • job concurrency - limit backfills from flooding the queue
  • task value stash
    • to replace Airflow's xcom, variables and connections
    • most likely needs to be an HTTP API exposed to each container
  • task routing - send tasks to specific workers to support workers running on "privileged" hardware
    • maybe just based on projects, or maybe fully custom (with separate ACLs to control it)
  • ACLs
    • Web UI logins and edit/view permissions
    • API permissions - CRUD operations
    • All authorization decisions are evaluated by an OPA server
    • Authentication should be provided by a proxy (such as Oathkeeper or SealProxy)

Operational

  • emit metrics to statsd
  • better control over server and worker logs (send them to fluentd/Vector too?)
  • High Availability
    • separate the server from the web interface
      • update messages are sent from the api to the scheduler over AMQP
    • scheduler is stateless, but has in-memory caches, verify and test
    • HA mode for the scheduler
      • active/active - triggers are allocated to schedulers using Rendezvous hashing. Schedulers form a cluster using a gossip protocol.

Bug Fixes / Paper Cuts

  • task backfills with cross-job dependencies
    • need to check the cross-job tasks for status and possibly trigger tasks