mushkevych edited this page Oct 3, 2014 · 7 revisions

Scheduler is a production-grade job scheduling system. Here, the term "job" refers to any system process (for example, a Python process or a Hadoop map-reduce job) that is started and monitored by the Scheduler.

There are two types of processes:

  • cron-like jobs governed by a timer. They are known to the system as free-run
  • managed jobs that are governed by a state machine. Such jobs can have multiple dependencies on other jobs

Free-run processes have the following features:

  • no dependencies on other processes, and they cannot serve as a dependency themselves, i.e. you can't tell the Scheduler to wait for Process A to succeed on timeperiod T before running Process B on T if either of the two is a free-run process
  • no re-triggering of invalid/abandoned unit_of_work, i.e. GarbageCollector will skip every unit_of_work that belongs to a free-run process
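
The dependency restriction for free-run processes can be illustrated with a minimal sketch. The names `ProcessType`, `Process`, and `add_dependency` are hypothetical and chosen for illustration only; they are not taken from the Scheduler's code base.

```python
from dataclasses import dataclass, field
from enum import Enum


class ProcessType(Enum):
    FREERUN = "freerun"   # cron-like, governed by a timer
    MANAGED = "managed"   # governed by a state machine


@dataclass
class Process:
    # Hypothetical model of a scheduled process (illustration only)
    name: str
    process_type: ProcessType
    depends_on: list = field(default_factory=list)


def add_dependency(dependent: Process, provider: Process) -> None:
    """Record that `dependent` waits for `provider` on the same timeperiod.

    Rejects the link if either side is a free-run process.
    """
    for process in (dependent, provider):
        if process.process_type is ProcessType.FREERUN:
            raise ValueError(
                f"free-run process {process.name!r} cannot take part in a dependency"
            )
    dependent.depends_on.append(provider.name)
```

Under these assumptions, linking two managed processes succeeds, while any link that involves a free-run process raises `ValueError`.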

Managed processes have the following features:

  • they choose a state machine to govern their execution
  • they can have dependencies or serve as a dependency provider for other processes
  • GarbageCollector provides a fail-over mechanism for failed and abandoned unit_of_work
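
As a rough illustration of how the fail-over rule differs between the two process types, here is a minimal sketch. The `UnitOfWork` fields, the state names, and the `collect_garbage` function are assumptions made for this example and do not reflect the Scheduler's actual implementation; in particular, "re-triggering" is modeled here simply as resetting the state to a hypothetical `requested` value.

```python
from dataclasses import dataclass

# Hypothetical state names, for illustration only
STATE_REQUESTED = "requested"
STATE_PROCESSED = "processed"
STATE_FAILED = "failed"
STATE_ABANDONED = "abandoned"


@dataclass
class UnitOfWork:
    # Hypothetical record of one run of a process on one timeperiod
    process_name: str
    timeperiod: str
    state: str
    is_freerun: bool


def collect_garbage(units):
    """Re-queue failed or abandoned units of managed processes.

    Units that belong to free-run processes are skipped.
    Returns the list of units that were re-queued.
    """
    retriggered = []
    for uow in units:
        if uow.is_freerun:
            continue  # free-run units are never re-triggered
        if uow.state in (STATE_FAILED, STATE_ABANDONED):
            uow.state = STATE_REQUESTED
            retriggered.append(uow)
    return retriggered
```

In this sketch, a failed unit of a managed process is put back into the `requested` state, while a failed unit of a free-run process is left untouched.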