Dan Mushkevych edited this page Mar 6, 2018 · 22 revisions

Synergy Scheduler supervises execution of multiple processes and their jobs.

Process

A Process (or Worker) is any system process (for example, a Python process or a Hadoop map-reduce job) meant to convert or aggregate raw data into a formatted form.

There are two types of processes:

  • cron-like jobs governed by a timer. They are known to the system as free-run
  • managed jobs governed by a state machine. Such jobs can have multiple dependencies on other jobs

Managed processes have the following features:

  • They choose a state machine to govern their execution
  • They can have dependencies or serve as a dependency provider for other processes
    e.g. you can instruct the Scheduler: wait for Process A to succeed on timeperiod T before running Process B on T
  • The Garbage Collector provides a fail-over mechanism for every failed or abandoned unit_of_work
  • Each process has a comprehensive history of its runs and state transitions

Free-run processes have the following features:

  • Lightweight: no states and only a shallow history of runs; designed to function as a trigger only
  • Protection from overloading the target worker by tracking the status of the task execution: if the worker is still busy with an ongoing task when the next trigger occurs, the Scheduler sends a reminder rather than a newly created task
  • No dependencies on other processes, and they cannot serve as a dependency themselves
  • No re-triggering of invalid/abandoned unit_of_work
    i.e. the Garbage Collector skips every unit_of_work that belongs to a free-run process
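
The overload protection described above can be sketched as follows. This is a minimal illustration only: the class FreeRunTrigger, its methods and the returned tuples are hypothetical names and do not reflect the actual Synergy Scheduler code.

```python
class FreeRunTrigger:
    """Hypothetical sketch of a free-run trigger with overload protection."""

    def __init__(self):
        # Identifier of the task the worker is currently busy with, if any
        self.active_task = None

    def on_timer(self, new_task_id):
        """Called whenever the timer fires for this free-run process."""
        if self.active_task is not None:
            # Worker is still busy: remind it about the ongoing task
            return ('reminder', self.active_task)
        # Worker is idle: hand over a newly created task
        self.active_task = new_task_id
        return ('new_task', new_task_id)

    def on_task_finished(self):
        """Called when the worker reports that the ongoing task is done."""
        self.active_task = None
```

In this sketch a second timer event that arrives before on_task_finished() is called yields a reminder for the first task instead of creating a second one.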

Timeperiod

A timeperiod represents a time window (or slice). It is encoded in the format YYYYMMDDHH.
For instance, the timeperiod 2014011501 stands for the 1-hour slice from 01:00 (inclusive) to 02:00 (exclusive) on 15 January 2014.
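
To make the encoding concrete, here is a small sketch that decodes an hourly timeperiod into its time window using only the Python standard library. The function name hourly_window is an assumption made for this example and is not part of the Synergy Scheduler API.

```python
from datetime import datetime, timedelta
from typing import Tuple


def hourly_window(timeperiod: str) -> Tuple[datetime, datetime]:
    """Return the (start, end) datetimes of an hourly YYYYMMDDHH timeperiod.

    The start is inclusive and the end is exclusive.
    """
    start = datetime.strptime(timeperiod, '%Y%m%d%H')
    return start, start + timedelta(hours=1)


start, end = hourly_window('2014011501')
print(start)  # 2014-01-15 01:00:00
print(end)    # 2014-01-15 02:00:00
```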

Job

A Job links a process to a timeperiod and tracks the state of that combination.
For instance, for a given process site_statistics and a timeperiod 2014011501, the job is responsible for tracking the state of the data conversion from that 1-hour slice into site statistics.

Task

A Task (also called unit_of_work or UOW) is a single attempt to perform a Job.
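
The relationship between a Job and its Tasks can be pictured with the data-structure sketch below. The class and field names (Job, UnitOfWork, attempts, new_attempt) are hypothetical and chosen for illustration; the actual Synergy Scheduler models may look different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnitOfWork:
    """One attempt to perform a job (hypothetical model)."""
    attempt: int
    succeeded: bool = False


@dataclass
class Job:
    """Links a process to a timeperiod and tracks all attempts (hypothetical model)."""
    process_name: str
    timeperiod: str
    attempts: List[UnitOfWork] = field(default_factory=list)

    def new_attempt(self) -> UnitOfWork:
        # Every retry of the same job is recorded as a separate unit of work
        uow = UnitOfWork(attempt=len(self.attempts) + 1)
        self.attempts.append(uow)
        return uow


job = Job(process_name='site_statistics', timeperiod='2014011501')
job.new_attempt()                   # first attempt, assumed here to fail
job.new_attempt().succeeded = True  # second attempt succeeds
```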

Time qualifier

A time qualifier is the class (granularity) of a timeperiod. Its possible values are:

  • hourly
  • daily
  • monthly
  • yearly

For illustration purposes, let's assume that the Synergy Scheduler supervises a system that gathers and processes user behaviour on a web site. In this context:

  • An hourly timeperiod represents data gathered within one hour: the period from 10:00:00 to 10:59:59 on 1 Jan 2011 is one hourly period.
    Notation of this timeperiod is: 2011010110
  • Data gathered from 00:00:00 to 23:59:59 on 1 Jan 2011 represents a daily period.
    Notation of this timeperiod is: 2011010100
  • Data gathered from 00:00:00 on 1 Jan 2011 to 23:59:59 on 31 Jan 2011 represents a monthly period.
    Notation of this timeperiod is: 2011010000
  • All-year statistics result in a yearly period.
    Notation of this timeperiod is: 2011000000
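
The notations above follow one rule: positions finer than the qualifier are filled with zeros. A minimal sketch of that rule, assuming a hypothetical helper to_timeperiod and a hypothetical QUALIFIER_PATTERNS table (neither is taken from the Synergy Scheduler code base):

```python
from datetime import datetime

# strftime pattern per time qualifier; positions finer than the
# qualifier are written as literal zeros
QUALIFIER_PATTERNS = {
    'hourly': '%Y%m%d%H',
    'daily': '%Y%m%d00',
    'monthly': '%Y%m0000',
    'yearly': '%Y000000',
}


def to_timeperiod(moment: datetime, qualifier: str) -> str:
    """Encode a datetime as a 10-character timeperiod of the given qualifier."""
    return moment.strftime(QUALIFIER_PATTERNS[qualifier])


moment = datetime(2011, 1, 1, 10, 30)
print(to_timeperiod(moment, 'hourly'))   # 2011010110
print(to_timeperiod(moment, 'daily'))    # 2011010100
print(to_timeperiod(moment, 'monthly'))  # 2011010000
print(to_timeperiod(moment, 'yearly'))   # 2011000000
```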

Timetable and trees

Synergy Scheduler organizes timeperiods in tree-like structures.
root <- yearly periods <- monthly periods <- daily periods <- hourly periods

A timeperiod at any level of the tree is considered complete only if all of its nested timeperiods are in STATE_PROCESSED or STATE_SKIPPED.

For example, since a daily period nests 24 hourly periods, all 24 of them must be complete before the daily period can be declared complete.
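
The completeness rule for a daily period can be sketched as follows. The state names STATE_PROCESSED and STATE_SKIPPED come from the text above; their string values, the helper functions and the job_states dictionary are assumptions made for this illustration.

```python
# Assumed string values; only the constant names appear in the text above
STATE_PROCESSED = 'state_processed'
STATE_SKIPPED = 'state_skipped'


def hourly_children(daily_timeperiod):
    """List the 24 hourly timeperiods nested in a daily one (YYYYMMDD00)."""
    day = daily_timeperiod[:8]
    return ['%s%02d' % (day, hour) for hour in range(24)]


def is_complete(daily_timeperiod, job_states):
    """True only if every nested hourly period is processed or skipped.

    job_states maps an hourly timeperiod to its state; a missing entry
    counts as not complete.
    """
    finished = {STATE_PROCESSED, STATE_SKIPPED}
    return all(job_states.get(tp) in finished
               for tp in hourly_children(daily_timeperiod))
```

A single hourly period in any other state, or one with no recorded state at all, keeps the daily period from being declared complete.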

Trees can have the following number of levels:

  • 4-level tree, hosts yearly, monthly, daily and hourly timeperiods
  • 3-level tree, hosts yearly, monthly and daily timeperiods
  • 2-level tree, hosts timeperiods (either hourly, daily or monthly) and virtual "root" level to maintain tree-like structure

The trees above imply a downward dependency: yearly periods depend on monthly ones, monthly periods on daily ones, and daily periods on hourly ones.

Trees & Processes

Each level in the tree is managed by a designated process. For example, <site> hourly period statistics are managed by "site_hourly_aggregator", <site> daily period statistics by "site_daily_aggregator", etc.

Dependencies between trees

It is common for trees to have dependencies.
For example: to calculate Revenue Per Click, we need two numbers: number_of_clicks from <site tree> and revenue from <financial tree>.
Both numbers are required to compute Revenue Per Click = revenue / number_of_clicks. Thus, tree <financial post-processing> will depend on both <site tree> and <financial tree>.

Dependencies are registered in the context.py block defining the tree. They are specific to a time qualifier: for example, daily timeperiods from <site tree> can depend on daily timeperiods from <financial tree>. Consequently, hourly timeperiods of tree A cannot block daily timeperiods of a dependent tree B, as they belong to different time-aggregation classes.

Dependencies can be of the following types:

  • blocking_dependencies: any processing of the dependent timeperiods is blocked until the blocking timeperiods are processed.
    An interesting use case is when one of the blocking timeperiods is in STATE_SKIPPED. In this case, the dependent timeperiod is also moved to STATE_SKIPPED
  • blocking_children: any processing of a parent timeperiod is blocked until all of its nested child timeperiods are processed.
  • blocking_normal: processing of the dependent timeperiod is allowed, but its finalization is not allowed until the blocking timeperiods are processed
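
The difference between blocking_dependencies and blocking_normal can be sketched as a small decision function. This is an illustration of the rules as worded above, not the Scheduler's implementation: the Decision tuple, the resolve function and the state string values are hypothetical, and blocking_children is left out because it concerns levels within a single tree.

```python
from collections import namedtuple

# Assumed string values; only the constant names appear in the text above
STATE_PROCESSED = 'state_processed'
STATE_SKIPPED = 'state_skipped'

# What the dependent timeperiod is allowed to do (hypothetical structure)
Decision = namedtuple('Decision', ['can_process', 'can_finalize', 'should_skip'])


def resolve(dependency_type, blocking_states):
    """Decide the fate of a dependent timeperiod from its blockers' states."""
    all_processed = all(s == STATE_PROCESSED for s in blocking_states)
    any_skipped = any(s == STATE_SKIPPED for s in blocking_states)

    if dependency_type == 'blocking_dependencies':
        if any_skipped:
            # A skipped blocker moves the dependent timeperiod to STATE_SKIPPED
            return Decision(False, False, True)
        # No processing at all until every blocker is processed
        return Decision(all_processed, all_processed, False)

    if dependency_type == 'blocking_normal':
        # Processing may start right away; only finalization has to wait
        return Decision(True, all_processed, False)

    raise ValueError('unsupported dependency type: %s' % dependency_type)
```

For example, with one blocker still in progress, resolve('blocking_normal', ...) permits processing but not finalization, while resolve('blocking_dependencies', ...) permits neither.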