Concept
Synergy Scheduler supervises execution of multiple processes and their jobs.
Process or Worker is any system process (for example: a Python process, a Hadoop map-reduce job, etc.) meant to process or aggregate raw data.
There are two types of processes:
- cron-like jobs governed by a timer. They are known to the system as free-run
- managed jobs that are governed by a state machine. Such jobs can have multiple dependencies on other jobs
Managed processes have the following features:
- They choose a state machine to govern their execution
- They can have dependencies or serve as a dependency-provider for other processes
  i.e. you can instruct the Scheduler: wait for Process A to succeed on timeperiod T before running Process B on timeperiod T
- Garbage Collector provides a fail-over mechanism for failed or abandoned unit_of_work
- Each process has a comprehensive history of its runs and state transitions
Free-run processes have the following features:
- Lightweight: no states, shallow history of runs; designed to function as a trigger only
- Protection from overloading the target worker by tracking the status of the task execution: should the worker be busy with the ongoing task when the next trigger occurs, the Scheduler will send a reminder rather than a newly created task
- No dependencies on other processes, and they cannot serve as a dependency themselves
- No re-triggering of invalid/abandoned unit_of_work
  i.e. Garbage Collector will skip all unit_of_work that belong to a free-run process
Timeperiod represents a time window (or slice). It is encoded in the format: YYYYMMDDHH.
For instance, a timeperiod 2014011501 stands for a 1-hour slice from 01:00 (inclusive) to 02:00 (exclusive) on 15 January 2014.
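The YYYYMMDDHH encoding above can be decoded with the standard library. The following is a minimal sketch, not part of Synergy Scheduler itself; the helper name hourly_bounds is hypothetical and it assumes the timeperiod is an hourly one.

```python
from datetime import datetime, timedelta
from typing import Tuple


def hourly_bounds(timeperiod: str) -> Tuple[datetime, datetime]:
    """Return (start inclusive, end exclusive) for an hourly timeperiod.

    Hypothetical helper for illustration; assumes the YYYYMMDDHH format.
    """
    start = datetime.strptime(timeperiod, "%Y%m%d%H")
    return start, start + timedelta(hours=1)


start, end = hourly_bounds("2014011501")
print(start, end)  # 2014-01-15 01:00:00 2014-01-15 02:00:00
```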
Job is a link between the process, the timeperiod and the state.
For instance, for some given process site_statistics and a timeperiod 2014011501, the job is responsible for tracking the state of a 1-hour slice of raw data processing into site statistics.
Examples of job states: STATE_PROCESSED, STATE_REQUESTED, etc.
Task or unit_of_work or UOW is an attempt to perform a Job.
Time qualifier is a class of a timeperiod. Its possible values are:
- hourly
- daily
- monthly
- yearly
For illustration purposes, let's assume that the Synergy Scheduler supervises a system that gathers and processes user's behavior on a web site. In this context:
- An hourly timeperiod represents data gathered within one hour, such that period from 10:00:00 of 1 of Jan 2011 till 10:59:59 of 1 of Jan 2011 represents one hourly period.
  Notation of this timeperiod is: 2011010110
- Data gathered from 00:00:00 of 1 of Jan 2011 till 23:59:59 of 1 of Jan 2011 represents daily period.
  Notation of this timeperiod is: 2011010100
- Data gathered from 00:00:00 of 1 of Jan 2011 till 23:59:59 of 31 of Jan 2011 represents monthly period.
  Notation of this timeperiod is: 2011010000
- All-year statistics result in a yearly period
  Notation of this timeperiod is: 2011000000
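The notations listed above follow one pattern: positions finer than the time qualifier are zero-filled. A small sketch of that rule is shown here; the names QUALIFIER_FORMATS and to_timeperiod are assumptions made for illustration and are not taken from the Synergy Scheduler code base.

```python
from datetime import datetime

# Hypothetical mapping: each time qualifier keeps its significant
# date parts and pads the remaining positions with zeros.
QUALIFIER_FORMATS = {
    "hourly": "%Y%m%d%H",
    "daily": "%Y%m%d00",
    "monthly": "%Y%m0000",
    "yearly": "%Y000000",
}


def to_timeperiod(moment: datetime, qualifier: str) -> str:
    """Encode a moment as a timeperiod string of the given qualifier."""
    return moment.strftime(QUALIFIER_FORMATS[qualifier])


moment = datetime(2011, 1, 1, 10, 30)
for qualifier in ("hourly", "daily", "monthly", "yearly"):
    print(qualifier, to_timeperiod(moment, qualifier))
```

For 10:30 on 1 Jan 2011 this prints 2011010110, 2011010100, 2011010000 and 2011000000, matching the notations in the list.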
Synergy Scheduler organizes timeperiods in tree-like structures.
root <- yearly periods <- monthly periods <- daily periods <- hourly periods
Each level of the tree can be considered complete only if all nested timeperiods are in the STATE_PROCESSED or STATE_SKIPPED state.
For example: since a daily period nests 24 hourly periods, all of them must complete before the daily period can be declared complete.
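The completeness rule can be sketched as a simple check over the 24 nested hourly timeperiods. This is an illustrative sketch only: the state values are placeholder strings, and hourly_children, is_daily_complete and the job_states dictionary are hypothetical names, not the Scheduler's actual data model.

```python
# Placeholder values; the real constants live in Synergy Scheduler.
STATE_PROCESSED = "STATE_PROCESSED"
STATE_SKIPPED = "STATE_SKIPPED"
STATE_REQUESTED = "STATE_REQUESTED"

COMPLETE_STATES = {STATE_PROCESSED, STATE_SKIPPED}


def hourly_children(daily_timeperiod: str) -> list:
    """List the 24 hourly timeperiods nested in a daily timeperiod."""
    day = daily_timeperiod[:8]  # YYYYMMDD part
    return [f"{day}{hour:02d}" for hour in range(24)]


def is_daily_complete(daily_timeperiod: str, job_states: dict) -> bool:
    """True only if every nested hourly job is processed or skipped.

    job_states maps hourly timeperiod -> job state; a missing entry
    counts as not complete.
    """
    return all(
        job_states.get(tp) in COMPLETE_STATES
        for tp in hourly_children(daily_timeperiod)
    )


states = {tp: STATE_PROCESSED for tp in hourly_children("2011010100")}
states["2011010105"] = STATE_SKIPPED
print(is_daily_complete("2011010100", states))  # True
states["2011010106"] = STATE_REQUESTED
print(is_daily_complete("2011010100", states))  # False
```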
Trees can have the following number of levels:
- 4-level tree hosts {yearly <- monthly <- daily <- hourly} timeperiods
- 3-level tree hosts {yearly <- monthly <- daily} or {monthly <- daily <- hourly} timeperiods
- 2-level tree hosts virtual "root" level to maintain tree-like structure for either hourly, daily or monthly: {root <- monthly} or {root <- daily} or {root <- hourly}
Trees are characterized by downward dependency: yearly periods depend on monthly; monthly depends on daily; daily depends on hourly.
Each level in the tree is managed by a designated process. For example: <site> hourly period statistics by "site_hourly_aggregator", <site> daily period statistics - by "site_daily_aggregator", etc.
It is common for trees to have dependencies.
For example: to calculate Revenue Per Click, we need two numbers: number_of_clicks from <site tree> and revenue from <financial tree>.
Both numbers are required to compute Revenue Per Click = revenue / number_of_clicks. Thus, tree <financial post-processing> will depend on both <site tree> and <financial tree>.
Dependencies are registered in the context.py block defining the tree. They are time qualifier-dependent, such that daily timeperiods from <site tree> can depend on daily timeperiods from <financial tree>. Consequently, hourly timeperiods of tree A cannot block daily timeperiods from dependent tree B, as they belong to different time-aggregation classes.
Dependencies can be of the following types:
- blocking_dependencies: any processing of dependent timeperiods is blocked until blocking timeperiods are processed.
  Interesting use-case is when one of blocking timeperiods is in STATE_SKIPPED. In this case, dependent timeperiod is also moved to STATE_SKIPPED
- blocking_children: any processing of higher time granularity is blocked until all nested children timeperiods are processed.
- blocking_normal: dependency allows processing of the dependent timeperiod, however finalization of the dependent timeperiod is not allowed unless blocking timeperiods are processed
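
To make the difference between blocking_dependencies and blocking_normal concrete, here is a rough sketch of the decision each one implies for a dependent timeperiod. It only paraphrases the descriptions above; the function names, the returned action strings and the placeholder state values are all assumptions and do not reflect the Scheduler's real implementation.

```python
# Placeholder values; the real constants live in Synergy Scheduler.
STATE_PROCESSED = "STATE_PROCESSED"
STATE_SKIPPED = "STATE_SKIPPED"
STATE_REQUESTED = "STATE_REQUESTED"


def resolve_blocking_dependencies(blocking_states: list) -> str:
    """Hypothetical decision for a dependent timeperiod.

    - "skip":    at least one blocking timeperiod is skipped
    - "process": all blocking timeperiods are processed
    - "wait":    otherwise, processing stays blocked
    """
    if any(state == STATE_SKIPPED for state in blocking_states):
        return "skip"
    if all(state == STATE_PROCESSED for state in blocking_states):
        return "process"
    return "wait"


def can_finalize_blocking_normal(blocking_states: list) -> bool:
    """Under blocking_normal, processing is always allowed, but
    finalization requires all blocking timeperiods to be processed."""
    return all(state == STATE_PROCESSED for state in blocking_states)


print(resolve_blocking_dependencies([STATE_PROCESSED, STATE_REQUESTED]))  # wait
print(resolve_blocking_dependencies([STATE_PROCESSED, STATE_SKIPPED]))    # skip
print(can_finalize_blocking_normal([STATE_PROCESSED, STATE_PROCESSED]))   # True
```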