mushkevych edited this page Dec 3, 2011 · 22 revisions

Scheduler is a state machine that works with the concept of a timeperiod.

Timeperiods

A timeperiod can be one of:

  • hourly
  • daily
  • monthly
  • yearly

For example:

  • Statistics (number of page views, number of unique visitors, etc.) for a particular domain a.com, covering 10:00:00 to 10:59:59 on 1 Jan 2011, comprise an hourly timeperiod.
  • Statistics aggregated from 00:00:00 to 23:59:59 on 1 Jan 2011 comprise a daily timeperiod.
  • Statistics aggregated from 00:00:00 on 1 Jan 2011 to 23:59:59 on 31 Jan 2011 comprise a monthly timeperiod.
  • Statistics for the whole year comprise a yearly timeperiod.
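
The boundaries in the examples above can be computed mechanically. The following is a minimal sketch, not the Scheduler's own code: the function name `period_bounds` and the granularity strings are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def period_bounds(moment, granularity):
    """Return (first_second, last_second) of the timeperiod containing `moment`.

    `granularity` is one of "hourly", "daily", "monthly", "yearly"
    (hypothetical labels, chosen to match the list above).
    """
    if granularity == "hourly":
        start = moment.replace(minute=0, second=0, microsecond=0)
        next_start = start + timedelta(hours=1)
    elif granularity == "daily":
        start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
        next_start = start + timedelta(days=1)
    elif granularity == "monthly":
        start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        if start.month == 12:
            next_start = start.replace(year=start.year + 1, month=1)
        else:
            next_start = start.replace(month=start.month + 1)
    elif granularity == "yearly":
        start = moment.replace(month=1, day=1, hour=0, minute=0,
                               second=0, microsecond=0)
        next_start = start.replace(year=start.year + 1)
    else:
        raise ValueError("unknown granularity: %r" % granularity)
    # The period ends one second before the next period starts.
    return start, next_start - timedelta(seconds=1)


if __name__ == "__main__":
    print(period_bounds(datetime(2011, 1, 1, 10, 30), "hourly"))
    print(period_bounds(datetime(2011, 1, 1, 10, 30), "monthly"))
```

Computing the end as "one second before the next period" avoids hard-coding month lengths and leap years.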

Verticals

Scheduler organizes timeperiods in tree-like structures: root <- yearly periods <- monthly periods <- daily periods <- hourly periods

A level of the tree is considered complete only if all of its nested timeperiods are in the STATE_PROCESSED or STATE_SKIPPED state.

For example, since a daily period nests 24 hourly periods, all 24 of them must complete before the daily period can be declared complete.
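
The completion rule can be sketched as a simple check over the states of the nested timeperiods. This is an illustrative sketch only: the state names STATE_PROCESSED and STATE_SKIPPED come from the text, while their string values, the extra STATE_IN_PROGRESS state and the function `is_complete` are assumptions.

```python
# State names from the text; the string values are assumed for illustration.
STATE_PROCESSED = "state_processed"
STATE_SKIPPED = "state_skipped"
# Hypothetical non-final state, used only to show a negative case.
STATE_IN_PROGRESS = "state_in_progress"

FINAL_STATES = {STATE_PROCESSED, STATE_SKIPPED}


def is_complete(child_states, expected_children):
    """A parent timeperiod is complete only when every expected child
    exists and is in one of the final states."""
    if len(child_states) != expected_children:
        return False  # some nested timeperiods have not been seen yet
    return all(state in FINAL_STATES for state in child_states)


if __name__ == "__main__":
    hours = [STATE_PROCESSED] * 23 + [STATE_SKIPPED]
    print(is_complete(hours, 24))
    hours[5] = STATE_IN_PROGRESS
    print(is_complete(hours, 24))
```

Passing the expected number of children explicitly guards against declaring a day complete when only some of its hours have been registered.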

Trees can have the following number of levels:

  • A 4-level tree hosts yearly, monthly, daily and hourly timeperiods
  • A 3-level tree hosts yearly, monthly and daily timeperiods
  • A 2-level tree hosts daily timeperiods only, plus a virtual "root" level to keep the tree-like structure

The tree presented above represents a vertical. The term vertical underlines the downward dependency: yearly periods depend on monthly periods, monthly periods on daily periods, and daily periods on hourly periods.
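
To make the nesting concrete, the sketch below lists the nodes on the path from the root down to the deepest timeperiod for each tree shape. The key formats (such as "2011010110" for an hour) and the names `TREE_LEVELS` and `path_from_root` are assumptions for illustration, not the Scheduler's actual representation.

```python
from datetime import datetime

# Hypothetical key format per level; the real Scheduler may encode keys differently.
LEVEL_FORMATS = {
    "yearly": "%Y",
    "monthly": "%Y%m",
    "daily": "%Y%m%d",
    "hourly": "%Y%m%d%H",
}

# Levels hosted by each tree shape, as listed in the text.
TREE_LEVELS = {
    4: ["yearly", "monthly", "daily", "hourly"],
    3: ["yearly", "monthly", "daily"],
    2: ["daily"],  # daily timeperiods under a virtual root
}


def path_from_root(moment, level_count):
    """Return the keys from the root down to the deepest timeperiod
    that contains `moment` in a tree with `level_count` levels."""
    levels = TREE_LEVELS[level_count]
    return ["root"] + [moment.strftime(LEVEL_FORMATS[name]) for name in levels]


if __name__ == "__main__":
    moment = datetime(2011, 1, 1, 10)
    for count in (4, 3, 2):
        print(count, path_from_root(moment, count))
```

Reading a path from right to left gives the dependency order of the vertical: each key can be finalized only after the key to its right.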

Processes & Timeperiods

Each level in the tree is supposed to be managed by a separate process/aggregator. For example, site hourly statistics are managed by "site_hourly_aggregator", site daily statistics by "site_daily_aggregator", etc.
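
The level-to-process mapping can be pictured as a small lookup table. In this sketch only "site_hourly_aggregator" and "site_daily_aggregator" are taken from the text; the monthly and yearly names, and the helper `aggregator_name`, are extrapolated assumptions.

```python
def aggregator_name(vertical, level):
    """Build a process name following the pattern seen in the text,
    e.g. ("site", "hourly") -> "site_hourly_aggregator"."""
    return "%s_%s_aggregator" % (vertical, level)


# One managing process per tree level of a hypothetical 4-level "site" vertical.
# The monthly and yearly entries are assumed by analogy with the two named ones.
SITE_PROCESSES = {
    level: aggregator_name("site", level)
    for level in ("hourly", "daily", "monthly", "yearly")
}

if __name__ == "__main__":
    for level, process in SITE_PROCESSES.items():
        print(level, "->", process)
```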

Dependencies between verticals