Domains

Erwin Walraven edited this page Dec 10, 2018 · 27 revisions

The toolbox has several built-in problem domains, which can be obtained using a CMDPInstanceGenerator or CPOMDPInstanceGenerator in the domains package. Each generator provides a method getInstance(int numAgents, int numDecisions), which returns a CMDPInstance or CPOMDPInstance with the given number of agents and decisions. This instance also defines the resource limits, which constrain the behavior of the agents. Below we first discuss problem instances, after which we introduce the domains that are part of the toolbox.

Problem instances

A problem instance defines a constrained planning problem that can be solved by an algorithm. The toolbox provides two objects that can be used to model problem instances: CMDPInstance and CPOMDPInstance. We discuss both variants below in more detail.

A CMDPInstance object contains a CMDP model for each agent, and it defines the type of resource constraints with the associated resource limits. The CPOMDPInstance object defines the same properties for planning problems with partial observability.

Problem instances can be obtained using a CMDPInstanceGenerator or CPOMDPInstanceGenerator by calling the getInstance(int numAgents, int numDecisions) method, which returns the corresponding instance object. The number of decisions represents the horizon of the planning problem. For each included domain our toolbox provides a generator.

Important remark: the generators automatically set the resource limits, depending on the desired number of agents and the number of decisions. However, if these parameters are set too low, the resource limits may not actually constrain the behavior of the agents. For example, in some domains it is better to avoid resource consumption completely if only a few sequential decisions are made. As a general rule, we recommend choosing the parameters of the instance by exploring a few different options.

Instance generator

Domain descriptions

Several domains have been integrated in the toolbox already. A brief description of the domains is provided below, including references to the literature which either uses or describes the domain.

In the domain descriptions we refer to two types of resource constraints. Budget constraints can be used for non-renewable resources that are shared across all time steps. For example, these constraints are suitable for problems in which actions require a monetary investment while only a finite budget is available during plan execution. Instantaneous constraints can be used for resources which are renewable, such as a hammer that can be used during every time step. In this case the usage of a hammer does not affect its availability in subsequent time steps.
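The difference between the two constraint types can be illustrated with a small sketch. It is not part of the toolbox: the class and method names are hypothetical, and it checks a single fixed sequence of resource consumption rather than a policy. A budget constraint bounds the total consumption over all time steps, whereas an instantaneous constraint bounds the consumption at each individual time step.

```java
import java.util.Arrays;

// Hypothetical helper, not a toolbox class: checks one fixed consumption sequence.
class ConstraintTypes {

    // Budget constraint: the total consumption over all time steps may not exceed the budget.
    static boolean satisfiesBudget(double[] consumptionPerStep, double budget) {
        return Arrays.stream(consumptionPerStep).sum() <= budget;
    }

    // Instantaneous constraint: the consumption at every single time step may not exceed the limit.
    static boolean satisfiesInstantaneous(double[] consumptionPerStep, double limitPerStep) {
        return Arrays.stream(consumptionPerStep).allMatch(c -> c <= limitPerStep);
    }

    public static void main(String[] args) {
        double[] consumption = {2.0, 0.0, 3.0, 1.0};

        // Total consumption is 6.0, which exceeds a budget of 5.0.
        System.out.println(satisfiesBudget(consumption, 5.0));        // false

        // No single time step consumes more than 3.0.
        System.out.println(satisfiesInstantaneous(consumption, 3.0)); // true
    }
}
```

The same consumption sequence can therefore violate a budget constraint while satisfying an instantaneous constraint, and the other way around.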

Online advertising

Online advertising involves presenting advertisements to users that browse the internet in such a way that they become interested in, e.g., buying a product in a webshop. If there is only a limited amount of money available for advertising, then it is required to decide how this budget is spent in order to maximize revenue. Each user browsing the internet is modeled as a Markov Decision Process in which states represent the level of interest of the user and actions represent the advertisements that can be shown to the user. Each action has a cost associated with it, corresponding to the amount of money that is required to show the advertisement. The global budget imposes a constraint on the advertisements that can be shown to the users.

Model: MDP

Type of constraints: budget

Instance generator: domains.advertising.AdvertisingInstanceGenerator

Literature: Boutilier, C., & Lu, T. (2016). Budget Allocation using Weakly Coupled, Constrained Markov Decision Processes. In Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (pp. 823–830).

De Nijs, F., Walraven, E., De Weerdt, M. M., & Spaan, M. T. J. (2017). Bounding the Probability of Resource Constraint Violations in Multi-Agent MDPs. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (pp. 3562–3568).

Mars rovers: Maze

In the Maze domain it is required to assign a limited set of tools to Mars rovers. These tools are required to perform research tasks, and the assignment of tools to rovers influences the total value of the research tasks performed. The planner needs to decide how the tools are assigned, such that the expected value of the research tasks is maximized.

Model: MDP

Type of constraints: instantaneous

Instance generator: domains.maze.MazeInstanceGenerator

Literature: Wu, J., & Durfee, E. H. (2010). Resource-Driven Mission-Phasing Techniques for Constrained Agents in Stochastic Environments. Journal of Artificial Intelligence Research, 38, 415–473.

De Nijs, F., Walraven, E., De Weerdt, M. M., & Spaan, M. T. J. (2017). Bounding the Probability of Resource Constraint Violations in Multi-Agent MDPs. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (pp. 3562–3568).

Thermostatically Controlled Loads (TCL)

A Thermostatically Controlled Load (TCL) is a load in a power grid that is controlled autonomously using a thermostat. For example, heating systems in a house are controlled by a thermostat in order to ensure that the room temperature is close to a given setpoint. Since the temperature in a room decreases over time if it is cold outside, the thermostat needs to activate the heating multiple times a day to ensure that the temperature remains close to the setpoint. If there are multiple houses connected to the same power line, then the capacity limit of this line imposes constraints on the behavior of the thermostats. For example, activating all heating systems in a street at the same time may lead to more power consumption than the line can accommodate, which should be prevented at all times. In this domain each thermostat is represented by a Markov Decision Process in which the actions correspond to activating and deactivating the heating, and the states represent the room temperature. The domain comes in two versions: one with constant resource limits and one with multi-level resource limits that depend on time.
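As a rough illustration of the capacity constraint, the sketch below checks whether a joint heating schedule stays within a limit at every time step. It is not toolbox code: the class and method names are hypothetical, and it assumes that each active heater consumes exactly one unit of power. A constant limit corresponds to an array with equal entries, while a time-dependent limit uses different entries per time step.

```java
// Hypothetical helper, not a toolbox class. Assumes each active heater uses one unit of power.
class TclCapacityCheck {

    // heatingOn[t][i] is true if thermostat i activates its heating at time step t.
    // limitPerStep[t] is the maximum number of heaters that may be active at time step t.
    static boolean withinCapacity(boolean[][] heatingOn, int[] limitPerStep) {
        for (int t = 0; t < heatingOn.length; t++) {
            int active = 0;
            for (boolean on : heatingOn[t]) {
                if (on) {
                    active++;
                }
            }
            if (active > limitPerStep[t]) {
                return false; // the line capacity is exceeded at time step t
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean[][] schedule = {
            {true, true, false},   // step 0: two heaters active
            {false, true, false},  // step 1: one heater active
        };
        System.out.println(withinCapacity(schedule, new int[] {2, 2})); // constant limit
        System.out.println(withinCapacity(schedule, new int[] {1, 2})); // time-dependent limit
    }
}
```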

Model: MDP

Type of constraints: instantaneous

Instance generator: domains.tcl.TCLInstanceGeneratorFixedLimit and domains.tcl.TCLInstanceGeneratorMultiLevel

Literature: De Nijs, F., Spaan, M. T. J., & De Weerdt, M. M. (2015). Best-Response Planning of Thermostatically Controlled Loads under Power Constraints. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (pp. 615–621).

De Nijs, F., Walraven, E., De Weerdt, M. M., & Spaan, M. T. J. (2017). Bounding the Probability of Resource Constraint Violations in Multi-Agent MDPs. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (pp. 3562–3568).

WebAd

The web-ad domain is based on the same principle as the online advertising domain described earlier, but it includes partial observability of the level of interest of the user. As a result, the decision maker needs to maintain a belief regarding the level of interest of the users, rather than observing the level of interest directly. The original POMDP file corresponding to the domain can be found here.

Model: POMDP

Type of constraints: budget

Instance generator: domains.webad.WebAdGenerator

Literature: Walraven, E., & Spaan, M. T. J. (2018). Column Generation Algorithms for Constrained POMDPs. Journal of Artificial Intelligence Research, 62, 489–533.

Condition-based maintenance

Condition-based maintenance is a practice to reduce maintenance cost of systems for which the operating performance deteriorates stochastically over time. Rather than performing scheduled maintenance, it can be effective to perform maintenance only when sensor data and inspections indicate that maintenance is necessary. In this domain the system can be modeled as a Partially Observable Markov Decision Process, in which the state represents the current condition. Based on inspections and maintenance (the actions), the decision maker gets more information regarding the actual state of the system. If only a limited maintenance budget is available, then it is important to optimize inspection and maintenance actions in order to ensure that the system is kept in a good condition given the budget. The domain is a modified version of the bridge repair problem that can be found here.

Model: POMDP

Type of constraints: budget

Instance generator: domains.cbm.CBMGenerator

Literature: Walraven, E., & Spaan, M. T. J. (2018). Column Generation Algorithms for Constrained POMDPs. Journal of Artificial Intelligence Research, 62, 489–533.

Define your own domain

New domains can be added by creating a new package in the domains package. The new package should contain an implementation of the CMDPInstanceGenerator interface or the CPOMDPInstanceGenerator interface. The existing domains in the toolbox serve as an example. It is also possible to define problem instances directly without creating a generator first. This is illustrated in the examples.
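The general shape of a custom generator can be sketched as follows. The sketch does not use the toolbox classes: SketchInstance and SketchInstanceGenerator are hypothetical stand-ins for CMDPInstance and CMDPInstanceGenerator, and only the getInstance(int numAgents, int numDecisions) signature is taken from this page. The rule for choosing the resource limits is likewise an assumption made for illustration. Consult the existing domains in the toolbox for the actual interface and constructors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for CMDPInstance: one model per agent plus the resource limits.
class SketchInstance {
    final List<String> agentModels;   // placeholder for one CMDP model per agent
    final int numDecisions;           // horizon of the planning problem
    final double[] resourceLimits;    // one limit per time step (instantaneous constraint)

    SketchInstance(List<String> agentModels, int numDecisions, double[] resourceLimits) {
        this.agentModels = agentModels;
        this.numDecisions = numDecisions;
        this.resourceLimits = resourceLimits;
    }
}

// Hypothetical stand-in for CMDPInstanceGenerator, mirroring the getInstance signature.
interface SketchInstanceGenerator {
    SketchInstance getInstance(int numAgents, int numDecisions);
}

// Example generator for a new domain.
class MyDomainGenerator implements SketchInstanceGenerator {

    @Override
    public SketchInstance getInstance(int numAgents, int numDecisions) {
        // Build one model per agent; a real generator would construct CMDP models here.
        List<String> models = new ArrayList<>();
        for (int i = 0; i < numAgents; i++) {
            models.add("agent-" + i);
        }

        // Assumption: at most half of the agents (rounded up) may use the resource per step.
        double[] limits = new double[numDecisions];
        Arrays.fill(limits, Math.ceil(numAgents / 2.0));

        return new SketchInstance(models, numDecisions, limits);
    }

    public static void main(String[] args) {
        SketchInstance instance = new MyDomainGenerator().getInstance(5, 10);
        System.out.println(instance.agentModels.size() + " agents, limit per step: "
                + instance.resourceLimits[0]);
    }
}
```

Deriving the limits from the number of agents inside getInstance follows the behavior described above for the built-in generators, which set the resource limits automatically.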