still improving MDP doc
BDonnot committed Feb 29, 2024
1 parent f528b20 commit 46adcde
Showing 2 changed files with 63 additions and 16 deletions.
6 changes: 4 additions & 2 deletions docs/chronics.rst
@@ -1,7 +1,9 @@
.. currentmodule:: grid2op.Chronics

Chronics
===================================
.. _time-series-module:

Time series (formerly called "chronics")
=========================================

This page is organized as follows:

73 changes: 59 additions & 14 deletions docs/mdp.rst
@@ -109,7 +109,7 @@ MDP):
:nowrap:
\begin{align*}
\min_{\pi \in \Pi} ~& \sum_{t=1}^T r_t \\
\min_{\pi \in \Pi} ~& \sum_{t=1}^T \mathbb{E} r_t \\
\text{s.t.} ~ \\
& \forall t, a_t \sim \pi (s_{t}) & \text{policy produces the action} \\
& \forall t, s_{t+1} \sim \mathcal{L}_S(s_t, a_t) & \text{environment produces next state} \\
@@ -134,14 +134,17 @@ This simulator is able to compute some information that is part of the state
space :math:`\mathcal{S}` (*e.g.* flows on powerlines, active production value of generators etc.)
and thus is used in the computation of the transition kernel.

TODO how to model it.
We can model this simulator with a function :math:`\text{Sim}` that takes as input some data from an
"input space" :math:`\mathcal{S}_{\text{im}}^{(\text{in})}` and returns
data in an "output space" :math:`\mathcal{S}_{\text{im}}^{(\text{out})}`.

.. This simulator is also used when implementing the transition kernel. Some part of the state space
.. other information given by the Environment (see :ref:`environment-module` for details about the
.. way the `Environment` is coded and refer to :class:`grid2op.Action._backendAction._BackendAction` for list
.. of all available informations informatically available for the solver).
.. note::
    In grid2op we do not impose the "shape" of :math:`\mathcal{S}_{\text{im}}^{(\text{in})}`: the format
    used to read the grid file from the hard drive, the equations that are solved and the way
    these equations are used are all left "free". Grid2op only requires that the simulator
    (wrapped in a `Backend`) understands the "format" sent by grid2op (through a
    :class:`grid2op.Action._backendAction._BackendAction`) and is able to expose
    to grid2op some of its internal variables (accessed with the `***_infos()` methods of the backend).

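To make the idea of :math:`\text{Sim}` more concrete, here is a minimal sketch in plain Python. It is not grid2op code: the names ``SimInput``, ``SimOutput`` and ``toy_sim`` are hypothetical, and the "physics" is a toy rule (an injection shared between parallel lines proportionally to the inverse of their reactance) standing in for a real powerflow solver.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimInput:
    """Hypothetical element of the simulator input space."""
    injection_mw: float      # power sent from one bus to the other
    reactances: List[float]  # reactance of each parallel line in service


@dataclass
class SimOutput:
    """Hypothetical element of the simulator output space."""
    flows_mw: List[float]    # active flow on each line


def toy_sim(x: SimInput) -> SimOutput:
    """Share the injection between parallel lines, proportionally to 1 / reactance."""
    susceptances = [1.0 / r for r in x.reactances]
    total = sum(susceptances)
    return SimOutput([x.injection_mw * b / total for b in susceptances])


# Any callable with this signature can play the role of Sim.
Sim = Callable[[SimInput], SimOutput]
sim: Sim = toy_sim

out = sim(SimInput(injection_mw=90.0, reactances=[1.0, 2.0]))
print(out.flows_mw)
```

The only thing the rest of the modeling relies on is the signature: some input goes in, some output comes out, and what happens in between is up to the simulator.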

To make a parallel with similar concepts "simulator",
@@ -153,21 +156,63 @@ here except that it solves powerflows.
Some Time Series
+++++++++++++++++

TODO
Another type of data that we need to define "the" grid2op MDP is the "time series", implemented in the `chronics`
grid2op module, which is documented on the page
:ref:`time-series-module`, with some complements given on the :ref:`doc_timeseries` page.

These time series define exactly what would happen if the grid were a
"copper plate" without any constraints. Said differently, they provide what each consumer would
consume and what each producer would produce if they could all be connected together with
infinite "bandwidth", without any constraints on the powerlines etc.

In particular, grid2op assumes that these "time series" are balanced, in the sense that at each step the producers
produce just the right amount for the consumers to consume (electrical power cannot really be stored).
It also assumes that all the "constraints" of the producers are met.

These time series are typically generated outside of grid2op, for example using the `chronix2grid <https://github.com/BDonnot/ChroniX2Grid>`_
Python package (or anything else).

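The balance assumption on the time series can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name ``unbalanced_steps`` and the tolerance ``tol_mw`` are hypothetical, the data are made up, and the check compares total production with total consumption at each step without accounting for anything else.

```python
from typing import List, Sequence


def unbalanced_steps(
    gen_p: Sequence[Sequence[float]],
    load_p: Sequence[Sequence[float]],
    tol_mw: float = 1e-6,
) -> List[int]:
    """Return the indices of the steps where production and consumption differ."""
    bad = []
    for t, (gens, loads) in enumerate(zip(gen_p, load_p)):
        if abs(sum(gens) - sum(loads)) > tol_mw:
            bad.append(t)
    return bad


# One row per step, one column per generator (resp. per load), all in MW.
gen_p = [[50.0, 30.0], [55.0, 35.0], [60.0, 30.0]]
load_p = [[80.0], [90.0], [95.0]]
print(unbalanced_steps(gen_p, load_p))  # only the last step is unbalanced
```

An empty list means that the series are balanced at every step, up to the chosen tolerance.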

Formally, we denote by :math:`\mathcal{X}_t` the values of all these time series at time :math:`t`. These
exogenous data consist of:

- generator active production (in MW), for each generator
- load active power consumption (in MW), for each load
- load reactive consumption (in MVAr), for each load
- \* generator voltage setpoint / target (in kV)

.. note::
    \* this last part can be adapted "on demand" by the environment through the `voltage controler` module.
    But for the sake of modeling, it can be treated as external / exogenous data.

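As an illustration, the content of :math:`\mathcal{X}_t` listed above could be stored as follows. The class name ``ExogenousData`` and its field names are hypothetical (they are not the grid2op data structures), and the numerical values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExogenousData:
    """Hypothetical container for X_t, the time series values at one step."""
    gen_p: List[float]   # generator active production (MW), one value per generator
    load_p: List[float]  # load active consumption (MW), one value per load
    load_q: List[float]  # load reactive consumption (MVAr), one value per load
    gen_v: List[float]   # generator voltage setpoint (kV), one value per generator

    def __post_init__(self) -> None:
        if len(self.gen_p) != len(self.gen_v):
            raise ValueError("gen_p and gen_v must have one entry per generator")
        if len(self.load_p) != len(self.load_q):
            raise ValueError("load_p and load_q must have one entry per load")


# A scenario X is then the sequence of X_t for t = 1..T (here T = 2).
scenario = [
    ExogenousData([50.0, 30.0], [80.0], [10.0], [142.1, 142.1]),
    ExogenousData([55.0, 35.0], [90.0], [12.0], [142.1, 142.1]),
]
```
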
To make a parallel with similar concepts in other RL environments, these "time series" can represent the layout of the maze
in pacman, the positions of the platforms in "mario-like" 2d games, the different turns and the width of the road in a car game etc.
This is the "base" of the levels in most games.

Finally, for most released environments, a lot of different :math:`\mathcal{X}` are available. By default, each time the
environment is "reset" (the user wants to move to the next scenario), a new :math:`\mathcal{X}` is used (this behaviour
can be changed, more information in the section :ref:`environment-module-chronics-info` of the documentation).

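The "one new :math:`\mathcal{X}` per reset" behaviour can be pictured with the toy class below. It is not the grid2op ``Environment``: the class ``ToyEnv`` is hypothetical, and drawing the next scenario at random is an assumption made for the sketch (the text above does not say how the next scenario is selected).

```python
import random
from typing import List, Optional


class ToyEnv:
    """Hypothetical environment that owns several scenarios (not the grid2op class)."""

    def __init__(self, scenario_names: List[str], seed: Optional[int] = None) -> None:
        self._names = list(scenario_names)
        self._rng = random.Random(seed)
        self.current: Optional[str] = None

    def reset(self) -> str:
        """Start a new episode on a (possibly different) scenario."""
        self.current = self._rng.choice(self._names)
        return self.current


env = ToyEnv(["scenario_000", "scenario_001", "scenario_002"], seed=0)
for _ in range(3):
    print(env.reset())
```
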
.. _mdp-def:

Modeling sequential decisions
-------------------------------

TODO
As we said in the introduction of this page, we will model a given scenario in grid2op. We have at our disposal:

- a simulator, which is represented as a function :math:`\text{Sim} : \mathcal{S}_{\text{im}}^{(\text{in})} \to \mathcal{S}_{\text{im}}^{(\text{out})}`
- some time series :math:`\mathcal{X} = \left\{ \mathcal{X}_t \right\}_{1 \leq t \leq T}`

Inputs
~~~~~~~~~~
We then need to define the MDP by specifying:

Markov Decision process
~~~~~~~~~~~~~~~~~~~~~~~~
- :math:`\mathcal{S}`, the "state space"
- :math:`\mathcal{A}`, the "action space"
- :math:`\mathcal{L}_S(s, a)`, sometimes called "transition kernel", is the probability
  distribution (over :math:`\mathcal{S}`) that gives the next
  state after taking action :math:`a` in state :math:`s`
- :math:`\mathcal{L}_r(s, s', a)`, sometimes called "reward kernel",
  is the probability distribution (over :math:`[0, 1]`) that gives
  the reward :math:`r` after taking action :math:`a` in state :math:`s`, which leads to state :math:`s'`

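The way these four ingredients interact over one episode can be sketched as a simple loop. Everything below is a toy under stated assumptions: the function ``rollout``, the integer state and action spaces and the two kernels are hypothetical and unrelated to a power grid, and the policy is deterministic (a special case of sampling from a distribution).

```python
import random
from typing import Callable, Tuple

State = int
Action = int


def rollout(
    policy: Callable[[State], Action],
    transition: Callable[[State, Action, random.Random], State],
    reward: Callable[[State, State, Action, random.Random], float],
    s0: State,
    horizon: int,
    seed: int = 0,
) -> Tuple[State, float]:
    """Play `horizon` steps and return the final state and the sum of rewards."""
    rng = random.Random(seed)
    s, total = s0, 0.0
    for _ in range(horizon):
        a = policy(s)                    # action produced by the policy in s_t
        s_next = transition(s, a, rng)   # s_{t+1} sampled from the transition kernel
        r = reward(s, s_next, a, rng)    # r_t given by the reward kernel
        total += r
        s = s_next
    return s, total


# Toy kernels on S = {0, ..., 4} and A = {-1, +1}; rewards stay in [0, 1].
def toy_transition(s: State, a: Action, rng: random.Random) -> State:
    move = a if rng.random() < 0.9 else -a   # the action "slips" 10% of the time
    return min(4, max(0, s + move))


def toy_reward(s: State, s_next: State, a: Action, rng: random.Random) -> float:
    return 1.0 if s_next == 4 else 0.0


final_state, ret = rollout(lambda s: 1, toy_transition, toy_reward, s0=0, horizon=10)
print(final_state, ret)
```
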
Extensions
-----------
