From 5e5cc67625e572cf967dad8714776444f717927d Mon Sep 17 00:00:00 2001 From: DONNOT Benjamin Date: Fri, 23 Feb 2024 17:43:45 +0100 Subject: [PATCH 1/5] improving docs, start to include MDP in doc [skip ci] --- docs/action.rst | 5 ++ docs/create_an_environment.rst | 9 ++++ docs/data_pipeline.rst | 8 +++ docs/dive_into_time_series.rst | 8 +++ docs/episode.rst | 1 + docs/grid_graph.rst | 9 ++++ docs/index.rst | 1 + docs/mdp.rst | 93 ++++++++++++++++++++++++++++++++++ docs/model_based.rst | 9 ++++ 9 files changed, 143 insertions(+) create mode 100644 docs/mdp.rst diff --git a/docs/action.rst b/docs/action.rst index 817fc3598..a81d0985e 100644 --- a/docs/action.rst +++ b/docs/action.rst @@ -29,6 +29,11 @@ Action =================================== +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + Objectives ---------- The "Action" module lets you define some actions on the underlying power _grid. diff --git a/docs/create_an_environment.rst b/docs/create_an_environment.rst index f802ad9c7..e0e36f8d0 100644 --- a/docs/create_an_environment.rst +++ b/docs/create_an_environment.rst @@ -8,6 +8,15 @@ Possible workflow to create an environment from existing time series ====================================================================== +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + + +Workflow in more detail +------------------------- + In this subsection, we will give an example on how to set up an environment in grid2op if you already have some data that represents loads and productions at each steps. This paragraph aims at making more concrete the description of the environment shown previously. diff --git a/docs/data_pipeline.rst b/docs/data_pipeline.rst index cb86a6723..1792e834b 100644 --- a/docs/data_pipeline.rst +++ b/docs/data_pipeline.rst @@ -3,6 +3,14 @@ Optimize the data pipeline ============================ +This page is organized as follows: + +.. 
contents:: Table of Contents + :depth: 3 + +Objectives +-------------------------- + Optimizing the data pipeline can be crucial if you want to learn fast, especially at the beginning of the training. There exists multiple way to perform this task. diff --git a/docs/dive_into_time_series.rst b/docs/dive_into_time_series.rst index acf95f813..5a5264996 100644 --- a/docs/dive_into_time_series.rst +++ b/docs/dive_into_time_series.rst @@ -5,6 +5,14 @@ Input data of an environment =================================== +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + +Objectives +---------------- + A grid2op "environment" is nothing more than a local folder on your computer. This folder consists of different things: diff --git a/docs/episode.rst b/docs/episode.rst index 34bc8453e..9d8be3d8f 100644 --- a/docs/episode.rst +++ b/docs/episode.rst @@ -1,5 +1,6 @@ Episode =================================== + This page is organized as follow: .. contents:: Table of Contents diff --git a/docs/grid_graph.rst b/docs/grid_graph.rst index ccfdbc615..5b5702a90 100644 --- a/docs/grid_graph.rst +++ b/docs/grid_graph.rst @@ -10,6 +10,15 @@ A grid, a graph: grid2op representation of the powergrid =================================================================== + +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + +Objectives +---------------- + In this section of the documentation, we will dive a deeper into the "modeling" on which grid2op is based and especially how the underlying graph of the powergrid is represented and how it can be easily retrieved. diff --git a/docs/index.rst b/docs/index.rst index 751b37b11..31dd1f648 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -94,6 +94,7 @@ Modeling :maxdepth: 2 :caption: Models + mdp modeled_elements grid_graph diff --git a/docs/mdp.rst b/docs/mdp.rst new file mode 100644 index 000000000..3ede81e67 --- /dev/null +++ b/docs/mdp.rst @@ -0,0 +1,93 @@ +.. 
_mdp-doc-module: + +Dive into grid2op sequential decision process +=============================================== + +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + +Objectives +----------- + +TODO + +Modeling sequential decisions +------------------------------- + +TODO + + +Inputs +~~~~~~~~~~ + +A simulator +++++++++++++ + +TODO + +B Time Series +++++++++++++++ + +TODO + +Markov Decision process +~~~~~~~~~~~~~~~~~~~~~~~~ + +Extensions +----------- + +Partial Observability +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is the case in most grid2op environments: only some parts of the environment +state :math:`s_t` at time `t` are +given to the agent in the observation :math:`o_t` at time `t`. + +More specifically, in most grid2op environments (by default at least), none of the +physical parameters of the solvers are provided. Also, to represent better +the daily operation in power systems, only the `t`-th row :math:`x_t` of the matrix +X is given in the observation :math:`o_t`. The components :math:`X_{t', i}` +(for :math:`t' > t`) are not given. + +Adversarial attacks +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO: explain the model of the environment + +Forecast and simulation on future states +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO : explain the model the forecast and the fact that the "observation" also +includes a model of the world that can be different from the grid of the environment + +Simulator dynamics can be more complex +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Hide elements from the grid2op environment +++++++++++++++++++++++++++++++++++++++++++ + +TODO only a part of the grid would be "exposed" in the +grid2op environment. 
+ + +Contain elements not modeled by grid2op +++++++++++++++++++++++++++++++++++++++++++ + +TODO: speak about HVDC or "pq" generators, or 3 winding transformers + +Contain embedded controls +++++++++++++++++++++++++++++++++++++++++++ + +TODO for example automatic setpoint for HVDC or limit on Q for generators + +Time domain simulation ++++++++++++++++++++++++ + +TODO: we can plug in a simulator that solves a more +accurate description of the grid and only "subsamples" it +(*eg* at a frequency of every 5 mins) to provide grid2op +with some information. + +.. include:: final.rst diff --git a/docs/model_based.rst b/docs/model_based.rst index 54f4c6f6e..3645fa6e9 100644 --- a/docs/model_based.rst +++ b/docs/model_based.rst @@ -3,6 +3,15 @@ Model Based / Planning methods ==================================== + +This page is organized as follows: + +.. contents:: Table of Contents + :depth: 3 + +Objectives +---------------- + .. warning:: This page is in progress. We welcome any contribution :-) From a8f2885f2fabb3dad010d98b95f9c57b57fb1729 Mon Sep 17 00:00:00 2001 From: DONNOT Benjamin Date: Mon, 26 Feb 2024 13:18:12 +0100 Subject: [PATCH 2/5] update doc, mainly for Backend, MDP doc in slow progress --- docs/_static/hacks.css | 326 ++++++++ docs/backend.rst | 40 +- docs/conf.py | 1 + docs/createbackend.rst | 49 +- docs/grid_graph.rst | 4 - docs/mdp.rst | 28 +- docs/special.rst | 44 ++ grid2op/Action/_backendAction.py | 857 +++++++++++++++++++++- grid2op/Action/baseAction.py | 5 +- grid2op/Action/serializableActionSpace.py | 9 +- grid2op/Backend/backend.py | 3 + grid2op/tests/test_n_busbar_per_sub.py | 7 - 12 files changed, 1326 insertions(+), 47 deletions(-) create mode 100644 docs/_static/hacks.css create mode 100644 docs/special.rst diff --git a/docs/_static/hacks.css b/docs/_static/hacks.css new file mode 100644 index 000000000..a0fa73de4 --- /dev/null +++ b/docs/_static/hacks.css @@ -0,0 +1,326 @@ +/* + * CSS hacks and small modification for my Sphinx website + * :copyright: 
Copyright 2013-2016 Lilian Besson + * :license: GPLv3, see LICENSE for details. + */ + + +/* Colors and text decoration. + For example, :black:`text in black` or :blink:`text blinking` in rST. */ + + .black { + color: black; +} + +.gray { + color: gray; +} + +.grey { + color: gray; +} + +.silver { + color: silver; +} + +.white { + color: white; +} + +.maroon { + color: maroon; +} + +.red { + color: red; +} + +.magenta { + color: magenta; +} + +.fuchsia { + color: fuchsia; +} + +.pink { + color: pink; +} + +.orange { + color: orange; +} + +.yellow { + color: yellow; +} + +.lime { + color: lime; +} + +.green { + color: green; +} + +.olive { + color: olive; +} + +.teal { + color: teal; +} + +.cyan { + color: cyan; +} + +.aqua { + color: aqua; +} + +.blue { + color: blue; +} + +.navy { + color: navy; +} + +.purple { + color: purple; +} + +.under { + text-decoration: underline; +} + +.over { + text-decoration: overline; +} + +.blink { + text-decoration: blink; +} + +.line { + text-decoration: line-through; +} + +.strike { + text-decoration: line-through; +} + +.it { + font-style: italic; +} + +.ob { + font-style: oblique; +} + +.small { + font-size: small; +} + +.large { + font-size: large; +} + +.smallpar { + font-size: small; +} + + +/* Style pour les badges en bas de la page. 
*/ + +div.supportBadges { + margin: 1em; + text-align: right; +} + +div.supportBadges ul { + padding: 0; + display: inline; +} + +div.supportBadges li { + display: inline; +} + +div.supportBadges a { + margin-right: 1px; + opacity: 0.6; +} + +div.supportBadges a:hover { + opacity: 1; +} + + +/* Details elements in the sidebar */ + +a.reference { + border-bottom: none; + text-decoration: none; +} + +ul.details { + font-size: 80%; +} + +ul.details li p { + font-size: 85%; +} + +ul.externallinks { + font-size: 85%; +} + + +/* Pour le drapeau de langue */ + +img.languageswitch { + width: 50px; + height: 32px; + margin-left: 5px; + vertical-align: bottom; +} + +div.sphinxsidebar { + overflow: hidden !important; + font-size: 120%; + word-wrap: break-word; + width: 300px; + max-width: 300px; +} + +div.sphinxsidebar h3 { + font-size: 125%; +} + +div.sphinxsidebar h4 { + font-size: 110%; +} + +div.sphinxsidebar a { + font-size: 85%; +} + + +/* Image style for scrollUp jQuery plugin */ + +#scrollUpLeft { + bottom: 50px; + left: 260px; + height: 38px; + width: 38px; + background: url('//perso.crans.org/besson/_images/.top.svg'); + background: url('../_images/.top.svg'); +} + +@media screen and (max-width: 875px) { + #scrollUpLeft { + right: 50px; + left: auto; + } +} + + +/* responsive for font-size. 
*/ + +@media (max-width: 875px) { + body { + font-size: 105%; + /* Increase font size for responsive theme */ + } +} + +@media (max-width: 1480px) and (min-width: 876px) { + body { + font-size: 110%; + /* Increase font size for not-so-big screens */ + } +} + +@media (min-width: 1481px) { + body { + font-size: 115%; + /* Increase even more font size for big screens */ + } +} + + +/* Social Icons in the sidebar (available: twitter, facebook, linkedin, google+, bitbucket, github) */ + +.social-icons { + display: inline-block; + margin: 0; + text-align: center; +} + +.social-icons a { + background: none no-repeat scroll center top #444444; + border: 1px solid #F6F6F6; + border-radius: 50% 50% 50% 50%; + display: inline-block; + height: 35px; + width: 35px; + margin: 0; + text-indent: -9000px; + transition: all 0.2s ease 0s; + text-align: center; + border-bottom: none; +} + +.social-icons li { + display: inline-block; + list-style-type: none; + border-bottom: none; +} +.social-icons li a { + border-bottom: none; +} + +.social-icons a:hover { + background-color: #666666; + transition: all 0.2s ease 0s; + text-decoration: none; +} + +.social-icons a.facebook { + background-image: url('../_images/.facebook.png'); + background-image: url('//perso.crans.org/besson/_images/.facebook.png'); + display: block; + margin-left: auto; + margin-right: auto; + background-size: 35px 35px; +} + +.social-icons a.bitbucket { + background-image: url('../_images/.bitbucket.png'); + background-image: url('//perso.crans.org/besson/_images/.bitbucket.png'); + display: block; + margin-left: auto; + margin-right: auto; + background-size: 35px 35px; +} + +.social-icons li a.github { + background-image: url('../_images/.github.png'); + background-image: url('//perso.crans.org/besson/_images/.github.png'); + display: block; + margin-left: auto; + margin-right: auto; + background-size: 35px 35px; +} + +.social-icons li a.wikipedia { + background-image: url('../_images/.wikipedia.png'); + 
background-image: url('//perso.crans.org/besson/_images/.wikipedia.png'); + display: block; + margin-left: auto; + margin-right: auto; + background-size: 35px 35px; +} \ No newline at end of file diff --git a/docs/backend.rst b/docs/backend.rst index d4e666861..1d52adccd 100644 --- a/docs/backend.rst +++ b/docs/backend.rst @@ -1,4 +1,5 @@ .. currentmodule:: grid2op.Backend + .. _backend-module: Backend @@ -22,9 +23,39 @@ Objectives Both can serve as example if you want to code a new backend. This Module defines the template of a backend class. -Backend instances are responsible to translate action (performed either by an BaseAgent or by the Environment) into -comprehensive powergrid modifications. -They are responsible to perform the powerflow (AC or DC) computation. + +Backend instances are responsible for translating actions into +comprehensive powergrid modifications that can be processed by your "Simulator". +The simulator is responsible for performing the powerflow (AC or DC or Time Domain / Dynamic / Transient simulation) +and for "translating back" the results (of the simulation) to grid2op. + +More precisely, a backend should: + +#. inform grid2op of the grid: which objects exist, where they are connected etc. +#. be able to process an object of type :class:`grid2op.Action._backendAction._BackendAction` + into some modification to your solver (*NB* these "BackendAction" are created by the :class:`grid2op.Environment.BaseEnv` + from the agent's actions, the time series modifications, the maintenance operations, the opponent, etc. The backend **is not** + responsible for their creation) +#. be able to run a simulation (DC powerflow, AC powerflow or time domain / transient / dynamic) +#. expose (through some functions like :func:`Backend.generators_info` or :func:`Backend.loads_info`) + the state of some of the elements in the grid. + +.. note:: + A backend can model more elements than what can be controlled or modified in grid2op. 
+ For example, at time of writing, grid2op does not allow the modification of + HVDC powerlines. But this does not mean that grid2op will not work if your grid + contains such devices. It just means that grid2op will not be responsible + for modifying them. + +.. note:: + A backend can expose only part of the grid to the environment / agent. For example, if you + give it as input a pan-European grid but only want to study the grid of the Netherlands or + France, your backend can "inform" grid2op (in the :func:`Backend.load_grid` function) + that "only the Dutch (or French) grid" exists and leave out all other information. + + In this case grid2op will work perfectly: agents and environment will work as expected and be + able to control the Dutch (or French) part of the grid, and your backend implementation + can control the rest (by directly updating the state of the solver). It is also through the backend that some quantities about the powergrid (such as the flows) can be inspected. @@ -57,6 +88,9 @@ We developed a dedicated page for the development of new "Backend" compatible wi Detailed Documentation by class ------------------------------- + +Then the `Backend` module: + .. automodule:: grid2op.Backend :members: :private-members: diff --git a/docs/conf.py b/docs/conf.py index e7b495411..46ea0ff96 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -75,6 +75,7 @@ # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] +html_css_files = ['hacks.css'] # for pdf pdf_documents = [('index', u'rst2pdf', u'Grid2op documentation', u'B. DONNOT'),] diff --git a/docs/createbackend.rst b/docs/createbackend.rst index c4746f6d9..c343b21a4 100644 --- a/docs/createbackend.rst +++ b/docs/createbackend.rst @@ -89,7 +89,9 @@ everywhere). This includes, but is not limited to: - etc. .. 
note:: Grid2Op do not care about the modeling of the grid (static / steady state or dyanmic / transient) and both - types of solver could be implemented as backend. At time of writing (december 2020), only steady state powerflow are + Any type of solver could be implemented as a backend. + + At time of writing (december 2020), only steady state powerflow are available. .. note:: The previous note entails that grid2op is also independent on the format used to store a powergrid. @@ -131,7 +133,28 @@ everywhere). This includes, but is not limited to: Main methods to implement -------------------------- Typically, a backend has a internal "modeling" / "representation" of the powergrid -stored in the attribute `self._grid` that can be anything. An more detailed example, with some +stored in the attribute `self._grid` that can be anything. + +.. note:: + `self._grid` is a "private" attribute. Only people who know what it does and how + it works should use it. + + Grid2op being fully generic, you can assume that all the classes of grid2op will never + access `self._grid`. For example, when building the observation of the grid, + grid2op will only use the information given in the `*_info()` methods + (*eg* :func:`grid2op.Backend.Backend.loads_info`) and will never directly access `self._grid`. + + In other words, `self._grid` can be anything: a PandaPower `Network`, a GridCal `MultiCircuit`, + a lightsim2grid `GridModel`, a pypowsybl `Network` (or `SortedNetwork`), + a powerfactory `Project` etc. Grid2op will never attempt to access `self._grid`. + + (Though, to be perfectly honest, some agents might rely on a specific type of `_grid`. If that is the case, too + bad for these agents: they will need to implement special methods to be compatible with your backend. + Hopefully this should be extremely rare. The whole idea of grid2op is to make the different + "entities" (agent, environment, data, backend) as independent as possible, so these "corner" cases should + be rare.) 
+ +A more detailed example, with some "working minimal code" is given in the "example/backend_integration" of the grid2op repository. There are 4 **__main__** types of method you need to implement if you want to use a custom powerflow @@ -495,7 +518,7 @@ BackendAction: modification In this section we detail step by step how to understand the specific format used by grid2op to "inform" the backend on how to modify its internal state before computing a powerflow. -A `BackendAction` will tell the backend on what is modified among: +A :class:`grid2op.Action._backendAction._BackendAction` tells the backend what is modified among: - the active value of each loads (see paragraph :ref:`change-inj`) - the reactive value of each loads (see paragraph :ref:`change-inj`) @@ -957,10 +980,26 @@ TODO this will be explained "soon". Detailed Documentation by class ------------------------------- -.. autoclass:: grid2op.Backend.EducPandaPowerBackend.EducPandaPowerBackend +A first example of a working backend that can be easily understood (without nasty gory speed optimization) +based on pandapower is available at: + +.. autoclass:: grid2op.Backend.educPandaPowerBackend.EducPandaPowerBackend :members: :private-members: :special-members: :autosummary: -.. include:: final.rst \ No newline at end of file +And to better understand some key concepts, you can have a look at :class:`grid2op.Action._backendAction._BackendAction` +or the :class:`grid2op.Action._backendAction.ValueStore` class: + +.. autoclass:: grid2op.Action._backendAction._BackendAction + :members: + :private-members: + :special-members: + :autosummary: + +.. autoclass:: grid2op.Action._backendAction.ValueStore + :members: + :autosummary: + +.. 
include:: final.rst diff --git a/docs/grid_graph.rst b/docs/grid_graph.rst index 5b5702a90..bdeae4c54 100644 --- a/docs/grid_graph.rst +++ b/docs/grid_graph.rst @@ -32,10 +32,6 @@ First, we detail some concepts from the power system community in section :ref:`graph-encoding-gridgraph`. Finally, we show some code examples on how to retrieve this graph in section :ref:`get-the-graph-gridgraph`. - -.. contents:: Table of Contents - :depth: 3 - .. _powersystem-desc-gridgraph: Description of a powergrid adopting the "energy graph" representation diff --git a/docs/mdp.rst b/docs/mdp.rst index 3ede81e67..96ee74705 100644 --- a/docs/mdp.rst +++ b/docs/mdp.rst @@ -1,3 +1,5 @@ +.. include:: special.rst + .. _mdp-doc-module: Dive into grid2op sequential decision process @@ -11,7 +13,24 @@ This page is organized as follow: Objectives ----------- -TODO +The goal of this page of the documentation is to provide you with a relatively extensive description of the +mathematical model behind grid2op. + +Grid2op is a software package whose aim is to make experiments on powergrids, mainly about sequential decision making, +as easy as possible. + +This problem has been modeled as a "Markov Decision Process" (MDP) and in some cases as a +"Partially Observable Markov Decision Process" (POMDP) or a +"Constrained Markov Decision Process" (CMDP) and (work in progress) even a +"Decentralized (Partially Observable) Markov Decision Process" (Dec-(PO)MDP). + +In this section, we will suppose that: + +#. there is a "simulator" [in the code, this is the Backend, detailed in :ref:`backend-module`] + that is able to compute some information (*eg* flows on powerlines, active production value of generators etc.) 
+ from some other information given by the Environment (see :ref:`environment-module` for details about the + way the `Environment` is coded and :class:`grid2op.Action._backendAction._BackendAction` ) + Modeling sequential decisions ------------------------------- @@ -90,4 +109,11 @@ accurate description of the grid and only "subsample" (*eg* at a frequency of every 5 mins) provide grid2op with some information. + +Some constraints +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO + + .. include:: final.rst diff --git a/docs/special.rst b/docs/special.rst new file mode 100644 index 000000000..142235173 --- /dev/null +++ b/docs/special.rst @@ -0,0 +1,44 @@ +.. Color profiles for Sphinx. +.. Has to be used with hacks.css +.. (https://bitbucket.org/lbesson/web-sphinx/src/master/.static/hacks.css) + +.. role:: black +.. role:: gray +.. role:: grey +.. role:: silver +.. role:: white +.. role:: maroon +.. role:: red +.. role:: magenta +.. role:: fuchsia +.. role:: pink +.. role:: orange +.. role:: yellow +.. role:: lime +.. role:: green +.. role:: olive +.. role:: teal +.. role:: cyan +.. role:: aqua +.. role:: blue +.. role:: navy +.. role:: purple + +.. role:: under +.. role:: over +.. role:: blink +.. role:: line +.. role:: strike + +.. role:: it +.. role:: ob + +.. role:: small +.. role:: large + +.. role:: center +.. role:: left +.. role:: right + + +.. (c) Lilian Besson, 2011-2016, https://bitbucket.org/lbesson/web-sphinx/ diff --git a/grid2op/Action/_backendAction.py b/grid2op/Action/_backendAction.py index 33fd95ffe..99d61c921 100644 --- a/grid2op/Action/_backendAction.py +++ b/grid2op/Action/_backendAction.py @@ -22,17 +22,109 @@ # TODO see if it can be done in c++ easily class ValueStore: """ - INTERNAL USE ONLY + USE ONLY IF YOU WANT TO CODE A NEW BACKEND - .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + .. 
warning:: /!\\\\ Internal, do not modify, alter, change, override the implementation unless you know what you are doing /!\\\\ + + If you override them you might even notice some extremely weird behaviour. It's not "on purpose", we are aware of + it but we won't change it (for now at least) + + .. warning:: + Objects from this class should never be created by anyone except by objects of the :class:`grid2op.Action._backendAction._BackendAction` + when they are created or when instances of `_BackendAction` are processed *eg* with :func:`_BackendAction.__call__` or + :func:`_BackendAction.get_loads_bus` etc. + + There are two correct uses for this class: + + #. by iterating manually with the `for xxx in value_store_instance:` + #. by checking which objects have been changed (with :attr:`ValueStore.changed`) and then checking the + new value of the elements **changed** with :attr:`ValueStore.values` [el_id] + .. danger:: + + You should never trust the values in :attr:`ValueStore.values` [el_id] if :attr:`ValueStore.changed` [el_id] is `False`. + + Access data (values) only when the corresponding "mask" (:attr:`ValueStore.changed`) is `True`. + + This is, of course, ensured by default if you use the practical way of iterating through them with: + + .. code-block:: python + + load_p: ValueStore # a ValueStore object named "load_p" + + for load_id, new_p in load_p: + # do something + + In this case, "new_p" is only given if the corresponding `changed` mask is `True`. + + Attributes + ---------- + + + Examples + --------- + + Say you have a "ValueStore" `val_sto` (in :class:`grid2op.Action._backendAction._BackendAction` you will end up manipulating + pretty much all the time `ValueStore` if you use it correctly, with :func:`_BackendAction.__call__` but also if you call + :func:`_BackendAction.get_loads_bus`, :func:`_BackendAction.get_loads_bus_global`, :func:`_BackendAction.get_gens_bus`, ...) 
+ + Basically, the "variables" named `prod_p`, `prod_v`, `load_p`, `load_q`, `storage_p`, + `topo__`, `shunt_p`, `shunt_q`, `shunt_bus`, `backendAction.get_lines_or_bus()`, + `backendAction.get_lines_or_bus_global()`, etc in the doc of :class:`grid2op.Action._backendAction._BackendAction` + are all :class:`ValueStore`. + + Recommended usage: + + .. code-block:: python + + val_sto: ValueStore # a ValueStore object named "val_sto" + + for el_id, new_val in val_sto: + # do something + + # less abstractly, say `load_p` is a ValueStore: + # for load_id, new_p in load_p: + # do the real changes of load active value in self._grid + # load_id => id of loads for which the active consumption changed + # new_p => new load active consumption for `load_id` + # self._grid.change_load_active_value(load_id, new_p) # fictive example of course... + + + More advanced / vectorized usage (only do that if you found out your backend was + slow because of the iteration in python above, this is error-prone and in general + might not be worth it...): + + .. code-block:: python + + val_sto: ValueStore # a ValueStore object named "val_sto" + + # less abstractly, say `load_p` is a ValueStore: + # self._grid.change_all_loads_active_value(where_changed=load_p.changed, + # new_vals=load_p.values[load_p.changed]) + # fictive example of course, I highly doubt the self._grid + # implements a method named exactly `change_all_loads_active_value` + + WARNING, DANGER AHEAD: + Never trust the data in load_p.values[~load_p.changed], they might even be uninitialized... + + """ def __init__(self, size, dtype): ## TODO at the init it's mandatory to have everything at "1" here # if topo is not "fully connected" it will not work + + #: :class:`np.ndarray` + #: The new target values to be set in `backend._grid` in `apply_action` + #: never use the values if the corresponding mask is set to `False` + #: (it might be uninitialized). 
self.values = np.empty(size, dtype=dtype) + + #: :class:`np.ndarray` (bool) + #: Mask representing which values (stored in :attr:`ValueStore.values` ) are + #: meaningful. The other values (corresponding to `changed=False` ) are meaningless. self.changed = np.full(size, dtype=dt_bool, fill_value=False) + + #: used internally for iteration self.last_index = 0 self.__size = size @@ -217,11 +309,175 @@ class _BackendAction(GridObjects): Internal class, use at your own risk. - This class "digest" the players / environment / opponent / voltage controlers "action", - and transform it to setpoint for the backend. + This class "digests" the players / environment / opponent / voltage controllers "actions", + and transforms them into one single "state" that can in turn be processed by the backend + in the function :func:`grid2op.Backend.Backend.apply_action`. + + .. note:: + In a :class:`_BackendAction` only the state of the elements that have been modified + by an "entity" (agent, environment, opponent, voltage controller etc.) is given. + + We expect the backend to "remember somehow" the state of all the rest. + + This is to save a lot of computation time for larger grids. + + .. note:: + You probably don't need to import the `_BackendAction` class (this is why + we "hide" it), + but the `backendAction` you will receive in `apply_action` is indeed + a :class:`_BackendAction`, hence this documentation. + + If you want to use grid2op to develop agents or new time series, + this class should behave transparently for you and you don't really + need to spend time reading its documentation. 
+ + If you want to develop in grid2op and code a new backend, you might be interested in: + + - :func:`_BackendAction.__call__` + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus` + - :func:`_BackendAction.get_storages_bus_global` + - :func:`_BackendAction.get_shunts_bus_global` + + And in this case, for usage examples, see the examples available in: + + - https://github.com/rte-france/Grid2Op/tree/master/examples/backend_integration: a step by step guide to + code a new backend + - :class:`grid2op.Backend.educPandaPowerBackend.EducPandaPowerBackend` and especially the + :func:`grid2op.Backend.educPandaPowerBackend.EducPandaPowerBackend.apply_action` + - :ref:`create-backend-module` page of the documentation, especially the + :ref:`backend-action-create-backend` section + + Otherwise, "TL;DR" (only relevant when you want to implement the :func:`grid2op.Backend.Backend.apply_action` + function, rest is not shown): + + .. 
code-block:: python + + def apply_action(self, backendAction: Union["grid2op.Action._backendAction._BackendAction", None]) -> None: + if backendAction is None: + return + + ( + active_bus, + (prod_p, prod_v, load_p, load_q, storage_p), + topo__, + shunts__, + ) = backendAction() + + # change the active values of the loads + for load_id, new_p in load_p: + # do the real changes in self._grid + + # change the reactive values of the loads + for load_id, new_q in load_q: + # do the real changes in self._grid + + # change the active value of generators + for gen_id, new_p in prod_p: + # do the real changes in self._grid + + # for the voltage magnitude, pandapower expects pu but grid2op provides kV, + # so we need a bit of change + for gen_id, new_v in prod_v: + # do the real changes in self._grid + + # process the topology : + + # option 1: you can directly set the element of the grid in the "topo_vect" + # order, for example you can modify in your solver the busbar to which + # element 17 of `topo_vect` is computed (this is necessarily a local view of + # the buses ) + for el_topo_vect_id, new_el_bus in topo__: + # connect this object to the `new_el_bus` (local) in self._grid + + # OR !!! (use either option 1 or option 2.a or option 2.b - exclusive OR) + + # option 2: use "per element type" view (this is usefull) + # if your solver has organized its data by "type" and you can + # easily access "all loads" and "all generators" etc. + + # option 2.a using "local view": + # new_bus is either -1, 1, 2, ..., backendAction.n_busbar_per_sub + lines_or_bus = backendAction.get_lines_or_bus() + for line_id, new_bus in lines_or_bus: + # connect "or" side of "line_id" to (local) bus `new_bus` in self._grid + + # OR !!! 
(use either option 1 or option 2.a or option 2.b - exclusive OR) + + # option 2.b using "global view": + # new_bus is either 0, 1, 2, ..., backendAction.n_busbar_per_sub * backendAction.n_sub + # (this suppose internally that your solver and grid2op have the same + # "ways" of labelling the buses...) + lines_or_bus = backendAction.get_lines_or_bus_global() + for line_id, new_bus in lines_or_bus: + # connect "or" side of "line_id" to (global) bus `new_bus` in self._grid + + # now repeat option a OR b calling the right methods + # for each element types (*eg* get_lines_ex_bus, get_loads_bus, get_gens_bus, + # get_storages_bus for "option a-like") + + ######## end processing of the topology ############### + + # now implement the shunts: + + if shunts__ is not None: + shunt_p, shunt_q, shunt_bus = shunts__ + + if (shunt_p.changed).any(): + # p has changed for at least a shunt + for shunt_id, new_shunt_p in shunt_p: + # do the real changes in self._grid + + if (shunt_q.changed).any(): + # q has changed for at least a shunt + for shunt_id, new_shunt_q in shunt_q: + # do the real changes in self._grid + + if (shunt_bus.changed).any(): + # at least one shunt has been disconnected + # or has changed the buses + + # do like for normal topology with: + # option a -like (using local bus): + for shunt_id, new_shunt_bus in shunt_bus: + ... + # OR + # option b -like (using global bus): + shunt_global_bus = backendAction.get_shunts_bus_global() + for shunt_id, new_shunt_bus in shunt_global_bus: + # connect shunt_id to (global) bus `new_shunt_bus` in self._grid + + .. warning:: + The steps shown here are generic and might not be optimised for your backend. This + is why you probably do not see any of them directly in :class:`grid2op.Backend.PandaPowerBackend` + (where everything is vectorized to make things fast **with pandapower**). 
+ + It is probably a good idea to get this first implementation up and running, passing + all the tests, and then to worry about optimization: + + The real problem is that programmers have spent far too much + time worrying about efficiency in the wrong places and at the wrong times; + premature optimization is the root of all evil (or at least most of it) + in programming. + + Donald Knuth, "*The Art of Computer Programming*" + """ def __init__(self): + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is handled by the environment ! + + """ GridObjects.__init__(self) cls = type(self) # last connected registered @@ -266,6 +522,11 @@ def __init__(self): self._storage_bus = None def __deepcopy__(self, memodict={}) -> Self: + + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + """ res = type(self)() # last connected registered res.last_topo_registered.copy(self.last_topo_registered) @@ -298,14 +559,21 @@ def __deepcopy__(self, memodict={}) -> Self: return res def __copy__(self) -> Self: + + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + """ res = self.__deepcopy__() # nothing less to do return res def reorder(self, no_load, no_gen, no_topo, no_storage, no_shunt) -> None: """ .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is handled by BackendConverter, do not alter - reorder the element modified, this is use when converting backends only and should not be use + Reorder the modified elements; this is used only when converting backends and should not be used outside of this usecase no_* stands for "new order" @@ -327,6 +595,12 @@ def reorder(self, no_load, no_gen, no_topo, no_storage, no_shunt) -> None: self.current_shunt_bus.reorder(no_shunt) def reset(self) -> None: + """ + .. 
warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is called by the environment, do not alter. + + """ # last known topo self.last_topo_registered.reset() @@ -354,6 +628,11 @@ def reset(self) -> None: self.last_topo_registered.register_new_topo(self.current_topo) def all_changed(self) -> None: + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is called by the environment, do not alter. + """ # last topo self.last_topo_registered.all_changed() @@ -375,9 +654,20 @@ def all_changed(self) -> None: # self.shunt_bus.all_changed() def set_redispatch(self, new_redispatching): + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is called by the environment, do not alter. + """ self.prod_p.change_val(new_redispatching) def _aux_iadd_inj(self, dict_injection): + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + Internal implementation of += + + """ if "load_p" in dict_injection: tmp = dict_injection["load_p"] self.load_p.set_val(tmp) @@ -392,6 +682,12 @@ def _aux_iadd_inj(self, dict_injection): self.prod_v.set_val(tmp) def _aux_iadd_shunt(self, other): + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + Internal implementation of += + + """ shunts = {} if type(other).shunts_data_available: shunts["shunt_p"] = other.shunt_p @@ -407,6 +703,12 @@ def _aux_iadd_shunt(self, other): self.current_shunt_bus.values[self.shunt_bus.changed] = self.shunt_bus.values[self.shunt_bus.changed] def _aux_iadd_reconcile_disco_reco(self): + """ + .. 
warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + Internal implementation of += + + """ disco_or = (self._status_or_before == -1) | (self._status_or == -1) disco_ex = (self._status_ex_before == -1) | (self._status_ex == -1) disco_now = ( @@ -432,8 +734,18 @@ def _aux_iadd_reconcile_disco_reco(self): def __iadd__(self, other : BaseAction) -> Self: """ .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ - - other: a grid2op action standard + + This is called by the environment, do not alter. + + The goal of this function is to "fuse" together all the different types + of modifications handled by: + + - the Agent + - the opponent + - the time series (part of the environment) + - the voltage controller + + It might be called multiple times per step. Parameters ---------- @@ -441,7 +753,8 @@ def __iadd__(self, other : BaseAction) -> Self: Returns ------- - + The updated state of `self` after the new action `other` has been added to it. + """ set_status = other._set_line_status @@ -512,7 +825,13 @@ def __iadd__(self, other : BaseAction) -> Self: return self def _assign_0_to_disco_el(self) -> None: - """do not consider disconnected elements are modified for there active / reactive / voltage values""" + """ + .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is handled by the environment, do not alter. 
+ + Do not consider disconnected elements as modified for their active / reactive / voltage values + """ cls = type(self) gen_changed = self.current_topo.changed[cls.gen_pos_topo_vect] gen_bus = self.current_topo.values[cls.gen_pos_topo_vect] @@ -532,6 +851,48 @@ def __call__(self) -> Tuple[np.ndarray, Tuple[ValueStore, ValueStore, ValueStore, ValueStore, ValueStore], ValueStore, Union[Tuple[ValueStore, ValueStore, ValueStore], None]]: + """ + This function should be called at the top of the :func:`grid2op.Backend.Backend.apply_action` + implementation when you decide to code a new backend. + + It processes the state of the backend into a form "easy to use" in the `apply_action` method. + + .. danger:: + It is mandatory to call it, otherwise some features might not work. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + Examples + ----------- + + A typical implementation of `apply_action` will start with: + + .. code-block:: python + + def apply_action(self, backendAction: Union["grid2op.Action._backendAction._BackendAction", None]) -> None: + if backendAction is None: + return + + ( + active_bus, + (prod_p, prod_v, load_p, load_q, storage), + topo__, + shunts__, + ) = backendAction() + + # process the backend action by updating `self._grid` + + Returns + ------- + + - `active_bus`: matrix with `type(self).n_sub` rows and `type(self).n_busbar_per_sub` columns. Each element + represents a busbar of the grid. ``False`` indicates that nothing is connected to this busbar and ``True`` + means that at least one element is connected to this busbar + - (prod_p, prod_v, load_p, load_q, storage): 5-tuple of Iterable to set the new values of generators, loads and storage units. 
+ - topo: iterable representing the target topology (in local bus, elements are ordered with their + position in the `topo_vect` vector) + + """ self._assign_0_to_disco_el() injections = ( self.prod_p, @@ -548,9 +909,94 @@ def __call__(self) -> Tuple[np.ndarray, return self.activated_bus, injections, topo, shunts def get_loads_bus(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once and your solver can easily move elements between busbars in a given + substation. + + This corresponds to option 2.a described (briefly) in :class:`_BackendAction`. + + In this setting, this function will give you the "local bus" id for each load that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_storages_bus` + + Examples + ----------- + + A typical use of `get_loads_bus` in `apply_action` is: + + .. code-block:: python + + def apply_action(self, backendAction: Union["grid2op.Action._backendAction._BackendAction", None]) -> None: + if backendAction is None: + return + + ( + active_bus, + (prod_p, prod_v, load_p, load_q, storage), + _, + shunts__, + ) = backendAction() + + # process the backend action by updating `self._grid` + ... 
+ + # now process the topology (called option 2.a in the doc): + + lines_or_bus = backendAction.get_lines_or_bus() + for line_id, new_bus in lines_or_bus: + # connect "or" side of "line_id" to (local) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + lines_ex_bus = backendAction.get_lines_ex_bus() + for line_id, new_bus in lines_ex_bus: + # connect "ex" side of "line_id" to (local) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + storages_bus = backendAction.get_storages_bus() + for el_id, new_bus in storages_bus: + # connect storage id `el_id` to (local) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + gens_bus = backendAction.get_gens_bus() + for el_id, new_bus in gens_bus: + # connect generator id `el_id` to (local) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + loads_bus = backendAction.get_loads_bus() + for el_id, new_bus in loads_bus: + # connect load id `el_id` to (local) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + # continue implementation of `apply_action` + + """ if self._loads_bus is None: - self._loads_bus = ValueStore(self.n_load, dtype=dt_int) - self._loads_bus.copy_from_index(self.current_topo, self.load_pos_topo_vect) + self._loads_bus = ValueStore(type(self).n_load, dtype=dt_int) + self._loads_bus.copy_from_index(self.current_topo, type(self).load_pos_topo_vect) return self._loads_bus def _aux_to_global(self, value_store, to_subid) -> ValueStore: @@ -559,54 +1005,413 @@ def _aux_to_global(self, value_store, to_subid) -> ValueStore: return value_store def get_loads_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. 
+ + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each load that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + ----------- + + A typical use of `get_loads_bus_global` in `apply_action` is: + + .. code-block:: python + + def apply_action(self, backendAction: Union["grid2op.Action._backendAction._BackendAction", None]) -> None: + if backendAction is None: + return + + ( + active_bus, + (prod_p, prod_v, load_p, load_q, storage), + _, + shunts__, + ) = backendAction() + + # process the backend action by updating `self._grid` + ... + + # now process the topology (called option 2.b in the doc): + + lines_or_bus = backendAction.get_lines_or_bus_global() + for line_id, new_bus in lines_or_bus: + # connect "or" side of "line_id" to (global) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + lines_ex_bus = backendAction.get_lines_ex_bus_global() + for line_id, new_bus in lines_ex_bus: + # connect "ex" side of "line_id" to (global) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... 
+ + storages_bus = backendAction.get_storages_bus_global() + for el_id, new_bus in storages_bus: + # connect storage id `el_id` to (global) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + gens_bus = backendAction.get_gens_bus_global() + for el_id, new_bus in gens_bus: + # connect generator id `el_id` to (global) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + loads_bus = backendAction.get_loads_bus_global() + for el_id, new_bus in loads_bus: + # connect load id `el_id` to (global) bus `new_bus` in self._grid + self._grid.something(...) + # or + self._grid.something = ... + + # continue implementation of `apply_action` + + """ tmp_ = self.get_loads_bus() - return self._aux_to_global(tmp_, self.load_to_subid) + return self._aux_to_global(tmp_, type(self).load_to_subid) def get_gens_bus(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once and your solver can easily move elements between busbars in a given + substation. + + This corresponds to option 2.a described (briefly) in :class:`_BackendAction`. + + In this setting, this function will give you the "local bus" id for each generator that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_storages_bus` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus` + + """ if self._gens_bus is None: - self._gens_bus = ValueStore(self.n_gen, dtype=dt_int) - self._gens_bus.copy_from_index(self.current_topo, self.gen_pos_topo_vect) + self._gens_bus = ValueStore(type(self).n_gen, dtype=dt_int) + self._gens_bus.copy_from_index(self.current_topo, type(self).gen_pos_topo_vect) return self._gens_bus def get_gens_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each generator that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus_global` + """ + tmp_ = copy.deepcopy(self.get_gens_bus()) - return self._aux_to_global(tmp_, self.gen_to_subid) + return self._aux_to_global(tmp_, type(self).gen_to_subid) def get_lines_or_bus(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once and your solver can easily move elements between busbars in a given + substation. + + This corresponds to option 2.a described (briefly) in :class:`_BackendAction`. + + In this setting, this function will give you the "local bus" id for each line (or side) that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_storages_bus` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus` + + """ if self._lines_or_bus is None: - self._lines_or_bus = ValueStore(self.n_line, dtype=dt_int) + self._lines_or_bus = ValueStore(type(self).n_line, dtype=dt_int) self._lines_or_bus.copy_from_index( - self.current_topo, self.line_or_pos_topo_vect + self.current_topo, type(self).line_or_pos_topo_vect ) return self._lines_or_bus def get_lines_or_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each line (or side) that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus_global` + """ tmp_ = self.get_lines_or_bus() - return self._aux_to_global(tmp_, self.line_or_to_subid) + return self._aux_to_global(tmp_, type(self).line_or_to_subid) def get_lines_ex_bus(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once and your solver can easily move elements between busbars in a given + substation. + + This corresponds to option 2.a described (briefly) in :class:`_BackendAction`. + + In this setting, this function will give you the "local bus" id for each line (ex side) that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_storages_bus` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus` + + """ if self._lines_ex_bus is None: - self._lines_ex_bus = ValueStore(self.n_line, dtype=dt_int) + self._lines_ex_bus = ValueStore(type(self).n_line, dtype=dt_int) self._lines_ex_bus.copy_from_index( - self.current_topo, self.line_ex_pos_topo_vect + self.current_topo, type(self).line_ex_pos_topo_vect ) return self._lines_ex_bus def get_lines_ex_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each line (ex side) that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus_global` + """ tmp_ = self.get_lines_ex_bus() - return self._aux_to_global(tmp_, self.line_ex_to_subid) + return self._aux_to_global(tmp_, type(self).line_ex_to_subid) def get_storages_bus(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once and your solver can easily move elements between busbars in a given + substation. + + This corresponds to option 2.a described (briefly) in :class:`_BackendAction`. + + In this setting, this function will give you the "local bus" id for each storage unit that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus` + - :func:`_BackendAction.get_gens_bus` + - :func:`_BackendAction.get_lines_or_bus` + - :func:`_BackendAction.get_lines_ex_bus` + - :func:`_BackendAction.get_storages_bus` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus` + + """ if self._storage_bus is None: - self._storage_bus = ValueStore(self.n_storage, dtype=dt_int) - self._storage_bus.copy_from_index(self.current_topo, self.storage_pos_topo_vect) + self._storage_bus = ValueStore(type(self).n_storage, dtype=dt_int) + self._storage_bus.copy_from_index(self.current_topo, type(self).storage_pos_topo_vect) return self._storage_bus def get_storages_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each storage unit that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. 
seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus_global` + """ tmp_ = self.get_storages_bus() - return self._aux_to_global(tmp_, self.storage_to_subid) + return self._aux_to_global(tmp_, type(self).storage_to_subid) + + def get_shunts_bus_global(self) -> ValueStore: + """ + This function might be called in the implementation of :func:`grid2op.Backend.Backend.apply_action`. + + It is relevant when your solver exposes an API by "element types", for example when + you can set and access all loads at once, all generators at + once AND you can easily switch elements from one "busbar" to another in + the whole grid handled by your solver. + + This corresponds to option 2.b described in :class:`_BackendAction`. + + In this setting, this function will give you the "global bus" id for each shunt that + has been changed by the agent / time series / voltage controllers / opponent / etc. + + .. warning:: /!\\\\ Do not alter / modify / change / override this implementation /!\\\\ + + .. seealso:: + The other related functions: + + - :func:`_BackendAction.get_loads_bus_global` + - :func:`_BackendAction.get_gens_bus_global` + - :func:`_BackendAction.get_lines_or_bus_global` + - :func:`_BackendAction.get_lines_ex_bus_global` + - :func:`_BackendAction.get_storages_bus_global` + + Examples + --------- + + Some examples are given in the documentation of :func:`_BackendAction.get_loads_bus_global` + """ + tmp_ = self.shunt_bus + return self._aux_to_global(tmp_, type(self).shunt_to_subid) def _get_active_bus(self) -> None: + """ + .. 
warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + """ self.activated_bus[:, :] = False tmp = self.current_topo.values - 1 is_el_conn = tmp >= 0 @@ -619,8 +1424,10 @@ def _get_active_bus(self) -> None: def update_state(self, powerline_disconnected) -> None: """ .. warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ + + This is handled by the environment ! - Update the internal state. Should be called after the cascading failures + Update the internal state. Should be called after the cascading failures. """ if (powerline_disconnected >= 0).any(): diff --git a/grid2op/Action/baseAction.py b/grid2op/Action/baseAction.py index 085379f18..2f6bffc2f 100644 --- a/grid2op/Action/baseAction.py +++ b/grid2op/Action/baseAction.py @@ -2235,12 +2235,15 @@ def update(self, dict_): be used to modify a :class:`grid2op.Backend.Backend`. In all the following examples, we suppose that a valid grid2op environment is created, for example with: + .. code-block:: python import grid2op + from grid2op.Action import BaseAction + env_name = "l2rpn_case14_sandbox" # create a simple environment # and make sure every type of action can be used. - env = grid2op.make(action_class=grid2op.Action.Action) + env = grid2op.make(env_name, action_class=BaseAction) *Example 1*: modify the load active values to set them all to 1. You can replace "load_p" by "load_q", "prod_p" or "prod_v" to change the load reactive value, the generator active setpoint or the generator diff --git a/grid2op/Action/serializableActionSpace.py b/grid2op/Action/serializableActionSpace.py index 6c698eeb7..f163da11b 100644 --- a/grid2op/Action/serializableActionSpace.py +++ b/grid2op/Action/serializableActionSpace.py @@ -247,8 +247,10 @@ def _sample_storage_power(self, rnd_update=None): return rnd_update def _sample_raise_alarm(self, rnd_update=None): - """.. warning:: + """ + .. 
warning:: /!\\\\ Only valid with "l2rpn_icaps_2021" environment /!\\\\ + """ if rnd_update is None: rnd_update = {} @@ -257,6 +259,11 @@ def _sample_raise_alarm(self, rnd_update=None): return rnd_update def _sample_raise_alert(self, rnd_update=None): + """ + .. warning:: + Not available in all environments. + + """ if rnd_update is None: rnd_update = {} rnd_alerted_lines = self.space_prng.choice([True, False], self.dim_alerts).astype(dt_bool) diff --git a/grid2op/Backend/backend.py b/grid2op/Backend/backend.py index a0fab4ebd..976c79f98 100644 --- a/grid2op/Backend/backend.py +++ b/grid2op/Backend/backend.py @@ -172,6 +172,7 @@ def __init__(self, self._my_kwargs[k] = v #: .. versionadded:: 1.9.9 + #: #: A flag to indicate whether the :func:`Backend.cannot_handle_more_than_2_busbar` #: or the :func:`Backend.cannot_handle_more_than_2_busbar` #: has been called when :func:`Backend.load_grid` was called. @@ -180,6 +181,7 @@ def __init__(self, self._missing_two_busbars_support_info: bool = True #: .. versionadded:: 1.9.9 + #: #: There is a difference between this and the class attribute. #: You should not worry about the class attribute of the backend in :func:`Backend.apply_action` self.n_busbar_per_sub: int = DEFAULT_N_BUSBAR_PER_SUB @@ -1933,6 +1935,7 @@ def assert_grid_correct(self) -> None: .. 
warning:: /!\\\\ Internal, do not use unless you know what you are doing /!\\\\ This is done as it should be by the Environment + """ # lazy loading from grid2op.Action import CompleteAction diff --git a/grid2op/tests/test_n_busbar_per_sub.py b/grid2op/tests/test_n_busbar_per_sub.py index 0424a5140..b1bed8dbd 100644 --- a/grid2op/tests/test_n_busbar_per_sub.py +++ b/grid2op/tests/test_n_busbar_per_sub.py @@ -1911,13 +1911,6 @@ def setUp(self) -> None: test=True, n_busbar=self.get_nb_bus(), _add_to_name=type(self).__name__ + f'_{self.get_nb_bus()}') - # param = self.env.parameters - # param.NB_TIMESTEP_COOLDOWN_SUB = 0 - # param.NB_TIMESTEP_COOLDOWN_LINE = 0 - # param.MAX_LINE_STATUS_CHANGED = 9999999 - # param.MAX_SUB_CHANGED = 99999999 - # self.env.change_parameters(param) - # self.env.change_forecast_parameters(param) self.seed = 0 self.env.reset(**self.get_reset_kwargs()) self.list_loc_bus = list(range(1, type(self.env).n_busbar_per_sub + 1)) From cd41187e0fd42d28e41354e9091d0743d36727e8 Mon Sep 17 00:00:00 2001 From: DONNOT Benjamin Date: Mon, 26 Feb 2024 17:31:56 +0100 Subject: [PATCH 3/5] WIP: doc MDP [skip ci] --- docs/mdp.rst | 57 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 51 insertions(+), 6 deletions(-) diff --git a/docs/mdp.rst b/docs/mdp.rst index 96ee74705..462c106e2 100644 --- a/docs/mdp.rst +++ b/docs/mdp.rst @@ -19,17 +19,62 @@ mathematical model behind grid2op. Grid2op is a software whose aim is to make experiments on powergrid, mainly sequential decision making, as easy as possible. -This problem has been modeled as a "Markov Decision Process" (MDP) and one some cases -"Partially Observable Markov Decision Process" (POMDP) or -"Constrainted Markov Decision Process" (CMDP) and (work in progress) even -"Decentralized (Partially Observable) Markov Decision Process" (Dec-(PO)MDP). 
+We chose to model this sequential decision making problem as a +"*Markov Decision Process*" (MDP) and in some cases +"*Partially Observable Markov Decision Process*" (POMDP) or +"*Constrained Markov Decision Process*" (CMDP) and (work in progress) even +"*Decentralized (Partially Observable) Markov Decision Process*" (Dec-(PO)MDP). + +Definitions +~~~~~~~~~~~~ + +In an MDP an "agent" / "automaton" / "algorithm" / "policy" takes some action :math:`a_t \in \mathcal{A}`. This +action is processed by the environment, which updates its internal state from :math:`s_t \in \mathcal{S}` +to :math:`s_{t+1} \in \mathcal{S}` and +computes a so-called *reward* :math:`r_{t+1} \in [0, 1)`. + +.. note:: + By stating the dynamics of the environment this way, we ensure the "*Markovian*" property: the + state :math:`s_{t+1}` is determined by the knowledge of the previous state :math:`s_{t}` and the + action :math:`a_{t}` + +.. note:: + More formally, everything written above can be stochastic: + + - :math:`a_t \sim \pi_{\theta}(s_t)` where :math:`\pi_{\theta}(\cdot)` is the "policy" parametrized by + some parameters :math:`\theta` that outputs here a probability distribution (depending on the + state of the environment :math:`s_t`) over all the actions :math:`\mathcal{A}` + - :math:`s_{t+1} \sim \mathcal{L}(s_t, a_t)` where :math:`\mathcal{L}(s_t, a_t)` is a probability distribution + over :math:`\mathcal{S}` representing the likelihood of the "next state" given the current state and the action + of the "policy" + + +This tuple +:math:`(s_{t+1}, r_{t+1})` is then given to the "agent" / "automaton" / "algorithm" which in turn produces the action :math:`a_{t+1}` + +This alternation :math:`\dots \to a \to (s, r) \to a \to \dots` is done for a certain number of "steps" called :math:`T`. + +We will call the list :math:`s_{1} \to a_1 \to (s_2, r_2) \to \dots \to a_{T-1} \to (s_{T}, r_T)` +an "episode". + + In this section, we will suppose that: -#. 
there a "simulator" [informatically, this is the Backend, detailed in :ref:`backend-module`] +#. there is a "simulator" [informatically, this is the Backend, detailed in the :ref:`backend-module` section of the documentation] that is able to compute some informations (*eg* flows on powerlines, active production value of generators etc.) from some other information given by the Environment (see :ref:`environment-module` for details about the - way the `Environment` is coded and :class:`grid2op.Action._backendAction._BackendAction` ) + way the `Environment` is coded and refer to :class:`grid2op.Action._backendAction._BackendAction` for list + of all available informations informatically available for the solver). +#. some + +To make a parrallel with some other available environments you can view: + +#. The "simulator" represents the physics as in all `"mujoco" environments `_ + *eg* `Ant `_ or + `Inverted Pendulum `_ The "simulator" is really the same + concept in grid2op and in these environments. +#. Modeling sequential decisions From f528b20fbfa313ceebf9f8161203f4c8383101a4 Mon Sep 17 00:00:00 2001 From: DONNOT Benjamin Date: Tue, 27 Feb 2024 17:56:48 +0100 Subject: [PATCH 4/5] progress on MDP in doc, still not there [skip ci] --- docs/mdp.rst | 130 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 100 insertions(+), 30 deletions(-) diff --git a/docs/mdp.rst b/docs/mdp.rst index 462c106e2..94eed2f09 100644 --- a/docs/mdp.rst +++ b/docs/mdp.rst @@ -25,77 +25,147 @@ We chose to model this sequential decision making probleme as a "*Constrainted Markov Decision Process*" (CMDP) and (work in progress) even "*Decentralized (Partially Observable) Markov Decision Process*" (Dec-(PO)MDP). -Definitions -~~~~~~~~~~~~ +General notations +~~~~~~~~~~~~~~~~~~~~ + +There are different ways to define an MDP. In this paragraph we introduce the notations that we will use. 
In an MDP an "agent" / "automaton" / "algorithm" / "policy" takes some action :math:`a_t \in \mathcal{A}`. This action is processed by the environment and update its internal state from :math:`s_t \in \mathcal{S}` to :math:`s_{t+1} \in \mathcal{S}` and -computes a so-called *reward* :math:`r_{t+1} \in [0, 1)`. +computes a so-called *reward* :math:`r_{t+1} \in [0, 1]`. .. note:: By stating the dynamic of the environment this way, we ensure the "*Markovian*" property: the state :math:`s_{t+1}` is determined by the knowledge of the previous state :math:`s_{t}` and the action :math:`a_{t}` +This tuple +:math:`(s_t, r_t)` is then given to the "agent" / "automaton" / "algorithm" which in turns produce the action :math:`a_{t+1}` + .. note:: More formally even, everything written can be stochastic: - :math:`a_t \sim \pi_{\theta}(s_t)` where :math:`\pi_{\theta}(\cdot)` is the "policy" parametrized by some parameters :math:`\theta` that outputs here a probability distribution (depending on the state of the environment :math:`s_t`) over all the actions `\mathcal{A}` - - :math:`s_{t+1} \sim \mathcal{L}(s_t, a_t)` where :math:`\mathcal{L}(s_t, a_t)` is a probability distribution + - :math:`s_{t+1} \sim \mathcal{L}_S(s_t, a_t)` where :math:`\mathcal{L}_S(s_t, a_t)` is a probability distribution over :math:`\mathcal{S}` representing the likelyhood if the "next state" given the current state and the action of the "policy" - + - :math:`r_{t+1} \sim \mathcal{L}_R(s_t, s_{t+1}, a_t)` is the reward function indicating "how good" + was the transition from :math:`s_{t}` to :math:`s_{t+1}` by taking action :math:`a_t` -This tuple -:math:`(s_t, r_t)` is then given to the "agent" / "automaton" / "algorithm" which in turns produce the action :math:`a_{t+1}` This alternation :math:`\dots \to a \to (s, r) \to a \to \dots` is done for a certain number of "steps" called :math:`T`. We will call the list :math:`s_{1} \to a_1 \to (s_2, r_2) \to \dots \to a_{T-1} \to (s_{T}, r_T)` -an "episode". 
+an "**episode**". + +Formally, the knowledge of: + +- :math:`\mathcal{S}`, the "state space" +- :math:`\mathcal{A}`, the "action space" +- :math:`\mathcal{L}_s(s, a)`, sometimes called the "transition kernel", the probability + distribution (over :math:`\mathcal{S}`) that gives the next + state after taking action :math:`a` in state :math:`s` +- :math:`\mathcal{L}_r(s, s', a)`, sometimes called the "reward kernel", + the probability distribution (over :math:`[0, 1]`) that gives + the reward :math:`r` after taking action :math:`a` in state :math:`s`, which leads to state :math:`s'` +- :math:`T \in \mathbb{N}^*`, the maximum number of steps for an episode + +defines an MDP. We will detail all of them in the section :ref:`mdp-def` below. + +In grid2op, there is a special case where a grid state cannot be computed (either because it is physically infeasible +or because the resulting state would be unrealistic). This can be modeled relatively easily in the MDP formulation +above if we add a "terminal state" :math:`s_{\emptyset}` to the state space, :math:`\mathcal{S}_{new} := \mathcal{S} \cup \left\{ s_{\emptyset} \right\}`, and add the transitions +:math:`\mathcal{L}_s(s_{\emptyset}, a) = \text{Dirac}(s_{\emptyset}) ~ \forall a \in \mathcal{A}`, +stating that once the agent lands in this "terminal state" the game is over: it stays there until the +end of the scenario. + +We can also define the reward kernel in this state, for example with +:math:`\mathcal{L}_r(s_{\emptyset}, s', a) = \text{Dirac}(0) ~ \forall s' \in \mathcal{S}, a \in \mathcal{A}` and +:math:`\mathcal{L}_r(s, s_{\emptyset}, a) = \text{Dirac}(0) ~ \forall s \in \mathcal{S}, a \in \mathcal{A}`, which +states that there is nothing to be gained from being in this terminal state. + +Unless specified otherwise, we will not enter into these details in the following explanation and take them as a +"prerequisite", as this can be defined in general. 
We will focus on the definition of :math:`\mathcal{S}`, +:math:`\mathcal{A}`, :math:`\mathcal{L}_s(s, a)` and :math:`\mathcal{L}_r(s, s', a)` by leaving out the +"terminal state". + +.. note:: + In the grid2op implementation, this "terminal state" is not directly implemented. Instead, the first Observation leading + to this state is marked as "done" (the flag `obs.done` is set to `True`). + + No other "observation" will be given by + grid2op after an observation with `obs.done` set to `True`, and the environment needs to be "reset". + This is consistent with the gymnasium implementation. -In this section, we will suppose that: +The main goal of a finite-horizon MDP is then to find a policy :math:`\pi \in \Pi` that, given states :math:`s` and rewards :math:`r`, +outputs an action :math:`a` such that (*NB* here :math:`\Pi` denotes the set of all considered policies for this +MDP): -#. there is a "simulator" [informatically, this is the Backend, detailed in the :ref:`backend-module` section of the documentation] - that is able to compute some informations (*eg* flows on powerlines, active production value of generators etc.) - from some other information given by the Environment (see :ref:`environment-module` for details about the - way the `Environment` is coded and refer to :class:`grid2op.Action._backendAction._BackendAction` for list - of all available informations informatically available for the solver). -#. some +.. math:: + :nowrap: -To make a parrallel with some other available environments you can view: + \begin{align*} + \min_{\pi \in \Pi} ~& \sum_{t=1}^T r_t \\ + \text{s.t.} ~ \\ + & \forall t, a_t \sim \pi (s_{t}) & \text{policy produces the action} \\ + & \forall t, s_{t+1} \sim \mathcal{L}_S(s_t, a_t) & \text{environment produces next state} \\ + & \forall t, r_{t+1} \sim \mathcal{L}_r(s_t, a_t, s_{t+1}) & \text{environment produces next reward} \\ + \end{align*} -#. 
The "simulator" represents the physics as in all `"mujoco" environments `_ - *eg* `Ant `_ or - `Inverted Pendulum `_ The "simulator" is really the same - concept in grid2op and in these environments. -#. +Specific notations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +To define "the" MDP modeled by grid2op, we also need to define some other concepts that will be used to define the +state space :math:`\mathcal{S}` or transition kernel :math:`\mathcal{L}_s(s, a)` for example. -Modeling sequential decisions -------------------------------- +A Simulator +++++++++++++ -TODO +We need a so called "simulator". +Informatically, this is represented by the `Backend` inside the grid2op environment (more information +about the `Backend` is detailed in the :ref:`backend-module` section of the documentation). -Inputs -~~~~~~~~~~ +This simulator is able to compute some informations that are part of the state +space :math:`\mathcal{S}` (*eg* flows on powerlines, active production value of generators etc.) +and thus are used in the computation of the transition kernel. -A simulator -++++++++++++ +TODO how to model it. + +.. This simulator is also used when implementing the transition kernel. Some part of the state space + + +.. other information given by the Environment (see :ref:`environment-module` for details about the +.. way the `Environment` is coded and refer to :class:`grid2op.Action._backendAction._BackendAction` for list +.. of all available informations informatically available for the solver). + + +To make a parallel with similar concepts "simulator", +represents the physics as in all `"mujoco" environments `_ +*eg* `Ant `_ or +`Inverted Pendulum `_ . This is the same concept +here excepts that it solves powerflows. + +Some Time Series ++++++++++++++++++ TODO -B Time Series -++++++++++++++ +.. 
_mdp-def: + +Modeling sequential decisions +------------------------------- TODO + +Inputs +~~~~~~~~~~ + Markov Decision process ~~~~~~~~~~~~~~~~~~~~~~~~ From 46adcdee5d9abae84977ce664ee9c9ef97e348b2 Mon Sep 17 00:00:00 2001 From: DONNOT Benjamin Date: Thu, 29 Feb 2024 16:00:21 +0100 Subject: [PATCH 5/5] still improving MDP doc --- docs/chronics.rst | 6 ++-- docs/mdp.rst | 73 ++++++++++++++++++++++++++++++++++++++--------- 2 files changed, 63 insertions(+), 16 deletions(-) diff --git a/docs/chronics.rst b/docs/chronics.rst index 428852556..8a13f5674 100644 --- a/docs/chronics.rst +++ b/docs/chronics.rst @@ -1,7 +1,9 @@ .. currentmodule:: grid2op.Chronics -Chronics -=================================== +.. _time-series-module: + +Time series (formerly called "chronics") +========================================= This page is organized as follow: diff --git a/docs/mdp.rst b/docs/mdp.rst index 94eed2f09..d85a193ec 100644 --- a/docs/mdp.rst +++ b/docs/mdp.rst @@ -109,7 +109,7 @@ MDP): :nowrap: \begin{align*} - \min_{\pi \in \Pi} ~& \sum_{t=1}^T r_t \\ + \max_{\pi \in \Pi} ~& \sum_{t=1}^T \mathbb{E} \left[ r_t \right] \\ \text{s.t.} ~ \\ & \forall t, a_t \sim \pi (s_{t}) & \text{policy produces the action} \\ & \forall t, s_{t+1} \sim \mathcal{L}_S(s_t, a_t) & \text{environment produces next state} \\ @@ -134,14 +134,17 @@ This simulator is able to compute some informations that are part of the state space :math:`\mathcal{S}` (*eg* flows on powerlines, active production value of generators etc.) and thus are used in the computation of the transition kernel. -TODO how to model it. +We can model this simulator with a function :math:`\text{Sim}` that takes as input some data from an +"input space" :math:`\mathcal{S}_{\text{im}}^{(\text{in})}` and returns +data in :math:`\mathcal{S}_{\text{im}}^{(\text{out})}`. -.. This simulator is also used when implementing the transition kernel. Some part of the state space - - -.. 
other information given by the Environment (see :ref:`environment-module` for details about the -.. way the `Environment` is coded and refer to :class:`grid2op.Action._backendAction._BackendAction` for list -.. of all available informations informatically available for the solver). +.. note:: + In grid2op, we do not impose the "shape" of :math:`\mathcal{S}_{\text{im}}^{(\text{in})}`; this includes + the format used to read the grid file from the hard drive, the equations solved and the way + these equations are used. Everything here is "free": grid2op only requires that the simulator + (wrapped in a `Backend`) understands the "format" sent by grid2op (through a + :class:`grid2op.Action._backendAction._BackendAction`) and is able to expose + to grid2op some of its internal variables (accessed with the `***_infos()` methods of the backend). To make a parallel with similar concepts "simulator", @@ -153,21 +156,63 @@ here excepts that it solves powerflows. Some Time Series +++++++++++++++++ -TODO +Another type of data that we need to define "the" grid2op MDP is the "time series", implemented in the `chronics` +grid2op module documented on the page +:ref:`time-series-module`, with some complements given on the :ref:`doc_timeseries` page as well. + +These time series define exactly what would happen if the grid were a +"copper plate" without any constraints. Said differently, they provide what each consumer would +consume and what each producer would produce if they could all be connected together with +infinite "bandwidth", without any constraints on the powerlines etc. + +In particular, grid2op supposes that these "time series" are balanced, in the sense that the producers +produce just the right amount (electrical power cannot really be stored) for the consumers to consume, +and this at each step. It also supposes that all the "constraints" of the producers are met. 
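The balance assumption above can be illustrated with a minimal sketch in plain Python. This is not grid2op code: the helper `is_balanced`, the tolerance `tol` and the toy values are hypothetical, chosen only to show what "balanced at each step" means under the "copper plate" assumption.

```python
# Illustrative only: `is_balanced` and the toy numbers below are hypothetical,
# they are not part of grid2op.

def is_balanced(gen_p, load_p, tol=1e-6):
    """For each step, tell whether total production matches total consumption.

    gen_p[t][i] is the active production (MW) of generator i at step t,
    load_p[t][j] is the active consumption (MW) of load j at step t.
    """
    return [abs(sum(g) - sum(l)) <= tol for g, l in zip(gen_p, load_p)]


# 3 steps, 2 generators, 3 loads
gen_p = [[50.0, 30.0], [55.0, 35.0], [60.0, 30.0]]
load_p = [[20.0, 25.0, 35.0], [30.0, 30.0, 30.0], [30.0, 30.0, 35.0]]

balanced = is_balanced(gen_p, load_p)
print(balanced)
```

In this toy example the first two steps are balanced (80 MW and 90 MW on both sides), while the last one is not (90 MW produced for 95 MW consumed), so such a series would not satisfy the assumption stated above.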
+ +These time series are typically generated outside of grid2op, for example using the `chronix2grid `_ +python package (or anything else). + + +Formally, we denote by :math:`\mathcal{X}_t` the value of all these time series at time :math:`t`. These +exogenous data consist of: + +- generator active production (in MW), for each generator +- load active power consumption (in MW), for each load +- load reactive power consumption (in MVAr), for each load +- \* generator voltage setpoint / target (in kV) + +.. note:: + \* This last item can be adapted "on demand" by the environment through the `voltage controler` module. + But for the sake of modeling, it can be treated as external / exogenous data. + +To draw a parallel with similar concepts in other RL environments, these "time series" can represent the layout of the maze +in pacman, the positions of the platforms in "mario-like" 2d games, the different turns and the width of the road in a car game etc. +This is the "base" of the levels in most games. + +Finally, for most released environments, a lot of different :math:`\mathcal{X}` are available. By default, each time the +environment is "reset" (the user wants to move to the next scenario), a new :math:`\mathcal{X}` is used (this behaviour +can be changed, see the section :ref:`environment-module-chronics-info` of the documentation for more information). .. _mdp-def: Modeling sequential decisions ------------------------------- -TODO +As we said in the introduction of this page, we will model a given scenario in grid2op. 
We have at our disposal: + +- a simulator, which is represented as a function :math:`\text{Sim} : \mathcal{S}_{\text{im}}^{(\text{in})} \to \mathcal{S}_{\text{im}}^{(\text{out})}` +- some time series :math:`\mathcal{X} = \left\{ \mathcal{X}_t \right\}_{1 \leq t \leq T}` -Inputs -~~~~~~~~~~ +And we need to define the MDP through the definition of: -Markov Decision process -~~~~~~~~~~~~~~~~~~~~~~~~ +- :math:`\mathcal{S}`, the "state space" +- :math:`\mathcal{A}`, the "action space" +- :math:`\mathcal{L}_s(s, a)`, sometimes called the "transition kernel", the probability + distribution (over :math:`\mathcal{S}`) that gives the next + state after taking action :math:`a` in state :math:`s` +- :math:`\mathcal{L}_r(s, s', a)`, sometimes called the "reward kernel", + the probability distribution (over :math:`[0, 1]`) that gives + the reward :math:`r` after taking action :math:`a` in state :math:`s`, which leads to state :math:`s'` Extensions -----------