Ensure examples run
utf committed Nov 10, 2021
1 parent 996693b commit 9383eca
Showing 5 changed files with 86 additions and 52 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -39,8 +39,8 @@ It is easy to customise and compose any of the above workflows.

Workflows in atomate2 are written using the [jobflow] library. Workflows are generated using
`Maker` objects, which have a consistent API for modifying input settings and chaining
workflows together. Below, we demonstrate how to run a band structure workflow as
detailed in the [RelaxBandStructure] section of the documentation. In total, 4 VASP
workflows together. Below, we demonstrate how to run a band structure workflow
(see the [documentation][RelaxBandStructure] for more details). In total, 4 VASP
calculations will be performed:

1. A structural optimisation.
@@ -57,21 +57,21 @@ from pymatgen.core import Structure

# construct a rock salt MgO structure
mgo_structure = Structure(
lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
species=["Mg", "O"],
coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
species=["Mg", "O"],
coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
)

# make a band structure flow to optimise the structure and obtain the band structure
bandstructure_flow = RelaxBandStructureMaker().make(mgo_structure)

# run the job
run_locally(bandstructure_flow)
run_locally(bandstructure_flow, create_folders=True)
```

In this example, we execute the workflow immediately. In most cases, you will want
to perform calculations on many materials simultaneously. To achieve this, all atomate2
workflows can be run using the [FireWorks] software. See the
In this example, we execute the workflow immediately. In many cases, you might want
to perform calculations on several materials simultaneously. To achieve this, all
atomate2 workflows can be run using the [FireWorks] software. See the
[documentation][atomate2_fireworks] for more details.

## Installation
Binary file added docs/src/_static/MgO-bandstructure.png
Binary file added docs/src/_static/MgO-dos.png
94 changes: 62 additions & 32 deletions docs/src/user/install.rst
@@ -19,7 +19,7 @@ FireWorks libraries. Briefly:

Running and writing your own workflows are covered in later tutorials. For now, these
topics will be covered in enough depth to get you set up and to help you know where to
troubleshoot if you are having problems.
troubleshoot if you're having problems.

Note that this installation tutorial is VASP-centric since almost all functionality
currently in atomate2 pertains to VASP.
@@ -66,7 +66,7 @@ academic computing clusters as well as systems with a MOM-node style architectur
VASP
----

Getting access to VASP on supercomputing resources typically requires that you are added
Getting access to VASP on supercomputing resources typically requires that you're added
to a user group on the system you work on after your license is verified. Ensure that
you have access to the VASP executable and that it is functional before starting this
tutorial.
@@ -78,34 +78,30 @@ MongoDB_ is a NoSQL database that stores each database entry as a document, whic
represented in the JSON format (the formatting is similar to a dictionary in Python).
Atomate2 uses MongoDB to:

* to create database of calculation results.
* store the workflows that you want to run as well as their state details (through
* Create a database of calculation results.
* Store the workflows that you want to run as well as their state details (through
FireWorks - optional).

MongoDB must be running and available to accept connections whenever you are running
MongoDB must be running and available to accept connections whenever you're running
workflows. Thus, it is strongly recommended that you have a server to run MongoDB or
(simpler) use a hosting service. Your options are:

* use a commercial service to host your MongoDB instance. These are typically the
- Use a commercial service to host your MongoDB instance. These are typically the
easiest to use and offer high quality service but require payment for larger
databases. `MongoDB Atlas <https://www.mongodb.com/cloud/atlas>`_ offers free 500 MB
which is certainly enough to get started for small to medium size projects, and it is
easy to upgrade or migrate your database if you do exceed the free allocation.
* contact your supercomputing center to see if they offer MongoDB hosting (e.g., NERSC
has this, Google "request NERSC MongoDB database")
* self-host a MongoDB server

If you are just starting, we suggest the first (with a free plan) or second option
databases. `MongoDB Atlas <https://www.mongodb.com/cloud/atlas>`_ offers a free 500 MB
server which is certainly enough to get started for small to medium size projects, and
it is easy to upgrade or migrate your database if you exceed the free allocation.
- Contact your supercomputing center to see if they offer MongoDB hosting (e.g., NERSC
has this, Google "request NERSC MongoDB database").
- Self-host a MongoDB server.

If you're just starting, we suggest the first (with a free plan) or second option
(if available to you). The third option will require you to open up network settings to
accept outside connections properly which can sometimes be tricky.

Next, create a new database and set up two new username/password combinations:

- an admin user
- a read-only user

Keep a record of your credentials - we will configure jobflow to connect to them in a
later step. Also make sure you note down the hostname and port for the MongoDB instance.
Next, create a new database and set up an account with admin access. Keep a record of
your credentials - we will configure jobflow to connect to them in a later step. Also
make sure you note down the hostname and port for the MongoDB instance.

.. note::

@@ -116,15 +112,15 @@ later step. Also make sure you note down the hostname and port for the MongoDB i
centers (e.g., LLNL, PNNL, ARCHER) will run into issues. If you run into connection
issues later in this tutorial, some options are:

* contact your computing center to review their security policy to allow connections
from your MongoDB server (best resolution)
* host your Mongo database on a machine that you are able to securely connect to,
e.g. on the supercomputing network itself (ask a system administrator for help)
* use a proxy service to forward connections from the MongoDB --> login node -->
- Contact your computing center to review their security policy to allow connections
from your MongoDB server (best resolution).
- Host your Mongo database on a machine that you're able to securely connect to,
e.g. on the supercomputing network itself (ask a system administrator for help).
- Use a proxy service to forward connections from the MongoDB --> login node -->
compute node (you might try, for example, `the mongo-proxy tool
<https://github.com/bakks/mongo-proxy>`_).
* set up an ssh tunnel to forward connections from allowed machines (the tunnel must
be kept alive at all times you are running workflows)
- Set up an ssh tunnel to forward connections from allowed machines (the tunnel must
be kept alive at all times you're running workflows).
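
Before digging into firewall policies, it can help to confirm whether the machine you are on can open a TCP connection to the database at all. The following is a minimal sketch using only the Python standard library; the function name ``can_reach`` and the host and port in the usage line are illustrative assumptions, not part of atomate2 or jobflow. A successful TCP connection does not prove that authentication will work, only that the port is reachable.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical values: replace with the hostname and port you noted down earlier.
    print(can_reach("localhost", 27017))
```

Running this on both the login node and a compute node shows which of the two is blocked, which is useful information to pass to a system administrator.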


.. _MongoDB: https://docs.mongodb.com/manual/
@@ -234,7 +230,7 @@ jobflow.yaml
------------

The ``jobflow.yaml`` file contains the credentials of the MongoDB server that will store
calculation outputs. The ``jobflow.json`` file requires you to enter the basic database
calculation outputs. The ``jobflow.yaml`` file requires you to enter the basic database
information as well as what to call the main collection that results are kept in (e.g.
``outputs``). Note that you should replace the whole ``<<PROPERTY>>`` definition with
your own settings.
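
A forgotten ``<<PROPERTY>>`` marker is an easy mistake to make when filling in the template. As a rough sketch (the helper name ``unfilled_placeholders`` is hypothetical and not provided by jobflow), the text of the file can be scanned for any markers that are still present:

```python
import re

# Matches any leftover template marker such as <<HOSTNAME>> or <<PASSWORD>>.
PLACEHOLDER = re.compile(r"<<[A-Z_]+>>")


def unfilled_placeholders(config_text: str) -> list:
    """Return the distinct <<PROPERTY>> markers still present, in order of appearance."""
    seen = []
    for marker in PLACEHOLDER.findall(config_text):
        if marker not in seen:
            seen.append(marker)
    return seen


# Illustrative input only; in practice, read the text of your own jobflow.yaml.
example = "host: <<HOSTNAME>>\nport: 27017\npassword: <<PASSWORD>>\n"
print(unfilled_placeholders(example))  # ['<<HOSTNAME>>', '<<PASSWORD>>']
```

An empty list means every placeholder was replaced; it does not check that the values themselves are correct.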
@@ -260,6 +256,40 @@ your own settings.
password: <<PASSWORD>>
collection_name: outputs_blobs
.. note::

If you're using a MongoDB instance hosted on Atlas (using the free plan linked above), the
connection format is slightly different. Instead, your ``jobflow.yaml`` file should
contain the following.

.. code-block:: yaml

    JOB_STORE:
      docs_store:
        type: MongoURIStore
        uri: mongodb+srv://<<USERNAME>>:<<PASSWORD>>@<<HOST>>/<<DB_NAME>>?retryWrites=true&w=majority
        collection_name: outputs
      additional_stores:
        data:
          type: GridFSURIStore
          uri: mongodb+srv://<<USERNAME>>:<<PASSWORD>>@<<HOST>>/<<DB_NAME>>?retryWrites=true&w=majority
          collection_name: outputs_blobs

The URI key may be different based on the Atlas database you deployed. You can
see the template for the URI string by clicking on "Databases" (under "Deployment"
in the left hand menu) then "Connect" then "Connect your application". Select
Python as the driver and 3.12 as the version. The connection string should now be
displayed in the box.

Note that the username and password are not your login account details for Atlas.
Instead you must add a new database user by selecting "Database Access" (under
"Security" in the left hand menu) and then "Add a new database user".

Secondly, Atlas only allows connections from known IP addresses. You must therefore
add the IP address of your cluster (and any other computers you'll be connecting
from) by clicking "Network Access" (under "Security" in the left hand menu) and then
"Add IP address".

Atomate2 uses two database collections, one for small documents (such as elastic
tensors, structures, and energies) called the ``docs`` store and another for large
documents such as band structures and density of states called the ``data`` store.
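
For the Atlas connection string shown in the note above, one detail that is easy to miss is that usernames and passwords containing characters such as ``@`` or ``/`` need to be percent-escaped before they are placed in the URI; PyMongo's documentation recommends ``urllib.parse.quote_plus`` for this. The sketch below fills in the same template; the function name ``atlas_uri`` and all argument values are illustrative assumptions, and the query options should match whatever Atlas displays for your deployment.

```python
from urllib.parse import quote_plus


def atlas_uri(username: str, password: str, host: str, db_name: str) -> str:
    """Fill the mongodb+srv template from the note, percent-escaping the credentials."""
    user = quote_plus(username)
    pwd = quote_plus(password)
    return (
        f"mongodb+srv://{user}:{pwd}@{host}/{db_name}"
        "?retryWrites=true&w=majority"
    )


# Hypothetical credentials and host, for illustration only.
print(atlas_uri("alice", "p@ss/word", "cluster0.example.mongodb.net", "atomate2"))
```

The resulting string is what would go in both ``uri`` fields of ``jobflow.yaml``.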
@@ -313,7 +343,7 @@ where ``<<INSTALL_DIR>>`` is your installation directory.
Configure pymatgen
==================

If you are planning to run VASP, the last configuration step is to configure pymatgen to
If you're planning to run VASP, the last configuration step is to configure pymatgen to
(required) find the pseudopotentials for VASP and (optional) set up your API key from
the `Materials Project`_.

@@ -383,7 +413,7 @@ or work directory) and create a file called ``relax.py`` containing:
relax_job = RelaxMaker().make(si_structure)
# run the job
run_locally(relax_job)
run_locally(relax_job, create_folders=True)
The ``run_locally`` function is a jobflow command that will execute the workflow on
the current computing resource.
@@ -437,7 +467,7 @@ output.
# query the job store
result = store.query_one(
query={"output.formula_pretty": "Si"}, properties=["output.output.energy_per_atom"]
{"output.formula_pretty": "Si"}, properties=["output.output.energy_per_atom"]
)
print(result)
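
The ``properties`` argument uses dotted paths, so the value of interest ends up several levels deep in the returned document. Assuming the result is a plain nested dictionary that mirrors the requested path (an assumption about the returned shape, not something this diff shows), a small helper can pull the value out without chained indexing. The name ``get_nested`` and the sample document, including its energy value, are made up for illustration.

```python
def get_nested(document: dict, dotted_key: str, default=None):
    """Follow a dotted path such as "a.b.c" through nested dictionaries."""
    current = document
    for part in dotted_key.split("."):
        # Stop early if the path leaves the dictionary structure or a key is missing.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# Made-up document shaped like the requested property path; the number is a placeholder.
result = {"output": {"output": {"energy_per_atom": -5.4}}}
print(get_nested(result, "output.output.energy_per_atom"))
```

Returning a default rather than raising makes it easier to spot a query that matched a document without the expected field.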
26 changes: 15 additions & 11 deletions docs/src/user/running-workflows.rst
@@ -86,7 +86,7 @@ Create a Python script named ``mgo_bandstructure.py`` with the following content
bandstructure_flow = RelaxBandStructureMaker().make(mgo_structure)
# run the job
run_locally(bandstructure_flow)
run_locally(bandstructure_flow, create_folders=True)
.. _Running the workflow:
@@ -132,43 +132,47 @@ code, either as a script or on the Python prompt.
from jobflow import SETTINGS
from pymatgen.electronic_structure.plotter import DosPlotter, BSPlotter
from pymatgen.electronic_structure.dos import CompleteDos
from pymatgen.electronic_structure.bandstructure import BandStructureSymmLine
store = jobflow.SETTINGS.JOB_STORE
store = SETTINGS.JOB_STORE
store.connect()
# get the uniform bandstructure from the database
result = store.query_one(
query={"output.task_label": "band structure uniform"},
{"output.task_label": "non-scf uniform"},
properties=["output.vasp_objects.dos"],
load=True, # DOS stored in the data store, so we need to explicitly load it
)
dos = result["output"]["vasp_objects"]["dos"]
dos = CompleteDos.from_dict(result["output"]["vasp_objects"]["dos"])
# plot the DOS
dos_plotter = DosPlotter()
dos_plotter.add_dos(dos)
dos_plotter.save_plot("MgO-dos.pdf", xlim=(-10, 10))
dos_plotter.add_dos_dict(dos.get_element_dos())
dos_plotter.save_plot("MgO-dos.pdf", xlim=(-10, 10), img_format="pdf")
# get the line mode bandstructure from the database
result = store.query_one(
query={"output.task_label": "band structure line"},
{"output.task_label": "non-scf line"},
properties=["output.vasp_objects.bandstructure"],
load=True, # BS stored in the data store, so we need to explicitly load it
)
bandstructure = result["output"]["vasp_objects"]["bandstructure"]
bandstructure = BandStructureSymmLine.from_dict(
result["output"]["vasp_objects"]["bandstructure"]
)
# plot the line mode band structure
bs_plotter = BSPlotter(bandstructure)
bs_plotter.save_plot("MgO-bandstructure.pdf")
bs_plotter.save_plot("MgO-bandstructure.pdf", img_format="pdf")
If you open the saved figures, you should see a plot of your DOS and bandstructure!

.. figure:: _static/MgO-dos.png
.. figure:: ../_static/MgO-dos.png
:alt: MgO density of states


.. figure:: _static/MgO-bandstructure.png
.. figure:: ../_static/MgO-bandstructure.png
:alt: MgO bandstructure

