Ensure examples run

materialsproject · Nov 10, 2021 · 9383eca
1 parent 996693b
commit 9383eca

Showing 5 changed files with 86 additions and 52 deletions.
diff --git a/README.md b/README.md
@@ -39,8 +39,8 @@ It is easy to customise and compose any of the above workflows.
 
 Workflows in atomate2 are written using the [jobflow] library. Workflows are generated using
 `Maker` objects, that have a consistent API for modifying input settings and chaining
-workflows together.  Below, we demonstrate how to run a band structure workflow as
-detailed in the [RelaxBandStructure] section of the documentation.  In total, 4 VASP
+workflows together.  Below, we demonstrate how to run a band structure workflow
+(see the [documentation][RelaxBandStructure] for more details). In total, 4 VASP
 calculations will be performed:
 
 1. A structural optimisation.
@@ -57,21 +57,21 @@ from pymatgen.core import Structure
 
 # construct a rock salt MgO structure
 mgo_structure = Structure(
-  lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
-  species=["Mg", "O"],
-  coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
+    lattice=[[0, 2.13, 2.13], [2.13, 0, 2.13], [2.13, 2.13, 0]],
+    species=["Mg", "O"],
+    coords=[[0, 0, 0], [0.5, 0.5, 0.5]],
 )
 
 # make a band structure flow to optimise the structure and obtain the band structure
 bandstructure_flow = RelaxBandStructureMaker().make(mgo_structure)
 
 # run the job
-run_locally(bandstructure_flow)
+run_locally(bandstructure_flow, create_folders=True)
 ```
 
-In this example, we run execute the workflow immediately. In most cases, you will want
-to perform calculations on many materials simulatenously. To achieve this, all atomate2
-workflows can be run using the [FireWorks] software. See the
+In this example, we execute the workflow immediately. In many cases, you might want
+to perform calculations on several materials simultaneously. To achieve this, all
+atomate2 workflows can be run using the [FireWorks] software. See the
 [documentation][atomate2_fireworks] for more details.
 
 ## Installation

diff --git a/docs/src/_static/MgO-bandstructure.png b/docs/src/_static/MgO-bandstructure.png
diff --git a/docs/src/_static/MgO-dos.png b/docs/src/_static/MgO-dos.png
diff --git a/docs/src/user/install.rst b/docs/src/user/install.rst
@@ -19,7 +19,7 @@ FireWorks libraries. Briefly:
 
 Running and writing your own workflows are covered in later tutorials. For now, these
 topics will be covered in enough depth to get you set up and to help you know where to
-troubleshoot if you are having problems.
+troubleshoot if you're having problems.
 
 Note that this installation tutorial is VASP-centric since almost all functionality
 currently in atomate2 pertains to VASP.
@@ -66,7 +66,7 @@ academic computing clusters as well as systems with a MOM-node style architectur
 VASP
 ----
 
-To get access to VASP on supercomputing resources typically requires that you are added
+To get access to VASP on supercomputing resources typically requires that you're added
 to a user group on the system you work on after your license is verified. Ensure that
 you have access to the VASP executable and that it is functional before starting this
 tutorial.
@@ -78,34 +78,30 @@ MongoDB_ is a NoSQL database that stores each database entry as a document, whic
 represented in the JSON format (the formatting is similar to a dictionary in Python).
 Atomate2 uses MongoDB to:
 
-* to create database of calculation results.
-* store the workflows that you want to run as well as their state details (through
+* Create a database of calculation results.
+* Store the workflows that you want to run as well as their state details (through
   FireWorks - optional).
 
-MongoDB must be running and available to accept connections whenever you are running
+MongoDB must be running and available to accept connections whenever you're running
 workflows. Thus, it is strongly recommended that you have a server to run MongoDB or
 (simpler) use a hosting service. Your options are:
 
-* use a commercial service to host your MongoDB instance. These are typically the
+- Use a commercial service to host your MongoDB instance. These are typically the
   easiest to use and offer high quality service but require payment for larger
-  databases. `MongoDB Atlas <https://www.mongodb.com/cloud/atlas>`_ offers free 500 MB
-  which is certainly enough to get started for small to medium size projects, and it is
-  easy to upgrade or migrate your database if you do exceed the free allocation.
-* contact your supercomputing center to see if they offer MongoDB hosting (e.g., NERSC
-  has this, Google "request NERSC MongoDB database")
-* self-host a MongoDB server
-
-If you are just starting, we suggest the first (with a free plan) or second option
+  databases. `MongoDB Atlas <https://www.mongodb.com/cloud/atlas>`_ offers a free 500 MB
+  server which is certainly enough to get started for small to medium size projects, and
+  it is easy to upgrade or migrate your database if you exceed the free allocation.
+- Contact your supercomputing center to see if they offer MongoDB hosting (e.g., NERSC
+  has this, Google "request NERSC MongoDB database").
+- Self-host a MongoDB server.
+
+If you're just starting, we suggest the first (with a free plan) or second option
 (if available to you). The third option will require you to open up network settings to
 accept outside connections properly which can sometimes be tricky.
 
-Next, create a new database and set up two new username/password combinations:
-
-- an admin user
-- a read-only user
-
-Keep a record of your credentials - we will configure jobflow to connect to them in a
-later step. Also make sure you note down the hostname and port for the MongoDB instance.
+Next, create a new database and set up an account with admin access. Keep a record of
+your credentials - we will configure jobflow to connect to them in a later step. Also
+make sure you note down the hostname and port for the MongoDB instance.
 
 .. note::
 
@@ -116,15 +112,15 @@ later step. Also make sure you note down the hostname and port for the MongoDB i
     centers (e.g., LLNL, PNNL, ARCHER) will run into issues. If you run into connection
     issues later in this tutorial, some options are:
 
-    * contact your computing center to review their security policy to allow connections
-      from your MongoDB server (best resolution)
-    * host your Mongo database on a machine that you are able to securely connect to,
-      e.g. on the supercomputing network itself (ask a system administrator for help)
-    * use a proxy service to forward connections from the MongoDB --> login node -->
+    - Contact your computing center to review their security policy to allow connections
+      from your MongoDB server (best resolution).
+    - Host your Mongo database on a machine that you're able to securely connect to,
+      e.g. on the supercomputing network itself (ask a system administrator for help).
+    - Use a proxy service to forward connections from the MongoDB --> login node -->
       compute node (you might try, for example, `the mongo-proxy tool
       <https://github.com/bakks/mongo-proxy>`_).
-    * set up an ssh tunnel to forward connections from allowed machines (the tunnel must
-      be kept alive at all times you are running workflows)
+    - Set up an ssh tunnel to forward connections from allowed machines (the tunnel must
+      be kept alive at all times you're running workflows).
 
 
 .. _MongoDB: https://docs.mongodb.com/manual/
@@ -234,7 +230,7 @@ jobflow.yaml
 ------------
 
 The ``jobflow.yaml`` file contains the credentials of the MongoDB server that will store
-calculation outputs. The ``jobflow.json`` file requires you to enter the basic database
+calculation outputs. The ``jobflow.yaml`` file requires you to enter the basic database
 information as well as what to call the main collection that results are kept in (e.g.
 ``outputs``). Note that you should replace the whole ``<<PROPERTY>>`` definition with
 your own settings.
@@ -260,6 +256,40 @@ your own settings.
             password: <<PASSWORD>>
             collection_name: outputs_blobs
 
+.. note::
+
+    If you're using a MongoDB instance hosted on Atlas (using the free plan linked above)
+    the connection format is slightly different. Instead, your ``jobflow.yaml`` file
+    should contain the following.
+
+    .. code-block:: yaml
+
+        JOB_STORE:
+            docs_store:
+              type: MongoURIStore
+              uri: mongodb+srv://<<USERNAME>>:<<PASSWORD>>@<<HOST>>/<<DB_NAME>>?retryWrites=true&w=majority
+              collection_name: outputs
+            additional_stores:
+              data:
+                type: GridFSURIStore
+                uri: mongodb+srv://<<USERNAME>>:<<PASSWORD>>@<<HOST>>/<<DB_NAME>>?retryWrites=true&w=majority
+                collection_name: outputs_blobs
+
+    The URI key may be different based on the Atlas database you deployed. You can
+    see the template for the URI string by clicking on "Databases" (under "Deployment"
+    in the left hand menu) then "Connect" then "Connect your application". Select
+    Python as the driver and 3.12 as the version. The connection string should now be
+    displayed in the box.
+
+    Note that the username and password are not your login account details for Atlas.
+    Instead you must add a new database user by selecting "Database Access" (under
+    "Security" in the left hand menu) and then "Add a new database user".
+
+    Secondly, Atlas only allows connections from known IP addresses. You must therefore
+    add the IP address of your cluster (and any other computers you'll be connecting
+    from) by clicking "Network Access" (under "Security" in the left hand menu) and then
+    "Add IP address".
+
 Atomate2 uses two database collections, one for small documents (such as elastic
 tensors, structures, and energies) called the ``docs`` store and another for large
 documents such as band structures and density of states called the ``data`` store.
@@ -313,7 +343,7 @@ where ``<<INSTALL_DIR>>`` is your installation directory.
 Configure pymatgen
 ==================
 
-If you are planning to run VASP, the last configuration step is to configure pymatgen to
+If you're planning to run VASP, the last configuration step is to configure pymatgen to
 (required) find the pseudopotentials for VASP and (optional) set up your API key from
 the `Materials Project`_.
 
@@ -383,7 +413,7 @@ or work directory) and create a file called ``relax.py`` containing:
     relax_job = RelaxMaker().make(si_structure)
 
     # run the job
-    run_locally(relax_job)
+    run_locally(relax_job, create_folders=True)
 
 The ``run_locally`` function is a jobflow command that will execute the workflow on
 the current computing resource.
@@ -437,7 +467,7 @@ output.
 
     # query the job store
     result = store.query_one(
-        query={"output.formula_pretty": "Si"}, properties=["output.output.energy_per_atom"]
+        {"output.formula_pretty": "Si"}, properties=["output.output.energy_per_atom"]
     )
     print(result)
 

diff --git a/docs/src/user/running-workflows.rst b/docs/src/user/running-workflows.rst
@@ -86,7 +86,7 @@ Create a Python script named ``mgo_bandstructure.py`` with the following content
     bandstructure_flow = RelaxBandStructureMaker().make(mgo_structure)
 
     # run the job
-    run_locally(bandstructure_flow)
+    run_locally(bandstructure_flow, create_folders=True)
 
 
 .. _Running the workflow:
@@ -132,43 +132,47 @@ code, either as a script or on the Python prompt.
 
     from jobflow import SETTINGS
     from pymatgen.electronic_structure.plotter import DosPlotter, BSPlotter
+    from pymatgen.electronic_structure.dos import CompleteDos
+    from pymatgen.electronic_structure.bandstructure import BandStructureSymmLine
 
-    store = jobflow.SETTINGS.JOB_STORE
+    store = SETTINGS.JOB_STORE
     store.connect()
 
     # get the uniform bandstructure from the database
     result = store.query_one(
-        query={"output.task_label": "band structure uniform"},
+        {"output.task_label": "non-scf uniform"},
         properties=["output.vasp_objects.dos"],
         load=True,  # DOS stored in the data store, so we need to explicitly load it
     )
-    dos = result["output"]["vasp_objects"]["dos"]
+    dos = CompleteDos.from_dict(result["output"]["vasp_objects"]["dos"])
 
     # plot the DOS
     dos_plotter = DosPlotter()
-    dos_plotter.add_dos(dos)
-    dos_plotter.save_plot("MgO-dos.pdf", xlim=(-10, 10))
+    dos_plotter.add_dos_dict(dos.get_element_dos())
+    dos_plotter.save_plot("MgO-dos.pdf", xlim=(-10, 10), img_format="pdf")
 
     # get the line mode bandstructure from the database
     result = store.query_one(
-        query={"output.task_label": "band structure line"},
+        {"output.task_label": "non-scf line"},
         properties=["output.vasp_objects.bandstructure"],
         load=True,  # BS stored in the data store, so we need to explicitly load it
     )
-    bandstructure = result["output"]["vasp_objects"]["bandstructure"]
+    bandstructure = BandStructureSymmLine.from_dict(
+        result["output"]["vasp_objects"]["bandstructure"]
+    )
 
     # plot the line mode band structure
     bs_plotter = BSPlotter(bandstructure)
-    bs_plotter.save_plot("MgO-bandstructure.pdf")
+    bs_plotter.save_plot("MgO-bandstructure.pdf", img_format="pdf")
 
 
 If you open the saved figures, you should see a plot of your DOS and bandstructure!
 
-.. figure:: _static/MgO-dos.png
+.. figure:: ../_static/MgO-dos.png
     :alt: MgO density of states
 
 
-.. figure:: _static/MgO-bandstructure.png
+.. figure:: ../_static/MgO-bandstructure.png
     :alt: MgO bandstructure
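
A note on the Atlas connection string added to ``install.rst`` in this commit: the ``uri`` value embeds the database username and password directly, and MongoDB connection strings require reserved characters in those fields (such as ``@``, ``:`` or ``/``) to be percent-encoded. The sketch below shows one way to fill in the template safely using only the Python standard library. The helper name ``build_atlas_uri`` and the example values are hypothetical and are not part of atomate2 or jobflow.

```python
from urllib.parse import quote_plus


def build_atlas_uri(username: str, password: str, host: str, db_name: str) -> str:
    """Fill in the mongodb+srv template from the jobflow.yaml note (hypothetical helper)."""
    # Percent-encode the credentials so reserved characters cannot break the URI.
    user = quote_plus(username)
    secret = quote_plus(password)
    return (
        f"mongodb+srv://{user}:{secret}@{host}/{db_name}"
        "?retryWrites=true&w=majority"
    )


# Placeholder values only; substitute your own database user and cluster host.
uri = build_atlas_uri(
    "atomate_user", "p@ss/word", "cluster0.example.mongodb.net", "atomate2"
)
print(uri)
```

The resulting string could then be pasted into both ``uri`` fields of the Atlas variant of ``jobflow.yaml``. The query options at the end are copied from the template in the diff and may differ for your deployment.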