Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
deploy: metatensor/metatensor@a0a2ddf
0 parents
commit 1dc9ae8
Showing
23,282 changed files
with
8,222,994 additions
and
0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
docs.metatensor.org |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# metatensor-docs | ||
|
||
Documentation website for metatensor. This is in a separate repository to limit | ||
the size of the main repository. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<!DOCTYPE html> | ||
<html> | ||
|
||
<head> | ||
<meta charset="utf-8" /> | ||
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" /> | ||
<meta http-equiv="refresh" content="0;URL=latest/index.html" /> | ||
</head> | ||
|
||
<body></body> | ||
</html> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# Sphinx build info version 1 | ||
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done. | ||
config: 28d93f123c0060d0856cc677117cf63a | ||
tags: 645f666f9bcd5a90fca523b33c5a78b7 |
Binary file added
BIN
+114 KB
...12d8aa642bb1ca6b280f72a86489285b15a499742d2fa03121c9e445-fig_2-running-ase-md_001.json.gz
Binary file not shown.
Binary file added
BIN
+168 KB
...c5ad578e618c92d1cd275ae6b43c2056c47886420ad8044-fig_3-atomistic-model-with-nl_002.json.gz
Binary file not shown.
237 changes: 237 additions & 0 deletions
237
latest/_downloads/06a401252cd11c041771bae30d8fcd80/2-handling-sparsity.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,237 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"\n\n# Handling sparsity\n\nThe one-sentence introduction to metatensor mentions that this is a \"self-describing\n**sparse** tensor data format\". The `previous tutorial <core-tutorial-first-steps>`\nexplained the self-describing part of the format, and in this tutorial we will explore\nwhat makes metatensor a sparse format, and how to remove the corresponding sparsity when\nrequired.\n\nLike in the `previous tutorial <core-tutorial-first-steps>`, we will load the data\nwe need from a file. The code used to generate this file can be found below:\n\n.. details:: Show the code used to generate the :file:`radial-spectrum.npz` file\n\n ..\n\n The data was generated with `rascaline`_, a package to compute atomistic\n representations for machine learning applications.\n\n\n .. literalinclude:: radial-spectrum.py.example\n :language: python\n\nThe file contains a representation of three molecules called the radial spectrum. The atom\n$i$ is represented by the radial spectrum $R_i^\\alpha$, which is an\nexpansion of the neighbor density $\\rho_i^\\alpha(r)$ on a set of radial basis\nfunctions $f_n(r)$:\n\n\\begin{align}R_i^\\alpha(n) = \\int f_n(r) \\rho_i^\\alpha(r) dr\\end{align}\n\nThe density $\\rho_i^\\alpha(r)$ associated with all neighbors of species\n$\\alpha$ of the atom $i$ (each neighbor is replaced with a Gaussian function\ncentered on the neighbor $g(r_{ij})$) is defined as:\n\n\\begin{align}\\rho_i^\\alpha(r) = \\sum_{j \\in \\text{ neighborhood of i }} g(r_{ij})\n \\delta_{\\alpha_j,\\alpha}\\end{align}\n\n\nThe exact mathematical details above don't matter too much for this tutorial; the main\npoint is that this representation treats atomic species as completely independent,\neffectively using the neighbor species $\\alpha$ for `one-hot encoding`_.\n\n\n.. py:currentmodule:: metatensor\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import ase\nimport ase.visualize.plot\nimport matplotlib.pyplot as plt\nimport numpy as np\n\nimport metatensor" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We will work on the radial spectrum representation of three molecules in our system:\na carbon monoxide, an oxygen molecule and a nitrogen molecule.\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"atoms = ase.Atoms(\n \"COO2N2\",\n positions=[(0, 0, 0), (1.2, 0, 0), (0, 6, 0), (1.1, 6, 0), (6, 0, 0), (7.3, 0, 0)],\n)\n\nfig, ax = plt.subplots(figsize=(3, 3))\nase.visualize.plot.plot_atoms(atoms, ax)\nax.set_axis_off()\nplt.show()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Sparsity in ``TensorMap``\n\nThe radial spectrum representation has two keys: ``center_type`` indicating the\nspecies of the central atom (atom $i$ in the equations); and\n``neighbor_type`` indicating the species of the neighboring atoms (atom $j$\nin the equations).\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"radial_spectrum = metatensor.load(\"radial-spectrum.npz\")\n\nprint(radial_spectrum)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This shows the first level of sparsity in ``TensorMap``: block sparsity.\n\nOut of all possible combinations of ``center_type`` and ``neighbor_type``, some\nare missing, such as ``center_type=7, neighbor_type=8``. This is because we are\nusing a spherical cutoff of 2.5 \u00c5, and as such there are no oxygen neighbor atoms\nclose enough to the nitrogen centers. This means that all the corresponding radial\nspectrum coefficients $R_i^\\alpha(n)$ will be zero (since the neighbor density\n$\\rho_i^\\alpha(r)$ is zero everywhere).\n\nInstead of wasting memory space by storing all of these zeros explicitly, we simply\navoid creating the corresponding blocks from the get-go and save a lot of memory!\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's now look at the block containing the representation for oxygen centers and\ncarbon neighbors:\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"block = radial_spectrum.block(center_type=8, neighbor_type=6)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Naively, this block should contain samples for all oxygen atoms (since\n``center_type=8``); in practice we only have a single sample!\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(block.samples)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"There is a second level of sparsity happening here, using a format related to\n[coordinate sparse arrays (COO format)](COO_). Since there is only one oxygen atom\nwith carbon neighbors, we only include this atom in the samples, and the\ndensity/radial spectrum coefficient for all the other oxygen atoms is assumed to be\nzero.\n\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Making the data dense again\n\nSometimes, we might have to use data in a sparse metatensor format with code that does\nnot understand this sparsity. One solution is to convert the data to a dense format,\nmaking the zeros explicit as much as possible. Metatensor provides functions to\nremove the keys sparsity, and the metadata needed to remove the sample sparsity.\n\nFirst, the sample sparsity can be removed block by block by creating a new array full\nof zeros, and copying the data according to the indices in ``block.samples``:\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"dense_block_data = np.zeros((len(atoms), block.values.shape[1]))\n\n# only copy the non-zero data stored in the block\ndense_block_data[block.samples[\"atom\"]] = block.values\n\nprint(dense_block_data)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Alternatively, we can undo the keys sparsity with\n:py:meth:`TensorMap.keys_to_samples` and :py:meth:`TensorMap.keys_to_properties`,\nwhich merge multiple blocks along the samples or properties dimensions respectively.\n\nWhich of these functions to call depends on the data you are handling.\nTypically, one-hot encoding (the ``neighbor_type`` key here) should be merged\nalong the properties dimension; and keys that define subsets of the samples\n(``center_type``) should be merged along the samples dimension.\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"dense_radial_spectrum = radial_spectrum.keys_to_samples(\"center_type\")\ndense_radial_spectrum = dense_radial_spectrum.keys_to_properties(\"neighbor_type\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"After calling these two functions, we now have a :py:class:`TensorMap` with a single\nblock and no keys:\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(dense_radial_spectrum)\n\nblock = dense_radial_spectrum.block()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We can see that the resulting dense data array contains a lot of zeros (and has a\nwell-defined block-sparse structure):\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"with np.printoptions(precision=3):\n print(block.values)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"And using the metadata attached to the block, we can understand which part of the data\nis zero and why. For example, the lower-right corner of the array corresponds to\nnitrogen atoms (the last two samples):\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(block.samples.print(max_entries=-1))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"And these two bottom rows are zero everywhere, except in the part representing the\nnitrogen neighbor density:\n\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"collapsed": false | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"print(block.properties.print(max_entries=-1))" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.12.5" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |
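The two densification steps described in the notebook above can be sketched with NumPy alone. This is only an illustration under stated assumptions: the helper names `densify_samples` and `merge_along_properties` and the toy values are hypothetical, the data is not the contents of radial-spectrum.npz, and the code is not how metatensor implements `TensorMap.keys_to_properties`.

```python
import numpy as np


def densify_samples(sample_atoms, values, n_atoms):
    """Scatter the rows of a sample-sparse block into one row per atom."""
    dense = np.zeros((n_atoms, values.shape[1]), dtype=values.dtype)
    # rows for atoms that are missing from `sample_atoms` stay at zero
    dense[sample_atoms] = values
    return dense


def merge_along_properties(blocks, n_atoms):
    """Stack densified blocks side by side, one group of columns per key."""
    keys = sorted(blocks)
    columns = [densify_samples(*blocks[key], n_atoms) for key in keys]
    return keys, np.hstack(columns)


# Hypothetical toy data, made up for this sketch:
# neighbor type -> (atom indices present in the block, values for these atoms)
n_atoms = 6
toy_blocks = {
    6: (np.array([1]), np.array([[0.5, 0.2]])),
    8: (np.array([0, 2, 3]), np.array([[0.4, 0.1], [0.3, 0.3], [0.3, 0.3]])),
    7: (np.array([4, 5]), np.array([[0.6, 0.2], [0.6, 0.2]])),
}

# remove the sample sparsity of a single block
dense_carbon = densify_samples(*toy_blocks[6], n_atoms)

# remove the block sparsity by merging all blocks along the properties
keys, dense_all = merge_along_properties(toy_blocks, n_atoms)

print(dense_carbon.shape)
print(keys, dense_all.shape)
```

In this toy data the rows for atoms 4 and 5 of `dense_all` are zero everywhere except in the columns of key 7, which mirrors the block-sparse structure the notebook points out for the nitrogen atoms.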
Binary file added
BIN
+10.2 KB
latest/_downloads/0cc0ba974aa15743303d66de4a7f7bb9/radial-spectrum.npz
Binary file not shown.
Binary file added
BIN
+16.8 KB
latest/_downloads/0e60c5708ef3ea3957a153ed3f69f975/2-handling-sparsity.zip
Binary file not shown.