DScribe is a Python package for transforming atomic structures into fixed-size numerical fingerprints. These fingerprints are often called "descriptors", and they can be used in tasks such as machine learning, visualization, and similarity analysis.
For more details and tutorials, visit the homepage at: https://singroup.github.io/dscribe/
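To make the idea of a fixed-size fingerprint concrete, here is a minimal sketch of a Coulomb-matrix-style fingerprint written in plain Python, without DScribe. It assumes the commonly used Coulomb matrix definition (diagonal `0.5 * Z**2.4`, off-diagonal `Z_i * Z_j / R_ij`), sorts rows and columns by row norm, and zero-pads to a fixed size. The function `coulomb_matrix`, its parameters, and the approximate geometries are hypothetical and chosen for illustration only; DScribe's own implementation may differ in conventions and output layout.

```python
import math


def coulomb_matrix(numbers, positions, n_atoms_max):
    """Illustrative helper (not DScribe's API): flattened, zero-padded
    Coulomb matrix with rows/columns sorted by descending row L2 norm."""
    n = len(numbers)
    if n > n_atoms_max:
        raise ValueError("structure has more atoms than n_atoms_max")

    # Build the n x n Coulomb matrix from atomic numbers and positions.
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * numbers[i] ** 2.4
            else:
                r_ij = math.dist(positions[i], positions[j])
                matrix[i][j] = numbers[i] * numbers[j] / r_ij

    # Sorting by row norm removes the dependence on the input atom order.
    norms = [math.sqrt(sum(v * v for v in row)) for row in matrix]
    order = sorted(range(n), key=lambda i: norms[i], reverse=True)

    # Zero-pad to n_atoms_max so every structure yields the same length.
    padded = [[0.0] * n_atoms_max for _ in range(n_atoms_max)]
    for a, i in enumerate(order):
        for b, j in enumerate(order):
            padded[a][b] = matrix[i][j]
    return [value for row in padded for value in row]


# Approximate geometries in angstrom, assumed for this example.
water_numbers = [8, 1, 1]
water_positions = [(0.0, 0.0, 0.119), (0.0, 0.763, -0.477), (0.0, -0.763, -0.477)]
h2_numbers = [1, 1]
h2_positions = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)]

fp_water = coulomb_matrix(water_numbers, water_positions, n_atoms_max=3)
fp_h2 = coulomb_matrix(h2_numbers, h2_positions, n_atoms_max=3)

# Same molecule with the atoms listed in a different order.
fp_water_permuted = coulomb_matrix(
    [1, 8, 1],
    [water_positions[1], water_positions[0], water_positions[2]],
    n_atoms_max=3,
)

print(len(fp_water), len(fp_h2))  # both fingerprints have n_atoms_max**2 entries
```

Both molecules map to vectors of the same length even though they contain different numbers of atoms, and reordering the atoms of water leaves its fingerprint unchanged. These two properties are what make such fingerprints convenient inputs for machine learning and similarity analysis.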
import numpy as np
from ase.build import molecule
from dscribe.descriptors import SOAP
from dscribe.descriptors import CoulombMatrix
# Define atomic structures
samples = [molecule("H2O"), molecule("NO2"), molecule("CO2")]
# Set up descriptors
cm_desc = CoulombMatrix(n_atoms_max=3, permutation="sorted_l2")
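# Note: DScribe 2.0 and later spell these arguments r_cut, n_max and l_max,
# and use centers instead of positions in create()
soap_desc = SOAP(species=["C", "H", "O", "N"], rcut=5, nmax=8, lmax=6, crossover=True)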
# Create descriptors as numpy arrays or sparse arrays
water = samples[0]
coulomb_matrix = cm_desc.create(water)
soap = soap_desc.create(water, positions=[0])
# Also works on multiple systems and can be parallelized across processes
coulomb_matrices = cm_desc.create(samples)
coulomb_matrices = cm_desc.create(samples, n_jobs=3)
# Compute SOAP only at the oxygen atoms of each system
oxygen_indices = [np.where(x.get_atomic_numbers() == 8)[0] for x in samples]
oxygen_soap = soap_desc.create(samples, oxygen_indices, n_jobs=3)
# Some descriptors also allow calculating derivatives with respect to atomic
# positions
der, des = soap_desc.derivatives(samples, method="auto", return_descriptor=True)
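As an illustration of what derivatives with respect to atomic positions mean, the sketch below differentiates a toy descriptor by central finite differences in plain Python. It is not DScribe's implementation: `toy_descriptor` (a Gaussian-smeared histogram of pair distances), `numerical_derivatives`, the step size `h`, and the approximate water geometry are all hypothetical choices made for this example, and the array layout DScribe returns may contain additional dimensions.

```python
import math


def toy_descriptor(positions, centers=(0.5, 1.0, 1.5, 2.0), sigma=0.2):
    """Hypothetical descriptor: Gaussian-smeared histogram of pair distances."""
    features = [0.0] * len(centers)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            for k, c in enumerate(centers):
                features[k] += math.exp(-((r - c) ** 2) / (2 * sigma ** 2))
    return features


def numerical_derivatives(positions, h=1e-5):
    """Central differences: result[i][a][k] is the derivative of feature k
    with respect to Cartesian component a of atom i."""
    derivatives = []
    for i in range(len(positions)):
        per_atom = []
        for a in range(3):
            # Displace one coordinate of one atom by +h and -h.
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][a] += h
            minus[i][a] -= h
            d_plus = toy_descriptor(plus)
            d_minus = toy_descriptor(minus)
            per_atom.append([(p - m) / (2 * h) for p, m in zip(d_plus, d_minus)])
        derivatives.append(per_atom)
    return derivatives


# Approximate water geometry in angstrom, assumed for this example.
water_positions = [(0.0, 0.0, 0.119), (0.0, 0.763, -0.477), (0.0, -0.763, -0.477)]

toy_des = toy_descriptor(water_positions)
toy_der = numerical_derivatives(water_positions)

# Dimensions: atoms, Cartesian components, features
print(len(toy_der), len(toy_der[0]), len(toy_der[0][0]))  # prints: 3 3 4
```

Because the toy descriptor depends only on interatomic distances, the derivatives summed over all atoms cancel for every feature. This is a quick sanity check for translational invariance.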
Descriptor | Spectrum | Derivatives |
---|---|---|
Coulomb matrix | ✔️ | ✔️ |
Sine matrix | ✔️ | |
Ewald matrix | ✔️ | |
Atom-centered Symmetry Functions (ACSF) | ✔️ | |
Smooth Overlap of Atomic Positions (SOAP) | ✔️ | ✔️ |
Many-body Tensor Representation (MBTR) | ✔️ | |
Local Many-body Tensor Representation (LMBTR) | ✔️ | |
Valle-Oganov descriptor | ✔️ | |
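In-depth installation instructions can be found in the documentation. In short, use one of the following: install from PyPI with pip, install from conda-forge with conda, or clone the repository and install from source.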
pip install dscribe
conda install -c conda-forge dscribe
git clone https://github.com/SINGROUP/dscribe.git
cd dscribe
git submodule update --init
pip install .