Sparse representation for machine learning the properties of defects in 2D materials

Quickstart

Open in Constructor Research Platform (a cloud service for scientific computations)

Summary

In the paper, we propose sparse representation as a way to reduce the computational cost and improve the accuracy of machine learning the properties of defects in 2D materials. The code in this project implements the method and provides a rigorous comparison of its performance against a set of baselines.

Two-dimensional materials offer a promising platform for the next generation of (opto-)electronic devices and other high-technology applications. One of the most exciting characteristics of 2D crystals is the ability to tune their properties via controllable introduction of defects. However, the search space for such structures is enormous, and ab-initio computations are prohibitively expensive. We propose a machine learning approach for rapid estimation of the properties of a 2D material given its lattice structure and defect configuration. The method represents the configuration of a 2D material with defects in a way that allows a neural network to train quickly and accurately. We compare our methodology with state-of-the-art approaches and demonstrate a reduction in energy prediction error by a factor of at least 3.7. Our approach is also an order of magnitude more resource-efficient than its contenders in both training and inference.

The main idea of our method is to use a point cloud of defects as the input to the predictive model, as opposed to the usual point cloud of atoms or an expert-crafted feature vector.
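The idea can be illustrated with a minimal sketch. This is not the code from this repository: the function name `defect_point_cloud`, the `VACANCY` marker, and the toy coordinates are hypothetical, and the sketch assumes the unrelaxed defective structure shares site positions with the pristine lattice. It keeps only the sites whose species differ from the pristine ones and records the original and the new species for each.

```python
import numpy as np

# Hypothetical marker for a removed atom (an assumption of this sketch).
VACANCY = 0


def defect_point_cloud(positions, pristine_z, defective_z):
    """Keep only the sites where the defective structure differs from the pristine one.

    positions:   (N, 3) site coordinates of the pristine lattice
    pristine_z:  (N,) atomic numbers of the pristine lattice
    defective_z: (N,) atomic numbers after introducing defects, VACANCY for removed atoms
    Returns the defect coordinates and an (M, 2) array of [original Z, new Z].
    """
    positions = np.asarray(positions, dtype=float)
    pristine_z = np.asarray(pristine_z)
    defective_z = np.asarray(defective_z)
    is_defect = pristine_z != defective_z
    features = np.stack([pristine_z[is_defect], defective_z[is_defect]], axis=1)
    return positions[is_defect], features


# Toy six-site cell with made-up coordinates: Mo = 42, S = 16, W = 74.
positions = np.array(
    [[0, 0, 0], [1, 0, 1.5], [1, 0, -1.5], [3, 0, 0], [4, 0, 1.5], [4, 0, -1.5]],
    dtype=float,
)
pristine_z = np.array([42, 16, 16, 42, 16, 16])
# One S vacancy and one Mo -> W substitution.
defective_z = np.array([42, VACANCY, 16, 74, 16, 16])

cloud_positions, cloud_features = defect_point_cloud(positions, pristine_z, defective_z)
print(cloud_positions.shape)
print(cloud_features)
```

In this toy case the model input shrinks from six sites to two. At the defect densities listed for the dataset, most sites are unchanged, which is where the savings of a sparse input come from.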

We compare our approach to state-of-the-art generic structure-property prediction algorithms: GemNet, SchNet, MEGNet, and matminer+CatBoost.

For the dataset, we use 2DMD. It consists of the most popular 2D materials: MoS2, WSe2, h-BN, GaSe, InSe, and black phosphorus (BP), with point defect densities in the range of 2.5% to 12.5%. We use DFT to relax the structures and compute the defect formation energy and the HOMO-LUMO gap. The ML algorithms predict those quantities, taking unrelaxed structures as input.

Using the pre-trained models

Library

Use the library https://github.com/HSE-LAMBDA/MEGNetSparse/

This repository

Clone the repository
Set up the environment
Download the weights and data:

dvc pull datasets/checkpoints/combined_mixed_all_train/formation_energy_per_site/megnet_pytorch/sparse/05-12-2022_19-50-53/d6b7ce45/0.pth.dvc datasets/checkpoints/combined_mixed_all_train/homo_lumo_gap_min/megnet_pytorch/sparse/05-12-2022_19-50-53/831cc496/0.pth.dvc csv-cif-low-density-8x8 csv-cif-no-spin-500-data csv-cif-spin-500-data train-only-split

The data are not needed for predictions, and are only used to generate new structures in the example notebook.

Open the notebook. It contains the prediction code, along with generation of new structures with defects, and example processing of user-uploaded data.

Citation

Please cite the following two papers if you use the code or the data:

Kazeev, N., Al-Maeeni, A.R., Romanov, I. et al. Sparse representation for machine learning the properties of defects in 2D materials. npj Comput Mater 9, 113 (2023). https://doi.org/10.1038/s41524-023-01062-z

Huang, P., Lukin, R., Faleev, M. et al. Unveiling the complex structure-property correlation of defects in 2D materials based on high throughput datasets. npj 2D Mater Appl 7, 6 (2023). https://doi.org/10.1038/s41699-023-00369-1

Internal links

The overall design is documented in an obsolete flowchart
Some design decisions are outlined in an obsolete RFC
Project log is in Notion
Paper in Overleaf
