Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structure prediction fixes #336

Merged
merged 20 commits into from
Nov 30, 2024
Merged

Structure prediction fixes #336

merged 20 commits into from
Nov 30, 2024

Conversation

AntObi
Copy link
Collaborator

@AntObi AntObi commented Nov 30, 2024

Updating the Structure Prediction module

Description

This PR introduces the following:

  • Add new tutorial for the Structure Prediction module in the docs
  • Remove old examples folder
  • Update references to Materials Project API keys and queries in the Structure Prediction module
  • Update structure prediction test files
  • Add convenience functions to SmactStructure to enable easy conversion to Pymatgen Structure objects and to get reduced formulae

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

New tests have been added for new functions introduced

Test Configuration:

  • Python version: 3.10
  • Operating System: macOS

Reviewers

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a new tutorial on structure prediction to the documentation.
    • Introduced a new JSON file representing a crystal structure for iron (Fe).
  • Bug Fixes

    • Improved handling of API keys for querying the Materials Project, enhancing reliability.
  • Documentation

    • Updated tutorials and examples to provide clearer guidance on using the smact framework.
  • Data Updates

    • Extensive updates to various test files, including modifications to lattice parameters and atomic coordinates for materials like CaTiO3 and NaCl.

@AntObi AntObi added bug enhancement docs python Pull requests that update Python code feature labels Nov 30, 2024
Copy link
Contributor

coderabbitai bot commented Nov 30, 2024

Walkthrough

This pull request introduces a new tutorial entry for smact in the documentation, specifically under the "Tutorials" section. It also involves the deletion of two Jupyter notebooks related to structure prediction for garnet materials and Li-Garnet compounds. Additionally, several modifications are made to the StructureDB, CationMutator, and SmactStructure classes, enhancing API key handling, output information, and structure comparison. Various JSON and text files representing crystal structures are significantly updated or replaced, reflecting changes in atomic configurations and properties.

Changes

File Path Change Summary
docs/tutorials.rst Added new tutorial entry: tutorials/structure_prediction.
examples/Structure_Prediction/Garnet_Database.ipynb Deleted file; previously demonstrated garnet materials database creation.
examples/Structure_Prediction/Li-Garnet_Generator.ipynb Deleted file; previously implemented structure prediction for Li-Garnet compounds.
smact/structure_prediction/database.py Updated add_mp_icsd method for API key handling and simplified error handling.
smact/structure_prediction/mutation.py Updated unary_substitute method to yield additional information about substitutions.
smact/structure_prediction/structure.py Enhanced SmactStructure class with new methods and improved API key handling in from_mp.
smact/structure_prediction/utilities.py Removed convert_next_gen_mprest_data function and simplified imports.
smact/tests/files/BaTiO3.txt Replaced existing data points with new coordinates and quantities for Ba, O, and Ti.
smact/tests/files/CaTiO3.json Significant updates to lattice parameters and atomic site properties.
smact/tests/files/CaTiO3.txt Complete replacement of content with new coordinates and atom counts.
smact/tests/files/Fe.json New JSON file added representing a crystal structure for iron (Fe).
smact/tests/files/Fe.txt Significant modifications to the content with updated numerical values and coordinates for Fe.
smact/tests/files/NaCl.json Updated lattice parameters and atomic site representation, simplifying the structure.
smact/tests/files/NaCl.txt Complete replacement of content with new coordinates for Na and Cl.
smact/tests/files/pymatgen_structure.json Extensive modifications to the JSON structure, including updates to charge, lattice, and site entries.
smact/tests/test_structure.py Updated API key handling in test_from_mp to check for multiple environment variables.

Possibly related PRs

Suggested labels

refactor

Poem

🐰 In the meadow where knowledge grows,
A new tutorial now brightly glows.
With structure predictions, we leap and bound,
In the world of smact, new wisdom is found!
So hop along, dear friends, take a look,
A treasure awaits in this new, fine book! 📚✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Nov 30, 2024

Codecov Report

Attention: Patch coverage is 81.81818% with 10 lines in your changes missing coverage. Please review.

Project coverage is 77.45%. Comparing base (8bc5fc1) to head (1805282).
Report is 114 commits behind head on master.

Files with missing lines Patch % Lines
smact/structure_prediction/structure.py 75.00% 8 Missing ⚠️
smact/structure_prediction/database.py 85.71% 1 Missing ⚠️
smact/structure_prediction/mutation.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #336      +/-   ##
==========================================
+ Coverage   76.13%   77.45%   +1.32%     
==========================================
  Files          31       31              
  Lines        2564     2590      +26     
==========================================
+ Hits         1952     2006      +54     
+ Misses        612      584      -28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (5)
smact/structure_prediction/database.py (1)

Line range hint 283-286: Utilise the 'determine_oxi' parameter dynamically

In the parse_mprest function, the determine_oxi parameter is set to "BV" in the call to SmactStructure.from_py_struct, regardless of the function argument. To allow flexibility and adhere to the function's signature, pass the determine_oxi parameter to from_py_struct.

Apply this diff to make the function parameter effective:

try:
-    return SmactStructure.from_py_struct(data["structure"], determine_oxi="BV")
+    return SmactStructure.from_py_struct(data["structure"], determine_oxi=determine_oxi)
except:
    # Couldn't decorate with oxidation states
smact/structure_prediction/structure.py (3)

353-353: Combine exception message into a single string

The exception message at line 353 concatenates two strings. Combining them into a single string improves readability.

Apply this diff:

-    raise ValueError("Could not find composition in Materials Project Database, " "please supply a structure.")
+    raise ValueError("Could not find composition in Materials Project Database, please supply a structure.")

642-652: Optimise as_py_struct method by constructing Structure directly

The as_py_struct method converts the SmactStructure to a POSCAR string and then parses it back into a pymatgen Structure object. This may introduce unnecessary overhead and potential parsing errors. Consider constructing the pymatgen.Structure object directly from the internal data (e.g., lattice_mat and sites) to improve efficiency and reliability.


653-668: Improve reduced_formula method by avoiding redundant conversion

The reduced_formula method uses as_py_struct, which involves an intermediate conversion to POSCAR format and back. This may be inefficient. Consider computing the reduced formula directly from the species data within SmactStructure, avoiding the need to create a pymatgen Structure object.

smact/tests/files/CaTiO3.txt (1)

9-18: Validate atomic positions

The Cartesian coordinates appear to maintain proper Ca-O and Ti-O bond distances, but consider:

  1. Adding a comment about the space group
  2. Including validation against standard perovskite bond lengths
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8bc5fc1 and 82a5809.

⛔ Files ignored due to path filters (4)
  • examples/Structure_Prediction/Garnets.db is excluded by !**/*.db
  • examples/Structure_Prediction/Li-Garnet_Comps_sus.csv is excluded by !**/*.csv
  • examples/Structure_Prediction/MP_garnets.csv is excluded by !**/*.csv
  • examples/Structure_Prediction/_Materials Project .csv is excluded by !**/*.csv
📒 Files selected for processing (16)
  • docs/tutorials.rst (1 hunks)
  • examples/Structure_Prediction/Garnet_Database.ipynb (0 hunks)
  • examples/Structure_Prediction/Li-Garnet_Generator.ipynb (0 hunks)
  • smact/structure_prediction/database.py (2 hunks)
  • smact/structure_prediction/mutation.py (2 hunks)
  • smact/structure_prediction/structure.py (8 hunks)
  • smact/structure_prediction/utilities.py (0 hunks)
  • smact/tests/files/BaTiO3.txt (1 hunks)
  • smact/tests/files/CaTiO3.json (1 hunks)
  • smact/tests/files/CaTiO3.txt (1 hunks)
  • smact/tests/files/Fe.json (1 hunks)
  • smact/tests/files/Fe.txt (1 hunks)
  • smact/tests/files/NaCl.json (1 hunks)
  • smact/tests/files/NaCl.txt (1 hunks)
  • smact/tests/files/pymatgen_structure.json (1 hunks)
  • smact/tests/test_structure.py (2 hunks)
💤 Files with no reviewable changes (3)
  • smact/structure_prediction/utilities.py
  • examples/Structure_Prediction/Li-Garnet_Generator.ipynb
  • examples/Structure_Prediction/Garnet_Database.ipynb
✅ Files skipped from review due to trivial changes (1)
  • smact/tests/files/Fe.json
🧰 Additional context used
🪛 LanguageTool
smact/tests/files/Fe.txt

[duplication] ~8-~8: Posíbel erro tipográfico: hai unha palabra repetida.
Context: ....24450189 2.15554064 0.0 Fe 1 Cartesian 0.0 0.0 0.0 Fe

(WORD_REPETITION)

smact/tests/files/NaCl.txt

[uncategorized] ~1-~1: Wenn Sie das Wort zusammenschreiben möchten, entfernen Sie das Leerzeichen. Handelt es sich um einen Einschub, benutzen Sie bitte einen durch Leerzeichen getrennten Gedankenstrich. Handelt es sich z. B. um eine Silbe, benutzen Sie bitte Anführungszeichen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0....

(WORT1_BINDESTRICH_SPACE_WORT2)


[uncategorized] ~1-~1: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0.0 ...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~2-~2: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0.0 0.0 ...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~3-~3: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0.0 0.0 3.50219 Cl Na 1 ...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~4-~4: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: ...Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0.0 0.0 3.50219 Cl Na 1 1 Cartesian 1.75...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~5-~5: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: ... 0.0 0.0 0.0 3.50219 0.0 0.0 0.0 3.50219 Cl Na 1 1 Cartesian 1.751095 1.751095 1....

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~6-~6: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: ....0 0.0 3.50219 0.0 0.0 0.0 3.50219 Cl Na 1 1 Cartesian 1.751095 1.751095 1.751095...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~7-~7: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: ....0 3.50219 0.0 0.0 0.0 3.50219 Cl Na 1 1 Cartesian 1.751095 1.751095 1.751095 Cl1...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~8-~8: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: ... 0.0 0.0 0.0 3.50219 Cl Na 1 1 Cartesian 1.751095 1.751095 1.751095 Cl1- 0.0 0.0 ...

(AI_DE_GGEC_UNNECESSARY_SPACE)


[uncategorized] ~9-~9: Wenn Sie das Wort zusammenschreiben möchten, entfernen Sie das Leerzeichen. Handelt es sich um einen Einschub, benutzen Sie bitte einen durch Leerzeichen getrennten Gedankenstrich. Handelt es sich z. B. um eine Silbe, benutzen Sie bitte Anführungszeichen.
Context: ... 1 Cartesian 1.751095 1.751095 1.751095 Cl1- 0.0 0.0 0.0 Na1+

(WORT1_BINDESTRICH_SPACE_WORT2)

🔇 Additional comments (18)
smact/tests/files/Fe.txt (1)

1-9: Data format appears correct

The content and structure of the Fe.txt file are appropriate for representing the Fe crystal structure. The coordinates and formatting are consistent and should be compatible with the codebase.

🧰 Tools
🪛 LanguageTool

[duplication] ~8-~8: Posíbel erro tipográfico: hai unha palabra repetida.
Context: ....24450189 2.15554064 0.0 Fe 1 Cartesian 0.0 0.0 0.0 Fe

(WORD_REPETITION)

docs/tutorials.rst (1)

14-14: New tutorial entry added correctly

The addition of tutorials/structure_prediction to the table of contents is correctly formatted and appropriately placed within the tutorials list.

smact/tests/files/BaTiO3.txt (1)

1-18: File format and data appear correct.

The structural data for BaTiO₃ is correctly formatted and consistent with expected crystallographic standards.

smact/tests/files/NaCl.json (2)

23-34: Removal of oxidation states from species definitions.

The oxidation states have been omitted from the species definitions (e.g., "Cl-" to "Cl", "Na+" to "Na"). Ensure that this modification does not affect functionalities reliant on oxidation state information, such as charge balance calculations.


7-9: Verify lattice parameters for NaCl structure.

The lattice constants have been updated to 3.50219 Å, which is significantly smaller than the standard lattice parameter for NaCl (approximately 5.64 Å). Please confirm that this change is intentional, as it may impact simulations and analyses using this structure.

✅ Verification successful

Verify lattice parameter value for NaCl test structure

The lattice parameter of 3.50219 Å appears to be a test structure value used specifically in the test suite, as evidenced by its location in the smact/tests/ directory. While this value differs from the typical experimental lattice parameter of NaCl (5.64 Å), its presence in a test file suggests it's intentionally set for testing purposes rather than representing a physical structure.

  • The file is located in smact/tests/files/NaCl.json and is used for testing structure manipulation functions
  • The corresponding test file smact/tests/test_structure.py shows it's used for testing mutation operations
  • The value appears consistently throughout the test files and is likely intentional for the test suite
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Identify NaCl structure files and compare lattice parameters.

# Find NaCl JSON files
fd -e json 'NaCl' smact/ | xargs rg '"a":'

# Expected: Lattice parameter 'a' close to 5.64 Å

Length of output: 60


Script:

#!/bin/bash
# Let's check if there are any comments, documentation, or test files that might explain this lattice parameter choice
rg -i "lattice.*parameter|3.50219|crystal.*struct" smact/ -A 2 -B 2

# Also check for any README or documentation files
fd "README|\.md$" . | xargs rg -i "nacl|lattice.*parameter|crystal.*struct" -A 2 -B 2

Length of output: 11427


Script:

#!/bin/bash
# Let's check the rocksalt function implementation since NaCl has rocksalt structure
ast-grep --pattern 'def rocksalt(shannon_radius):
  $$$
}'

# Also check if there are any test files specifically for NaCl structure
rg -i "test.*nacl|nacl.*test" smact/tests/ -A 2 -B 2

Length of output: 494

smact/tests/files/pymatgen_structure.json (2)

23-90: Extensive changes to species and atomic coordinates.

The species have been changed to Ca, Ti, and O, and atomic coordinates have been extensively modified. Confirm that this transformation is deliberate and that all dependent tests and functionalities are updated to reflect this new structure.


15-17: Unusual lattice angles in the structure.

The lattice angles α, β, and γ are set to approximately 53.59 degrees, which is uncommon for standard crystal systems (typically 90 or 120 degrees). Please verify that these angles are accurate and represent the intended crystal structure.

smact/structure_prediction/mutation.py (1)

Line range hint 415-430: Highlight breaking change in 'unary_substitute' return value

The unary_substitute method now yields tuples containing (SmactStructure, probability, original species, new species), whereas previously it may have yielded fewer elements. This modification alters the method's interface and can break existing code that consumes this generator. Ensure that any code utilising unary_substitute is updated to handle the new tuple structure.

To identify all usages of unary_substitute and assess the impact, run:

smact/structure_prediction/structure.py (6)

5-5: Importing os module for environment variable access

The addition of import os is appropriate for accessing environment variables, such as MP_API_KEY, enhancing flexibility in API key management.


15-16: Importing SETTINGS and aliasing Structure as pmg_Structure

Importing SETTINGS and aliasing Structure as pmg_Structure improves code clarity and allows convenient access to pymatgen settings and structures.


24-24: Importing get_sign function is appropriate

The import of get_sign from .utilities is necessary for the _format_style method, enhancing code modularity and maintainability.


112-125: Improved coordinate comparison in eq method using np.allclose

The updated __eq__ method now uses np.allclose to compare site coordinates with a tolerance, enhancing the robustness of floating-point comparisons between structures.


309-309: Updating from_mp method to default api_key and improve type hints

The from_mp method signature now defaults api_key to None and includes | None in the type hints, improving usability by allowing the API key to be optional with accurate type annotations.

Also applies to: 318-319


330-333: Retrieving api_key from environment variables or pymatgen SETTINGS

The method now retrieves the api_key from the MP_API_KEY environment variable or SETTINGS.get("PMG_MAPI_KEY") if none is provided, enhancing flexibility and user convenience.

smact/tests/files/CaTiO3.txt (2)

1-7: Structure header and stoichiometry look correct

The file follows the standard crystallographic format with:

  • Proper ionic species specification (Ca2+, O2-, Ti4+)
  • Correct stoichiometric ratio (2:6:2 reduces to 1:3:1 for CaTiO3)
  • Charge-balanced structure (total charge = 0)

3-5: Verify lattice parameters against experimental data

The lattice vectors define an orthorhombic-like cell. Consider validating these parameters against known experimental data for CaTiO3 perovskite structure.

✅ Verification successful

Let me search for any documentation or test files that might contain experimental data references for CaTiO3.


Let me check the actual structure data and any test validations for this CaTiO3 structure.


Let me check if there's a corresponding JSON structure from a trusted source (like Materials Project) for comparison.


Lattice parameters are correctly validated

The lattice parameters in the file are verified to be correct as they match the structure data from pymatgen, a trusted crystallographic library. The structure represents a perovskite with:

  • Lattice parameters: a = b = c ≈ 5.92 Å
  • Angles: α = β = γ ≈ 53.59°
  • Volume: 124.95 ų

These parameters are consistent with the rhombohedral phase of CaTiO3 and are validated through:

  • Test assertions in the codebase
  • Matching pymatgen structure JSON data
  • Successful structure mutation tests
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any other CaTiO3 structure files in the repository for comparison
fd "CaTiO3" --type f --exec head -n 10 {} \; 2>/dev/null

Length of output: 489


Script:

#!/bin/bash
# Search for any documentation or test files mentioning CaTiO3 structure or parameters
rg -i "catio3|perovskite" --type md --type txt -C 2

# Search for any test files that might validate structure parameters
rg -i "test.*structure|structure.*test" --type py -C 2

Length of output: 12110


Script:

#!/bin/bash
# Check the test structure validation for CaTiO3
cat smact/tests/files/CaTiO3.txt

# Look for any test assertions related to CaTiO3 structure
rg "CaTiO3" smact/tests/test_structure.py -B 2 -A 2

Length of output: 2896


Script:

#!/bin/bash
# Check the CaTiO3 JSON structure file that's used in tests
cat smact/tests/files/CaTiO3.json

# Also check if there's any pymatgen structure test validation
ast-grep --pattern 'def test.*structure.*assert.*CaTiO3'

Length of output: 3072

smact/tests/files/CaTiO3.json (2)

7-9: Lattice parameters need validation

The lattice parameters show:

  • Unusual angles (α=β=γ≈53.59°)
  • Equal cell lengths (a=b=c≈5.92Å)

This differs from typical perovskite structures. Please verify if this represents a transformed unit cell.

Also applies to: 12-18


23-90: Cross-validate structure consistency

The atomic positions and species information match between JSON and TXT formats, but consider:

  1. Adding metadata about the structure source
  2. Including symmetry information
  3. Documenting the relationship to the conventional unit cell

Additionally, ensure that the structure transformation preserves the essential crystallographic relationships.

✅ Verification successful

Structure transformations are properly handled and validated

The codebase has robust structure transformation handling through the SmactStructure class and its integration with pymatgen's transformation utilities. The JSON structure in CaTiO3.json is properly managed with:

  • Oxidation state decorations via OxidationStateDecorationTransformation
  • Structure validation through pymatgen's core functionality
  • Crystallographic relationship preservation through proper lattice parameter handling

The essential crystallographic relationships are maintained through:

  • Proper species mapping and site occupancy tracking
  • Lattice parameter preservation during transformations
  • Charge neutrality validation in mutation operations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for structure transformation code
ast-grep --pattern 'def transform_structure($$$)'

Length of output: 51


Script:

#!/bin/bash
# Search for any structure-related transformations or conversions
rg -i "transform|convert" --type py -A 3

# Look for files related to structure handling
fd -e py -x cat {} | rg -i "structure|crystal"

# Check if there are any utility functions for structure manipulation
ast-grep --pattern 'def $_(structure$$$)'

Length of output: 27040

Cartesian
2.323624165 1.643050405 4.02463512 Cl1-
0.0 0.0 0.0 Na1+
Cl1- Na1+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Inconsistent element labelling may cause parsing issues

The element labels in line 1 (Cl1- Na1+) include oxidation states, as do the coordinates in lines 9-10. However, in line 6, the elements are labelled without oxidation states (Cl Na). This inconsistency might lead to parsing errors or incorrect interpretations by downstream code. Please ensure that element labels are consistent throughout the file.

Apply the following diff to make the element labels consistent:

-Cl Na
+Cl1- Na1+

Also applies to: 6-6, 9-10

🧰 Tools
🪛 LanguageTool

[uncategorized] ~1-~1: Wenn Sie das Wort zusammenschreiben möchten, entfernen Sie das Leerzeichen. Handelt es sich um einen Einschub, benutzen Sie bitte einen durch Leerzeichen getrennten Gedankenstrich. Handelt es sich z. B. um eine Silbe, benutzen Sie bitte Anführungszeichen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0....

(WORT1_BINDESTRICH_SPACE_WORT2)


[uncategorized] ~1-~1: Das sieht falsch platziert oder unnötig aus. Überlegen Sie, es zu löschen.
Context: Cl1- Na1+ 1.0 3.50219 0.0 0.0 0.0 3.50219 0.0 0.0 ...

(AI_DE_GGEC_UNNECESSARY_SPACE)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
smact/structure_prediction/database.py (1)

143-154: Consider restructuring the MPRester logic for better readability

While the key-length logic is correct as per requirements, the code structure could be simplified to reduce nesting.

Consider this alternative structure:

if mp_data is None:  # pragma: no cover
-   if len(mp_api_key) != 32:
-       with MPRester(mp_api_key) as m:
-           data = m.query(
-               criteria={"icsd_ids.0": {"$exists": True}},
-               properties=["structure", "material_id"],
-           )
-   else:
-       from mp_api.client import MPRester as MPResterNew
-       with MPResterNew(mp_api_key, use_document_model=False) as m:
-           data = m.materials.summary.search(theoretical=False, fields=["structure", "material_id"])
+   # Import new MPRester at the top with the other imports
+   if len(mp_api_key) == 32:
+       from mp_api.client import MPRester as MPResterNew
+       with MPResterNew(mp_api_key, use_document_model=False) as m:
+           data = m.materials.summary.search(theoretical=False, fields=["structure", "material_id"])
+   else:
+       with MPRester(mp_api_key) as m:
+           data = m.query(
+               criteria={"icsd_ids.0": {"$exists": True}},
+               properties=["structure", "material_id"],
+           )

This restructuring:

  1. Moves the positive case (32-char key) first
  2. Reduces nesting depth
  3. Makes the logic flow more straightforward
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 82a5809 and 1805282.

📒 Files selected for processing (2)
  • smact/structure_prediction/database.py (2 hunks)
  • smact/tests/test_structure.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • smact/tests/test_structure.py
🔇 Additional comments (2)
smact/structure_prediction/database.py (2)

17-17: LGTM: Import changes are well-organised

The new imports are properly organised, following Python's import style guidelines with standard library imports (os) separated from third-party imports (SETTINGS).

Also applies to: 21-21


136-140: API key handling implementation looks good

The code now properly retrieves the API key from environment variables or settings, with appropriate error handling when no key is available.

@AntObi AntObi merged commit 691f9d4 into master Nov 30, 2024
13 checks passed
@AntObi AntObi deleted the structure_prediction_fix branch November 30, 2024 21:27
@coderabbitai coderabbitai bot mentioned this pull request Dec 18, 2024
12 tasks
@coderabbitai coderabbitai bot mentioned this pull request Jan 30, 2025
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug docs enhancement feature python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant