Change default oxidation states from SMACT14 to ICSD24 #346

Merged: AntObi merged 9 commits into master from change_oxi_defaults on Dec 2, 2024

Conversation

AntObi
Collaborator

@AntObi AntObi commented Dec 1, 2024

Change default oxidation states from SMACT14 to ICSD24

Description

This PR introduces a breaking change to SMACT: the decade-old default oxidation states (smact14) are replaced by a data-mined set of oxidation states from the ICSD (icsd24). The main motivation is to have a list of oxidation states that reflects what has been observed in experimental structures.

  • smact_filter and smact_validity now use icsd24 as the default oxidation states
  • Tests for the above functions have been updated accordingly to check the smact14 and icsd24 oxidation states
  • generate_composition_with_smact now allows users to specify the oxidation states set
  • The smact14 and icsd24 oxidation states are now tested for the above function
  • The Crystal Space tutorial in the documentation explicitly uses the smact14 oxidation states for reproducibility of the results reported in the Faraday Discussions publication
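The practical effect of the new default can be sketched with a toy charge-neutrality check. The tables and the `toy_validity` function below are hypothetical stand-ins, not the real smact14 or icsd24 data and not the SMACT API; they only illustrate how a call that omits `oxidation_states_set` can change its answer when the default changes, and how pinning the argument keeps the earlier behaviour.

```python
from itertools import product

# Hypothetical, hand-written tables for illustration only.
# These are NOT the contents of the real smact14 or icsd24 sets.
TOY_OXIDATION_STATES = {
    "toy_old": {"Mg": [2], "B": [3]},
    "toy_new": {"Mg": [2], "B": [-1, 3]},
}


def toy_validity(counts, oxidation_states_set="toy_new"):
    """Return True if any assignment of oxidation states sums to zero charge.

    `counts` maps element symbol to its count in the formula unit.
    The default set name plays the role of the changed default in this PR.
    """
    table = TOY_OXIDATION_STATES[oxidation_states_set]
    elements = list(counts)
    # Try every combination of one oxidation state per element.
    for states in product(*(table[el] for el in elements)):
        total_charge = sum(q * counts[el] for q, el in zip(states, elements))
        if total_charge == 0:
            return True
    return False


mgb2 = {"Mg": 1, "B": 2}
default_result = toy_validity(mgb2)                                  # uses "toy_new"
pinned_result = toy_validity(mgb2, oxidation_states_set="toy_old")   # pinned explicitly
print(default_result, pinned_result)
```

In SMACT itself, the equivalent of the pinned call is passing oxidation_states_set="smact14" to smact_filter or smact_validity, which is what the tutorials touched by this PR do.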

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

New tests were added and run locally to verify that the breaking change does not break the codebase. The tests that failed did so because no oxidation_states_set argument was supplied, so they picked up the new default. This has been fixed by setting the argument to smact14 explicitly and additionally testing icsd24.

Test Configuration:

  • Python version: 3.10
  • Operating System: macOS

Reviewers

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Summary by CodeRabbit

  • New Features

    • Introduced a new parameter for generating binary chemical compositions, allowing users to filter based on specific oxidation states.
    • Enhanced the charge-neutrality testing process with improved accuracy by integrating a new oxidation states parameter.
    • Added visualisation of materials passing the SMACT validity test in the GNoME database.
  • Bug Fixes

    • Updated output display for interactive components and clarified limitations in the documentation.
  • Tests

    • Expanded test coverage for the smact_filter and generate_composition_with_smact functions to include new oxidation states.

Contributor

coderabbitai bot commented Dec 1, 2024

Walkthrough

The changes involve the introduction of a new parameter, oxidation_states_set, across several functions and tutorial notebooks related to chemical composition generation and validation. This parameter allows users to specify the oxidation states to be used, enhancing the flexibility and accuracy of the functions. The default value for this parameter has been set to "icsd24", and the modifications include updates to function signatures, tutorial content, and tests to accommodate this new feature. Overall, the modifications maintain existing functionality while expanding the capabilities of the relevant methods.

Changes

File Path | Change Summary
docs/tutorials/crystal_space.ipynb | Added oxidation_states_set parameter to generate_composition_with_smact function call.
docs/tutorials/smact_validity_of_GNoMe.ipynb | Updated smact_validity function call in parallel_apply to include **{"oxidation_states_set":"smact14"}; added a new bar plot for SMACT validity test results.
smact/screening.py | Changed default value of oxidation_states_set from "smact14" to "icsd24" in smact_filter and smact_validity functions; updated documentation.
smact/tests/test_core.py | Enhanced test_smact_filter method to incorporate oxidation_states_set parameter; updated assertions for different oxidation states.
smact/tests/test_utils.py | Restructured test_generate_composition_with_smact to iterate over "smact14" and "icsd24" oxidation states; added various assertions for validation.
smact/utils/crystal_space/generate_composition_with_smact.py | Introduced oxidation_states_set parameter to generate_composition_with_smact function; updated docstring and filtering logic to use this parameter.
smact/__init__.py | Enhanced documentation for Element class; updated logic for retrieving oxidation states based on SMACT version.

Possibly related PRs

Suggested labels

bug, docs, feature, refactor

🐰 "In the world of crystal and light,
New parameters take flight,
Oxidation states now in view,
Enhancing tests, making all things new.
With every function, we explore,
Chemistry's wonders, forevermore!" 🐇



codecov bot commented Dec 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.56%. Comparing base (f3779cf) to head (59bf75c).
Report is 12 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #346      +/-   ##
==========================================
+ Coverage   77.44%   77.56%   +0.12%     
==========================================
  Files          31       31              
  Lines        2589     2599      +10     
==========================================
+ Hits         2005     2016      +11     
+ Misses        584      583       -1     
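The figures in the coverage table can be cross-checked by hand: coverage is hits divided by lines, and misses are lines minus hits. The sketch below reproduces the reported percentages under the assumption that they are truncated to two decimals rather than rounded; that assumption is inferred from the numbers in this table, not taken from Codecov documentation.

```python
import math


def truncated_percent(numerator, denominator):
    """Percentage truncated (not rounded) to two decimal places."""
    return math.floor(numerator / denominator * 10_000) / 100


# Values copied from the coverage diff table above.
base_hits, base_lines = 2005, 2589
head_hits, head_lines = 2016, 2599

base_cov = truncated_percent(base_hits, base_lines)
head_cov = truncated_percent(head_hits, head_lines)
# Difference of the unrounded ratios, truncated the same way.
delta = math.floor((head_hits / head_lines - base_hits / base_lines) * 10_000) / 100

base_misses = base_lines - base_hits
head_misses = head_lines - head_hits
print(base_cov, head_cov, delta, base_misses, head_misses)
```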

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
smact/utils/crystal_space/generate_composition_with_smact.py (1)

59-59: Enhance parameter documentation

The docstring should specify all valid options for oxidation_states_set and explain the implications of each choice.

Consider expanding the documentation:

-        oxidation_states_set (str): the oxidation states set to use. Options are "smact14", "icsd16", "icsd24", "pymatgen_sp" or a filepath to a custom oxidation states list. For reproducing the Faraday Discussions results, use "smact14".
+        oxidation_states_set (str): The oxidation states set to use for filtering compositions. Options:
+            - "smact14": Original SMACT oxidation states, suitable for reproducing Faraday Discussions results
+            - "icsd16": ICSD 2016 oxidation states
+            - "icsd24": ICSD 2024 oxidation states (default, recommended for new analyses)
+            - "pymatgen_sp": Pymatgen's oxidation states
+            - filepath: Path to a custom oxidation states list file
smact/tests/test_core.py (1)

Line range hint 402-407: Consider adding assertions for all oxidation state sets.

While the test covers several oxidation state sets for MgB2, it would be more comprehensive to test all available sets consistently.

-        self.assertFalse(smact.screening.smact_validity("MgB2", oxidation_states_set="smact14"))
-        self.assertTrue(smact.screening.smact_validity("MgB2", oxidation_states_set="icsd16"))
-        self.assertFalse(smact.screening.smact_validity("MgB2", oxidation_states_set="pymatgen_sp"))
-        self.assertTrue(smact.screening.smact_validity("MgB2", oxidation_states_set="wiki"))
-        self.assertFalse(smact.screening.smact_validity("MgB2", oxidation_states_set=TEST_OX_STATES))
+        oxidation_sets_results = {
+            "smact14": False,
+            "icsd16": True,
+            "icsd24": True,
+            "pymatgen_sp": False,
+            "wiki": True,
+            TEST_OX_STATES: False
+        }
+        for ox_set, expected in oxidation_sets_results.items():
+            with self.subTest(ox_set=ox_set):
+                result = smact.screening.smact_validity("MgB2", oxidation_states_set=ox_set)
+                self.assertEqual(result, expected, f"Failed for oxidation set {ox_set}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f3779cf and d8f74fa.

📒 Files selected for processing (6)
  • docs/tutorials/crystal_space.ipynb (1 hunks)
  • docs/tutorials/smact_validity_of_GNoMe.ipynb (3 hunks)
  • smact/screening.py (3 hunks)
  • smact/tests/test_core.py (3 hunks)
  • smact/tests/test_utils.py (1 hunks)
  • smact/utils/crystal_space/generate_composition_with_smact.py (3 hunks)
🔇 Additional comments (9)
smact/utils/crystal_space/generate_composition_with_smact.py (2)

47-47: LGTM: Parameter addition follows Python best practices

The new parameter oxidation_states_set is added with a sensible default value of "icsd24".


109-112: LGTM: Consistent parameter forwarding

The oxidation_states_set parameter is correctly forwarded to the smact_filter function.

docs/tutorials/crystal_space.ipynb (1)

105-105: LGTM: Ensures reproducibility with published results

The explicit use of oxidation_states_set="smact14" maintains consistency with the Faraday Discussions paper results.

smact/tests/test_utils.py (2)

99-103: LGTM: Well-structured test data

The test data is clearly organised with expected results for each oxidation state set.


104-120: LGTM: Comprehensive test coverage

The test implementation:

  • Uses subtests for clear failure isolation
  • Verifies both oxidation state sets
  • Checks DataFrame type, size, and content
  • Validates file saving
  • Includes proper cleanup
smact/tests/test_core.py (2)

338-347: Well-structured test data organization!

The dictionary structure provides a clear separation of test cases for different oxidation state sets, making it easier to maintain and extend.


351-357: Good use of subTest for parameterized testing!

Using subTest is the correct approach for testing multiple oxidation state sets, as it provides better test isolation and clearer failure reporting.
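The subTest pattern praised here can be sketched in isolation as follows. The `toy_is_valid` function and its expected values are invented for this example and are not real SMACT results; only the `unittest` calls are standard library API.

```python
import unittest


def toy_is_valid(formula, oxidation_states_set="set_b"):
    """Hypothetical stand-in for a validity check keyed by a named set."""
    allowed = {"set_a": {"AB"}, "set_b": {"AB", "AB2"}}
    return formula in allowed[oxidation_states_set]


class TestToyValidity(unittest.TestCase):
    def test_each_set(self):
        # One expected outcome per named set, kept together in a dictionary.
        expected = {"set_a": False, "set_b": True}
        for ox_set, want in expected.items():
            # Each iteration is reported separately if it fails,
            # and a failure does not stop the remaining iterations.
            with self.subTest(ox_set=ox_set):
                got = toy_is_valid("AB2", oxidation_states_set=ox_set)
                self.assertEqual(got, want, f"Failed for oxidation set {ox_set}")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestToyValidity)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the expected values in one dictionary means that adding another set to the test is a one-line change.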

smact/screening.py (1)

338-339: Verify the impact of changing default oxidation states.

The change of default oxidation state set from "smact14" to "icsd24" is a breaking change that could affect existing code.

Also applies to: 441-442

docs/tutorials/smact_validity_of_GNoMe.ipynb (1)

347-347: Good practice: Explicit oxidation state set specification.

Explicitly setting oxidation_states_set="smact14" ensures reproducibility and maintains backward compatibility with previous results.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
smact/tests/test_core.py (2)

339-359: LGTM: Comprehensive test coverage for both oxidation state sets

The test effectively validates both SMACT14 and ICSD24 oxidation states using parameterised subtests.

Consider extracting the test data to a separate fixture or constant to improve maintainability:

OXIDATION_STATES_TEST_DATA = {
    "smact14": {
        "thresh_2": [
            (("Na", "Fe", "Cl"), (1, -1, -1), (2, 1, 1)),
            (("Na", "Fe", "Cl"), (1, 1, -1), (1, 1, 2)),
        ]
    },
    "icsd24": {"thresh_2": [(("Na", "Fe", "Cl"), (1, 1, -1), (1, 1, 2))]}
}

403-404: LGTM: Good test case selection

The test effectively demonstrates how MgB2 validity varies with different oxidation state sets.

Consider adding a docstring to explain why MgB2 was chosen as a test case and what makes it particularly suitable for testing different oxidation state sets.

smact/screening.py (1)

487-493: LGTM: Consistent implementation of oxidation state selection

The implementation correctly handles both explicit and default oxidation state selection.

Consider updating the error message to explicitly mention "icsd24" as the default:

-                f'{oxidation_states_set} is not valid. Enter either "smact14", "icsd16", "icsd24", "pymatgen_sp","wiki" or a filepath to a textfile of oxidation states.'
+                f'{oxidation_states_set} is not valid. Enter either "icsd24" (default), "smact14", "icsd16", "pymatgen_sp", "wiki" or a filepath to a textfile of oxidation states.'
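A minimal sketch of the selection logic this suggestion refers to is shown below. The helper name `resolve_oxidation_states_set`, the use of `ValueError`, and the `os.path.isfile` check are assumptions made for illustration; only the list of set names and the wording of the message come from the diff above.

```python
import os

# Set names taken from the error message in the suggested diff.
BUILTIN_SETS = ("icsd24", "smact14", "icsd16", "pymatgen_sp", "wiki")
DEFAULT_SET = "icsd24"


def resolve_oxidation_states_set(name=DEFAULT_SET):
    """Hypothetical helper: classify the argument as a built-in set or a file."""
    if name in BUILTIN_SETS:
        return ("builtin", name)
    if os.path.isfile(name):
        return ("file", name)
    # Build the list of non-default options for the error message.
    others = ", ".join(f'"{s}"' for s in BUILTIN_SETS if s != DEFAULT_SET)
    raise ValueError(
        f'{name} is not valid. Enter either "{DEFAULT_SET}" (default), {others} '
        "or a filepath to a textfile of oxidation states."
    )


print(resolve_oxidation_states_set())           # falls back to the default set
print(resolve_oxidation_states_set("smact14"))  # explicit choice
```

Naming the default in the message tells a user who mistyped a set name which behaviour they get when the argument is omitted.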
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d8f74fa and 59bf75c.

📒 Files selected for processing (3)
  • smact/__init__.py (2 hunks)
  • smact/screening.py (5 hunks)
  • smact/tests/test_core.py (4 hunks)
🔇 Additional comments (3)
smact/tests/test_core.py (1)

50-51: LGTM: Test maintains backwards compatibility

The test correctly validates both SMACT14 and default oxidation states for the Rb element.

smact/__init__.py (2)

61-61: LGTM: Clear version-specific documentation

The documentation clearly explains that the default oxidation states differ between version 3.0 and earlier versions.


182-182: LGTM: Consistent implementation of new defaults

The change correctly implements ICSD24 as the default oxidation states source.

Let's check for any direct usage of the oxidation_states attribute:

✅ Verification successful

Change to ICSD24 oxidation states is safe and well-integrated

The codebase analysis shows that the change to use ICSD24 as the default oxidation states source is properly integrated:

  • The Element class maintains backward compatibility by preserving access to other oxidation state sets (SMACT14, ICSD16, SP, Wiki)
  • The screening module correctly handles all oxidation state sources including ICSD24
  • All direct usages of oxidation states are properly abstracted through the Element class interface
  • Test coverage exists for oxidation state functionality
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for direct usage of oxidation_states attribute
ast-grep --pattern 'Element($$$).oxidation_states'

Length of output: 52


Script:

#!/bin/bash
# Search for any usage of oxidation_states in the codebase
rg -l "oxidation_states" 

# Search for Element class usage patterns
rg "Element\(" -A 2

# Search for specific pattern where oxidation states might be accessed
rg "\.oxidation_states"

Length of output: 10171

@@ -335,8 +335,7 @@ def smact_filter(
     threshold: int | None = 8,
     stoichs: list[list[int]] | None = None,
     species_unique: bool = True,
-    oxidation_states_set: str = "smact14",
-    comp_tuple: bool = False,
+    oxidation_states_set: str = "icsd24",
Contributor


💡 Codebase verification

Breaking change is properly documented but requires version update

The change from "smact14" to "icsd24" as the default oxidation state set is well-documented in the code with:

  • Clear deprecation notice mentioning this is a breaking change for SMACT 3.0
  • Detailed documentation of all available oxidation state sets
  • Explicit instructions for backwards compatibility
  • Tests that cover both oxidation state sets

However, this breaking change requires:

  • Increment of the major version number to 3.0 as per the documentation
  • Migration of the change notice to a more visible location like README.md
🔗 Analysis chain

Breaking change: Default oxidation states updated

The change of default oxidation state set from "smact14" to "icsd24" is a breaking change that could affect existing code.

Let's check if this change is properly documented:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the change is documented in README or changelog
rg -i "breaking.*change.*oxidation" README.md CHANGELOG.md
rg -i "default.*oxidation.*state" README.md CHANGELOG.md

Length of output: 225


Script:

#!/bin/bash
# Let's check for any documentation files in the repository
fd -e md -e rst

# Check for any mentions of oxidation states in documentation
rg -i "oxidation.*state" -g "*.{md,rst}"

# Check the specific function to understand the impact
ast-grep --pattern 'def smact_filter($$$) { $$$ }'

Length of output: 5444


Script:

#!/bin/bash
# Let's check the specific file for documentation about the oxidation state sets
rg -A 5 "icsd24|smact14" smact/screening.py

# Check the oxidation states module for details about these sets
rg -A 5 "icsd24|smact14" smact/oxidation_states.py

# Look for any test files that might be affected
fd -e py -e test test | xargs rg "smact14"

Length of output: 5961

@AntObi AntObi merged commit 0728f46 into master Dec 2, 2024
13 checks passed
@AntObi AntObi deleted the change_oxi_defaults branch December 2, 2024 17:35