Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add unit test for helper function json._check_type #716

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Oct 21, 2024

Summary

  • Add unit test for helper function json._check_type, to use it with confident, ping the original author @mturiansky in case you have any comment, thanks
  • Use full qualified name __qualname__ to include the local scope
  • Use isclass to check class and update comment

Summary by CodeRabbit

  • New Features

    • Enhanced type checking functionality for improved accuracy and clarity.
    • Added unit tests for the _check_type function, covering various data types including classes, callables, and popular libraries like NumPy, pandas, and PyTorch.
  • Bug Fixes

    • Improved robustness of the _check_type function to correctly handle class objects and subclasses.
  • Documentation

    • Updated comments in the _check_type function for better clarity on its functionality.

@@ -113,7 +111,9 @@ class B(A):
mro = type(obj).mro()
except TypeError:
return False
return any(o.__module__ + "." + o.__name__ == ts for o in mro for ts in type_str)
return any(
o.__module__ + "." + o.__qualname__ == ts for o in mro for ts in type_str
Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe using full qualified name to include the local scope might be better (though checking the type of a local type might not be a real-world use case, but it doesn't change the behaviour)? @mturiansky In case I'm wrong here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't claim to be an expert on any of this. I just came up with a working solution for the problem I was trying to address. If you think this is a better approach and leads to the same result, then no push-back from me. I'd need to refresh myself on what half of these variables/functions do to be able to make any meaningful comment.

Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just came up with a working solution for the problem I was trying to address.

Thanks a ton for making everyone's life easier in the first place!

If you think this is a better approach and leads to the same result

It just covers more edge cases, but with unit test it seems to do what we want it to do. Free feel to comment if I miss any type that should be tested :)

I'd need to refresh myself on what half of these variables/functions do to be able to make any meaningful comment.

Take your time I would leave this as draft for a while

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would mark it as ready for review now and feel free to comment :)

Copy link

codecov bot commented Oct 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.42%. Comparing base (189b6e6) to head (e0f1fd8).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #716   +/-   ##
=======================================
  Coverage   82.42%   82.42%           
=======================================
  Files          27       27           
  Lines        1582     1582           
  Branches      285      285           
=======================================
  Hits         1304     1304           
  Misses        215      215           
  Partials       63       63           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -113,7 +111,9 @@ class B(A):
mro = type(obj).mro()
except TypeError:
Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong here, I don't think this TypeError is related to callable type:

def func():
    print("hello")

print(type(func).mro())  # >>> [<class 'function'>, <class 'object'>]

But more likely that we cannot call type on a class constructor:

class A:
    def __init__(self):
        pass

print(type(A).mro())  # >>> TypeError: unbound method type.mro() needs an argument

Perhaps should be befffc1:

from inspect import isclass


if isclass(obj):
        return False

mro = type(obj).mro()

Correct me if I'm wrong :)

@DanielYang59 DanielYang59 force-pushed the add-test-check-type branch 4 times, most recently from dc694d7 to a1de0aa Compare October 21, 2024 12:24
Copy link

coderabbitai bot commented Oct 24, 2024

Walkthrough

The changes in this pull request focus on enhancing the _check_type function within the src/monty/json.py file by refining its type checking capabilities. The function's signature has been updated to specify that the obj parameter is of type object, and it now includes a check for class objects using isclass. Additionally, a new test class TestCheckType has been added to tests/test_json.py, which includes multiple unit tests to validate the behavior of the updated _check_type function across various data types.

Changes

File Change Summary
src/monty/json.py - Updated _check_type function signature to specify obj: object.
- Added import statement for isclass.
- Adjusted logic to return False for class objects and used __qualname__ for type string.
- Improved comments for clarity.
tests/test_json.py - Added TestCheckType class with seven test methods to validate _check_type: test_check_subclass, test_check_class, test_callable, test_numpy, test_pandas, test_torch, test_pydantic.
- Imported _check_type for testing.

Possibly related PRs

🐰 In the code we hop and play,
Enhancing types in a clever way.
With tests to check each little part,
Our _check_type shines with a smart heart!
Hooray for clarity, let’s cheer and say,
"Type checking's fun, hip-hip-hooray!" 🎉


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@DanielYang59 DanielYang59 marked this pull request as ready for review October 25, 2024 06:40
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (4)
tests/test_json.py (4)

1075-1098: Add docstring to describe test purpose and cases.

The test is well-structured and thoroughly verifies subclass relationships, but would benefit from a docstring explaining the test cases and expected behavior.

 def test_check_subclass(self):
+    """Test that _check_type correctly handles class inheritance.
+    
+    Verifies:
+    1. Instance of class A is identified as type A but not type B
+    2. Instance of class B (subclass of A) is identified as both type B and A
+    """
     class A:
         pass

1143-1155: Consider expanding NumPy type coverage.

While the test covers basic NumPy types, consider adding tests for other common NumPy types like int64, float32, etc.

     def test_numpy(self):
+        """Test that _check_type correctly identifies various NumPy types."""
         # Test NumPy array
         arr = np.array([1, 2, 3])
         assert _check_type(arr, "numpy.ndarray")
         assert isinstance(arr, np.ndarray)
 
         # Test NumPy generic
         scalar = np.float64(3.14)
         assert _check_type(scalar, "numpy.generic")
         assert isinstance(scalar, np.generic)
+
+        # Test additional NumPy types
+        assert all([
+            _check_type(np.int64(42), "numpy.generic"),
+            _check_type(np.float32(3.14), "numpy.generic"),
+            _check_type(np.bool_(True), "numpy.generic")
+        ])

1176-1182: Consider expanding PyTorch type coverage.

While the basic tensor test is good, consider adding tests for different tensor types (e.g., cuda tensors if available) and other PyTorch objects.

     @pytest.mark.skipif(torch is None, reason="torch is not installed")
     def test_torch(self):
+        """Test that _check_type correctly identifies PyTorch types."""
         tensor = torch.tensor([1, 2, 3])
         assert _check_type(tensor, "torch.Tensor")
         assert isinstance(tensor, torch.Tensor)
+
+        # Test different tensor types
+        assert all([
+            _check_type(torch.FloatTensor([1.0]), "torch.Tensor"),
+            _check_type(torch.LongTensor([1]), "torch.Tensor"),
+            _check_type(torch.BoolTensor([True]), "torch.Tensor")
+        ])

1183-1199: Consider adding edge cases for pydantic and pint tests.

While the basic tests are good, consider adding tests for:

  • Nested pydantic models
  • Custom pydantic field types
  • Complex pint units and conversions
     @pytest.mark.skipif(pydantic is None, reason="pydantic is not installed")
     def test_pydantic(self):
+        """Test that _check_type correctly identifies pydantic models."""
         class MyModel(pydantic.BaseModel):
             name: str
+            
+        class NestedModel(pydantic.BaseModel):
+            model: MyModel
 
         model_instance = MyModel(name="Alice")
+        nested_instance = NestedModel(model=model_instance)
 
         assert _check_type(model_instance, "pydantic.main.BaseModel")
         assert isinstance(model_instance, pydantic.BaseModel)
+        assert _check_type(nested_instance, "pydantic.main.BaseModel")
 
     @pytest.mark.skipif(pint is None, reason="pint is not installed")
     def test_pint(self):
+        """Test that _check_type correctly identifies pint quantities."""
         ureg = pint.UnitRegistry()
         qty = 3 * ureg.meter
+        complex_qty = 3 * ureg.meter / (ureg.second * ureg.second)
 
         assert _check_type(qty, "pint.registry.Quantity")
         assert isinstance(qty, pint.Quantity)
+        assert _check_type(complex_qty, "pint.registry.Quantity")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1798d59 and da245a1.

📒 Files selected for processing (2)
  • src/monty/json.py (3 hunks)
  • tests/test_json.py (2 hunks)
🔇 Additional comments (9)
src/monty/json.py (6)

19-19: LGTM: Import statement is correctly placed.

The addition of isclass import from the inspect module is well-placed alongside other inspect imports and is necessary for the enhanced type checking functionality.


71-71: LGTM: Improved type annotation.

The addition of object type annotation for the obj parameter improves code clarity and IDE support.


Line range hint 73-95: LGTM: Enhanced docstring with clear examples.

The docstring improvements provide clear examples and better explain the type checking behavior, particularly for subclass relationships. The examples effectively demonstrate the function's behavior with class inheritance.


97-101: LGTM: Added class object handling.

The addition of the isclass check correctly prevents type checking on class objects, which aligns with the function's intended purpose of checking instance types.


104-106: LGTM: Improved type resolution with __qualname__.

The use of __qualname__ instead of __name__ provides better support for nested classes by including the full qualified name, which is more accurate for type checking.


Line range hint 71-106: Verify the impact of _check_type changes.

Let's verify that the enhanced type checking is working as expected throughout the codebase.

✅ Verification successful

No issues found with _check_type implementation

The verification shows that _check_type is well-tested and used consistently throughout the codebase:

  • Comprehensive test coverage in tests/test_json.py verifies inheritance behavior and type checking for various classes (numpy, pandas, torch, pydantic)
  • Production usage in src/monty/json.py aligns with the test cases and docstring examples
  • The function is used internally within the json module and imported by msgpack.py and serialization.py through the MontyDecoder/MontyEncoder classes
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all usages of _check_type to ensure they work with the new implementation

# Search for direct calls to _check_type
echo "Direct calls to _check_type:"
rg "_check_type\(" --type python

# Search for potential test cases
echo -e "\nTest cases for _check_type:"
rg "def.*test.*check.*type" --type python

Length of output: 278


Script:

#!/bin/bash
# Let's try without the --type flag since it's not supported
# Search for direct calls to _check_type
echo "Direct calls to _check_type:"
rg "_check_type\("

# Search for potential test cases
echo -e "\nTest cases for _check_type:"
rg "def.*test.*check.*type"

# Look for the module imports to understand usage context
echo -e "\nImports of the module:"
rg "from.*monty.*json.*import|import.*monty.*json"

Length of output: 2473

tests/test_json.py (3)

1074-1075: LGTM! Well-structured test class.

The test class follows pytest conventions and has a clear, descriptive name.


1099-1115: LGTM! Comprehensive class object testing.

The test effectively verifies that _check_type behaves like isinstance when dealing with class objects, with clear documentation of expected behavior.


1156-1175: LGTM! Comprehensive pandas type testing.

The test effectively verifies both direct type checking and inheritance relationships for pandas objects, with appropriate conditional execution using skipif.

tests/test_json.py Show resolved Hide resolved
@mturiansky
Copy link
Contributor

I took a look. I'd say it all looks reasonable to me.

From my understanding, it seems like cases where __qualname__ is actually needed over __name__ are rather contrived, but it does seem to be more correct.

@DanielYang59
Copy link
Contributor Author

Thank you!

From my understanding, it seems like cases where qualname is actually needed over name are rather contrived, but it does seem to be more correct.

Yes I agree, fully qualified name just covers some edge cases (like getting a local func)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants