-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/add sensitive metadata #365
Conversation
WalkthroughThe pull request introduces several changes across multiple files. A new 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #365 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 16 16
Lines 1272 1285 +13
=========================================
+ Hits 1272 1285 +13
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
cuenca_validations/types/general.py (1)
88-91
: LGTM! Consider adding docstrings.The
Metadata
class is well-designed with appropriate attributes for handling sensitive data. Consider adding docstrings to document:
- The purpose of the class
- The meaning of each attribute
- Example usage
@dataclass class Metadata: + """Metadata for annotating fields with sensitivity and logging information. + + Attributes: + sensitive (bool): Whether the field contains sensitive information. + log_chars (int): Number of characters to include in logs. Defaults to 0 + which means the entire field will be masked. + + Example: + ```python + password: Annotated[str, Metadata(sensitive=True)] + card_number: Annotated[str, Metadata(sensitive=True, log_chars=4)] + ``` + """ sensitive: bool log_chars: int = 0tests/test_types.py (1)
592-610
: Consider adding more test cases.The current test coverage is good but could be enhanced by adding:
- Edge cases for
log_chars
(negative values, values longer than field length)- Verification that sensitive data is properly masked in logs
Consider adding these test cases:
def test_metadata_edge_cases(): # Test negative log_chars with pytest.raises(ValueError): MetadataModel.model_validate({ "secret": "secret", "partial_secret": "1234567890" }, metadata=[Metadata(sensitive=True, log_chars=-1)]) # Test log_chars longer than field length model = MetadataModel( secret="secret", partial_secret="1234567890" ) assert model.partial_secret == "1234567890" # Should not truncate def test_metadata_logging(): import logging # Setup logging stream = StringIO() handler = logging.StreamHandler(stream) logger = logging.getLogger("test_logger") logger.addHandler(handler) # Test logging of sensitive fields model = MetadataModel( secret="super-secret", partial_secret="1234567890" ) logger.info("Data: %s, %s", model.secret, model.partial_secret) # Verify that sensitive data is masked log_output = stream.getvalue() assert "super-secret" not in log_output assert "1234" in log_output # First 4 chars visible assert "567890" not in log_output # Rest is maskedcuenca_validations/types/requests.py (1)
498-500
: LGTM! Consider masking the password in the example.The
password
field is correctly annotated as sensitive. However, the example inmodel_config
shows the password in plain text.model_config = ConfigDict( - json_schema_extra={'example': {'password': 'supersecret'}}, + json_schema_extra={'example': {'password': '********'}}, )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
cuenca_validations/types/__init__.py
(2 hunks)cuenca_validations/types/general.py
(2 hunks)cuenca_validations/types/requests.py
(2 hunks)cuenca_validations/version.py
(1 hunks)tests/test_types.py
(3 hunks)
✅ Files skipped from review due to trivial changes (1)
- cuenca_validations/version.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Analyze (python)
🔇 Additional comments (1)
cuenca_validations/types/__init__.py (1)
107-107
: LGTM!The
Metadata
class is correctly exposed in the public API and imported from the appropriate module.Also applies to: 156-156
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/test_types.py (2)
593-602
: Enhance password validation regex and error handling.The password validation function could be improved:
- The regex pattern could be compiled once for better performance
- The error message could be more descriptive
Consider this implementation:
+import re + +_REPEATED_DIGITS_PATTERN = re.compile(r'(\d).*\1') + def validate_password(password: str) -> str: """ Example of a custom validator Check if the password contains repeated numbers """ - import re - - if re.search(r'(\d).*\1', password): - raise ValueError("Password cannot contain repeated digits") + if _REPEATED_DIGITS_PATTERN.search(password): + raise ValueError("Password cannot contain repeated digits (e.g., '1' in 'pass1word1')") return password
615-637
: Enhance test coverage for edge cases.The test function could be improved to cover more scenarios:
- Test invalid passwords with repeated digits
- Test the behavior when fields are empty or None
- Test compatibility with other metadata as suggested
Consider adding these test cases:
def test_metadata(): model = MetadataModel( password="Mypass123", secret="super-secret", partial_secret="1234567890", ) password_field = MetadataModel.model_fields["password"] secret_field = MetadataModel.model_fields["secret"] partial_field = MetadataModel.model_fields["partial_secret"] assert get_log_config(password_field).masked is True assert get_log_config(password_field).unmasked_chars_length == 0 assert get_log_config(secret_field).masked is True assert get_log_config(secret_field).unmasked_chars_length == 0 assert get_log_config(partial_field).masked is True assert get_log_config(partial_field).unmasked_chars_length == 4 assert model.password == "Mypass123" assert model.secret == "super-secret" assert model.partial_secret == "1234567890" + + # Test invalid password with repeated digits + with pytest.raises(ValueError, match="Password cannot contain repeated digits"): + MetadataModel( + password="Pass1word1", + secret="super-secret", + partial_secret="1234567890", + ) + + # Test empty fields + with pytest.raises(ValidationError): + MetadataModel( + password="", + secret="", + partial_secret="", + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
cuenca_validations/types/__init__.py
(2 hunks)cuenca_validations/types/general.py
(2 hunks)cuenca_validations/types/helpers.py
(1 hunks)cuenca_validations/types/requests.py
(2 hunks)cuenca_validations/version.py
(1 hunks)tests/test_types.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- cuenca_validations/version.py
- cuenca_validations/types/init.py
- cuenca_validations/types/general.py
- cuenca_validations/types/requests.py
🔇 Additional comments (1)
tests/test_types.py (1)
605-612
: Add test cases for field compatibility with other metadata.As suggested in the past review comment, we should verify that LogConfig can coexist with other Pydantic metadata.
Consider adding a field that combines LogConfig with other Pydantic metadata:
class MetadataModel(BaseModel): password: Annotated[ str, AfterValidator(validate_password), LogConfig(masked=True) ] secret: Annotated[str, LogConfig(masked=True)] partial_secret: Annotated[ str, LogConfig(masked=True, unmasked_chars_length=4) ] + combined_field: Annotated[ + str, + Field(min_length=8), + AfterValidator(validate_password), + LogConfig(masked=True) + ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_types.py (1)
595-604
: Enhance the validator implementation.Consider the following improvements:
- Move the
re
import to the module level with other imports- Consider using a more efficient implementation using sets
- Enhance the docstring with examples and return type
-def validate_repeated_digits(password: str) -> str: - """ - Example of a custom validator - Check if the str contains repeated numbers - """ - import re - - if re.search(r'(\d).*\1', password): - raise ValueError("str cannot contain repeated digits") - return password +import re # Move to module level + +def validate_repeated_digits(password: str) -> str: + """Validates that a string does not contain repeated digits. + + Args: + password: The string to validate + + Returns: + str: The validated string + + Raises: + ValueError: If the string contains repeated digits + + Examples: + >>> validate_repeated_digits("abc123") # Valid + >>> validate_repeated_digits("abc112") # Raises ValueError + """ + seen_digits = set() + for char in password: + if char.isdigit(): + if char in seen_digits: + raise ValueError("str cannot contain repeated digits") + seen_digits.add(char) + return password
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cuenca_validations/types/helpers.py
(1 hunks)cuenca_validations/version.py
(1 hunks)tests/test_types.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- cuenca_validations/version.py
- cuenca_validations/types/helpers.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
tests/test_types.py (2)
5-5
: LGTM! Import statements are well-organized.The new imports are properly organized and all are utilized in the implementation.
Also applies to: 9-10, 29-30
607-615
: Expand test coverage for metadata compatibility.As suggested in the past review comments, add test cases to validate compatibility with additional Pydantic metadata like
Field
and other validators.class LogConfigModel(BaseModel): password: Annotated[Password, LogConfig(masked=True)] validated: Annotated[ str, AfterValidator(validate_repeated_digits), LogConfig(masked=True) ] secret: Annotated[str, LogConfig(masked=True)] partial_secret: Annotated[ str, LogConfig(masked=True, unmasked_chars_length=4) ] + # Add test cases for additional metadata + with_field: Annotated[str, Field(min_length=3), LogConfig(masked=True)] + with_multiple: Annotated[ + str, + Field(max_length=10), + AfterValidator(str.lower), + LogConfig(masked=True) + ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/test_types.py (2)
595-604
: Enhance the validator implementation.Consider the following improvements:
- Move the
import re
statement to the top of the file with other imports- Make the error message more descriptive by including examples
- Consider using a more efficient regex pattern
-def validate_repeated_digits(password: str) -> str: +import re + +def validate_repeated_digits(password: str) -> str: """ Example of a custom validator Check if the str contains repeated numbers """ - import re - if re.search(r'(\d).*\1', password): - raise ValueError("str cannot contain repeated digits") + raise ValueError("String cannot contain repeated digits (e.g., '11' or '1a1' are invalid)") return password
607-616
: Add test cases for Pydantic metadata compatibility.As mentioned in the previous review, we should validate that
LogConfig
works correctly with other Pydantic metadata likeField
andBeforeValidator
.class LogConfigModel(BaseModel): password: Annotated[Password, LogConfig(masked=True)] validated: Annotated[ str, AfterValidator(validate_repeated_digits), LogConfig(masked=True) ] secret: Annotated[str, LogConfig(masked=True)] partial_secret: Annotated[ str, LogConfig(masked=True, unmasked_chars_length=4) ] unmasked: Annotated[str, LogConfig(masked=False)] + # Add test cases for other Pydantic metadata + field_with_constraints: Annotated[ + str, + Field(min_length=3, max_length=10), + LogConfig(masked=True) + ] + field_with_before_validator: Annotated[ + str, + BeforeValidator(str.lower), + LogConfig(masked=True) + ]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
cuenca_validations/version.py
(1 hunks)tests/test_types.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- cuenca_validations/version.py
🔇 Additional comments (2)
tests/test_types.py (2)
Line range hint
5-30
: LGTM! Import statements are well-organized.All new imports are necessary and properly structured.
619-646
: 🛠️ Refactor suggestionEnhance test coverage with additional test cases.
The current test suite should be expanded to include:
- Negative test cases for invalid configurations
- Edge cases for unmasked_chars_length
- Tests for interaction with other Pydantic metadata
+@pytest.mark.parametrize( + "invalid_config", + [ + {"masked": True, "unmasked_chars_length": -1}, # negative length + {"masked": True, "unmasked_chars_length": 1000}, # unreasonable length + {"masked": "invalid"}, # invalid type for masked + ], +) +def test_log_config_invalid_configs(invalid_config): + with pytest.raises(ValueError): + LogConfig(**invalid_config) +def test_log_config_with_pydantic_metadata(): + model = LogConfigModel( + field_with_constraints="test123", + field_with_before_validator="TEST", + password="Mypass123.", + validated="str123", + secret="super-secret", + partial_secret="1234567890", + unmasked="unmasked", + ) + + # Test Field constraints + with pytest.raises(ValueError): + model.field_with_constraints = "ab" # too short + + # Test BeforeValidator + assert model.field_with_before_validator == "test"Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
cuenca_validations/types/general.py (1)
88-92
: Add type hints and documentation to the LogConfig class.The class would benefit from docstrings explaining its purpose and attributes, along with proper type hints.
@dataclass class LogConfig: + """Configuration for field-level logging behavior. + + Attributes: + masked: Whether the field value should be masked in logs + unmasked_chars_length: Number of characters to leave unmasked from the start + excluded: Whether to exclude the field from logs entirely + """ - masked: bool = False - unmasked_chars_length: int = 0 - excluded: bool = False + masked: bool = False + unmasked_chars_length: int = 0 + excluded: bool = False + + def __post_init__(self): + if self.unmasked_chars_length < 0: + raise ValueError("unmasked_chars_length must be non-negative")tests/test_types.py (3)
595-604
: Move the re import to the top of the file.The import statement should be at the module level, not inside the function.
+import re from typing import Annotated import pytest from freezegun import freeze_time from pydantic import AfterValidator, BaseModel, SecretStr, ValidationError from pydantic.fields import FieldInfo def validate_repeated_digits(password: str) -> str: """ Example of a custom validator Check if the str contains repeated numbers """ - import re if re.search(r'(\d).*\1', password): raise ValueError("str cannot contain repeated digits") return password
607-618
: Add test cases for interaction with Field metadata.The model should verify that LogConfig can coexist with other Pydantic metadata like Field.
class LogConfigModel(BaseModel): password: Annotated[Password, LogConfig(masked=True)] validated: Annotated[ str, AfterValidator(validate_repeated_digits), LogConfig(masked=True) ] secret: Annotated[str, LogConfig(masked=True)] partial_secret: Annotated[ str, LogConfig(masked=True, unmasked_chars_length=4) ] unmasked: Annotated[str, LogConfig(masked=False)] excluded: Annotated[str, LogConfig(excluded=True)] + with_field: Annotated[ + str, + Field(min_length=3, max_length=10), + LogConfig(masked=True) + ]
620-647
: Add negative test cases for invalid unmasked_chars_length.The test suite should verify that negative values for unmasked_chars_length are rejected.
@pytest.mark.parametrize( "field_name,expected_masked,expected_unmasked_length,expected_excluded", [ ("password", True, 0, False), ("validated", True, 0, False), ("secret", True, 0, False), ("partial_secret", True, 4, False), ("unmasked", False, 0, False), ("excluded", False, 0, True), + ("with_field", True, 0, False), ], ) def test_log_config( field_name, expected_masked, expected_unmasked_length, expected_excluded ): model = LogConfigModel( password="Mypass123.", validated="str123", secret="super-secret", partial_secret="1234567890", unmasked="unmasked", excluded="excluded", + with_field="test123", ) field = model.model_fields[field_name] config = get_log_config(field) assert config.masked is expected_masked assert config.unmasked_chars_length == expected_unmasked_length assert config.excluded is expected_excluded +def test_log_config_invalid_values(): + """Test that invalid values for unmasked_chars_length are rejected.""" + with pytest.raises(ValueError, match="unmasked_chars_length must be non-negative"): + LogConfig(masked=True, unmasked_chars_length=-1)
Description
This PR introduces the
LogConfig
class to provide a flexible way to annotate fields with metadata, ensuring sensitive data is properly handled during logging.Changes
Summary by CodeRabbit
Summary by CodeRabbit
Release Notes v2.1.0
New Features
LogConfig
for enhanced logging configuration.LogConfigModel
for managing sensitive fields with custom validation.get_log_config
function to retrieve logging configurations.Improvements
Maintenance