[FEAT] Native pydantic support #91

Open
dabdine opened this issue Jun 13, 2024 · 5 comments
Labels
feature request Requests for new functionality


@dabdine

dabdine commented Jun 13, 2024

Hello! I'm working with some rules that I load from Markdown front matter. To parse and validate the front matter I'm using Pydantic (v2). During parsing, I wanted to natively transform a string field containing a rule-engine rule into a Rule object. Here's the approach I took:

from typing import Any, Generic, Literal, Optional, TypeVar, cast

from pydantic import BaseModel, Field, GetCoreSchemaHandler, GetJsonSchemaHandler
from pydantic.json_schema import JsonSchemaValue
from pydantic_core import core_schema
from rule_engine import Context, DataType, Rule
from rule_engine.errors import SymbolResolutionError
from typing_extensions import get_args

SchemaType = TypeVar("SchemaType", bound=BaseModel)


class PydanticRule(Rule, Generic[SchemaType]):  # type: ignore
    """
    A class to store a Python `rule-engine` rule as a Pydantic model.
    """

    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        _source_type: Any,
        _handler: GetCoreSchemaHandler,
    ) -> core_schema.CoreSchema:
        # The generic argument is the model class itself, not an instance.
        model_fields = cast(type[BaseModel], get_args(_source_type)[0]).model_fields

        def _python_to_rule_type(value: Any) -> DataType:
            # TODO: Handle additional datatypes, complex types (unions, etc.)
            try:
                # check if value is a literal
                if hasattr(value, "__origin__") and value.__origin__ is Literal:
                    return DataType.STRING
                return DataType.from_type(value)
            except TypeError:
                return DataType.UNDEFINED

        def resolve_pydantic_to_rule(field: str) -> DataType:
            if field not in model_fields:
                raise SymbolResolutionError(field)
            return _python_to_rule_type(model_fields[field].annotation)

        def validate_from_str(value: str) -> Rule:
            return Rule(
                value,
                context=Context(type_resolver=resolve_pydantic_to_rule),
            )

        from_str_schema = core_schema.chain_schema(
            [
                core_schema.str_schema(),
                core_schema.no_info_plain_validator_function(validate_from_str),
            ]
        )

        return core_schema.json_or_python_schema(
            json_schema=from_str_schema,
            python_schema=core_schema.union_schema(
                [
                    # check if it's an instance first before doing any further work
                    core_schema.is_instance_schema(Rule),
                    from_str_schema,
                ]
            ),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda instance: str(cast(Rule, instance).text)
            ),
        )

    @classmethod
    def __get_pydantic_json_schema__(
        cls, _core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler
    ) -> JsonSchemaValue:
        # Use the same schema that would be used for `str`
        return handler(core_schema.str_schema())


# example models
class OperatingSystem(BaseModel):
    vendor: str = Field(..., description="The vendor of the operating system", title="Operating system vendor")
    product: str = Field(..., description="The name of the operating system", title="Operating system name")
    family: Literal["linux", "windows", "macos"] = Field(
        ..., description="The family of the operating system", title="Operating system family"
    )
    version: Optional[str] = Field(
        None, description="The version of the operating system", title="Operating system version"
    )
    arch: Optional[str] = Field(
        None,
        description="The architecture of the operating system, (e.g. x86_64, x86, arm64)",
        title="Operating system architecture",
    )


class SomeModel(BaseModel):
    os: PydanticRule[OperatingSystem]


# define the rule that is read into the model
model = SomeModel.model_validate({"os": "vendor == 'Apple' and product == 'Mac OS X' and family == 'macos'"})

# test the rule against an input operating system
print(model.os.matches(OperatingSystem(vendor="Apple", product="Mac OS X", family="macos").model_dump()))

PydanticRule takes a generic type parameter that defines the schema supplied to Context when the Rule is instantiated. As a result, syntax and symbol errors are detected when the rule is compiled (that is, when it is read into the Pydantic model) instead of when it is evaluated at runtime.
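To illustrate the difference between load-time and evaluation-time errors, here is a minimal sketch that does not use rule-engine at all. It uses Python's own `ast` module as a stand-in parser (the example expression happens to be valid in both syntaxes), and `check_symbols` and `FIELDS` are hypothetical names made up for this illustration.

```python
import ast


def check_symbols(expression: str, known_fields: set) -> None:
    """Raise NameError if the expression references a name outside known_fields.

    Stand-in for what a type resolver does at rule compile time; this is
    NOT rule-engine's parser, only Python's, used for illustration.
    """
    tree = ast.parse(expression, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id not in known_fields:
            raise NameError(f"unknown symbol: {node.id}")


# Hypothetical field set, mirroring the OperatingSystem model above.
FIELDS = {"vendor", "product", "family", "version", "arch"}

# A well-formed expression passes silently when it is loaded.
check_symbols("vendor == 'Apple' and family == 'macos'", FIELDS)

# A misspelled field is rejected immediately, before any data is matched.
try:
    check_symbols("vendr == 'Apple'", FIELDS)
except NameError as exc:
    error = str(exc)
```

The point is only that the set of model fields is known up front, so a typo in a rule can be reported while the document is being validated.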

I think it's a good idea to leave pydantic out of this lib (unless folks really need it). However, it may make sense to create a separate lib that contains the types so rule-engine can be used this way.

Also, we'd probably want to spend more time on the pydantic/python -> rule-engine type conversion. I haven't fully tested that yet.
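As a rough sketch of what a fuller annotation conversion might look like, the following handles `Optional[...]` and `Literal[...]` with the standard `typing` helpers. The string labels are placeholders for the corresponding `DataType` members, the mapping table is an assumption, and `annotation_to_label` is a hypothetical helper, not part of rule-engine or Pydantic.

```python
import datetime
from typing import Any, Literal, Optional, Union, get_args, get_origin

# Placeholder labels; real code would return rule_engine.DataType members.
# Mapping int to FLOAT is an assumption made for this sketch.
_SCALARS = {
    str: "STRING",
    bool: "BOOLEAN",
    int: "FLOAT",
    float: "FLOAT",
    datetime.datetime: "DATETIME",
}


def annotation_to_label(annotation: Any) -> str:
    """Map a type annotation to a type label, falling back to UNDEFINED."""
    origin = get_origin(annotation)
    if origin is Literal:
        # A Literal is typed by its values, but only if they all agree.
        labels = {annotation_to_label(type(v)) for v in get_args(annotation)}
        return labels.pop() if len(labels) == 1 else "UNDEFINED"
    if origin is Union:
        # Optional[X] is Union[X, None]; unwrap it when X is the only member.
        members = [a for a in get_args(annotation) if a is not type(None)]
        if len(members) == 1:
            return annotation_to_label(members[0])
        return "UNDEFINED"
    return _SCALARS.get(annotation, "UNDEFINED")
```

This does not cover the `X | None` syntax, containers, enums, or nested models, which would each need their own branch.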

@zeroSteiner
Owner

That sounds neat. To clarify though, is there a specific ask here? I've not used Pydantic at all and I've only just recently started using type annotations in my Python code but have yet to start using them in this project in particular.

@dabdine
Author

dabdine commented Jun 13, 2024

The ask is to natively support Pydantic by adding the Pydantic schema methods (__get_pydantic_core_schema__, __get_pydantic_json_schema__) to the Rule class. I was also just throwing out the thought that, if that's implemented, it's probably best done in a separate module so there isn't a dependency on Pydantic.

@zeroSteiner
Owner

That sounds cool. I like the idea of offering optional integrations with other popular libraries. I'm planning on doing something similar with SQLAlchemy eventually so I can get type info for my ORM models.

@zeroSteiner added the "feature request" (Requests for new functionality) label Jun 13, 2024
@Divjyot

Divjyot commented Jun 21, 2024

@zeroSteiner Thanks for creating this lib; I found it very easy to write rules in something close to English. This might be only slightly related to this issue, but I'd like to understand the following from the perspective of a regular user of Pydantic classes:

I am trying to write a type_resolver for my Pydantic class, which has fields of type str, int, Enum, and nested Pydantic models, such as:

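from datetime import datetime

from pydantic import BaseModel

# GenderEnum and LicTypeEnum are Enum subclasses defined elsewhere.
class LicenceModel(BaseModel):
    lic_number: int
    lic_type: LicTypeEnum

# Defined after LicenceModel so the `licence` annotation resolves.
class Person(BaseModel):
    name: str
    gender: GenderEnum
    dob: datetime
    licence: LicenceModel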

So, how can I create a rule-engine context for the Person data type?

context = rule_engine.Context(
    resolver=rule_engine.resolve_attribute,
    type_resolver=rule_engine.type_resolver_from_dict({
        'name': rule_engine.DataType.STRING,
        'gender': ??,
        'licence': ??,
        'dob': rule_engine.DataType.DATETIME,
    }),
)

1.1 I tried setting the type with { "licence.lic_number": rule_engine.DataType.FLOAT }; however, compiling Rule('person.lic_number == 123') failed with an AttributeResolutionError.

1.2 What types can be set for gender and licence? If that is not possible in the current version, should I convert the Pydantic model into a plain dict and then determine the types of gender and licence from that?

@zeroSteiner
Owner

Unfortunately, what you're trying to do depends on two things that are currently unsupported.

  1. This ticket: there is no Pydantic integration, and I don't know if I'll work on it because I don't use Pydantic. I might look into it to see how much work it'd be.
  2. There is no OBJECT data type for defining typed compound data. The existing compound types, e.g. ARRAY, MAPPING, and SET all require their member types to be the same. There isn't a ticket for this, but I am planning on implementing it in the next release, which will be v4.6. At that point, you'd at least be able to do this, albeit with a bit more effort because of the lack of Pydantic support. It'd likely involve defining a type like license_t = DataType.OBJECT('License', {'lic_number': DataType.FLOAT, 'lic_type': DataType.STRING}) and then person_t = DataType.OBJECT('Person', {'license': license_t, .... That's the plan anyways.
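To make the idea of typed compound data concrete, here is a minimal sketch that describes a nested schema with plain dicts and resolves a dotted path against it. It deliberately avoids the planned OBJECT type, which does not exist yet; the string labels, `PERSON_T`, and `resolve_path` are all hypothetical and are not rule-engine API.

```python
from typing import Dict, Union

# A leaf is a type label (placeholder string); a branch is a nested mapping.
TypeTree = Dict[str, Union[str, "TypeTree"]]

# Hypothetical description of the Person model from the question above.
PERSON_T: TypeTree = {
    "name": "STRING",
    "dob": "DATETIME",
    "licence": {
        "lic_number": "FLOAT",
        "lic_type": "STRING",
    },
}


def resolve_path(schema: TypeTree, path: str) -> Union[str, TypeTree]:
    """Walk a dotted path such as 'licence.lic_number' through the schema."""
    node: Union[str, TypeTree] = schema
    for part in path.split("."):
        # Only branches can be descended into, and the member must exist.
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"unknown symbol: {path}")
        node = node[part]
    return node
```

With this layout, `resolve_path(PERSON_T, "licence.lic_number")` yields the leaf label, while a path that skips the `licence` level is rejected because each member is typed individually at its own level.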


3 participants