Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a ValidationOptions::with_registry method #682

Open
rayokota opened this issue Jan 26, 2025 · 9 comments · May be fixed by #685
Open

Add a ValidationOptions::with_registry method #682

rayokota opened this issue Jan 26, 2025 · 9 comments · May be fixed by #685

Comments

@rayokota
Copy link

Hi, thanks for this library. I'm currently porting some code that uses the Python jsonschema package. I already have a Registry instance in hand and in Python I'm using the following validate method:

validate(instance=value, schema=parsed_schema, registry=ref_registry)

Would it be possible for this library to add a corresponding ValidationOptions::with_registry method to match the ability to pass a pre-built Registry instance to the validator?

@Stranger6667
Copy link
Owner

Hi @rayokota

Thank you for opening this issue! Yes, it is something I plan to do

@Stranger6667
Copy link
Owner

Right now there are ValidationOptions::with_resource and ValidationOptions::with_resources which allow for providing additional resources for reference resolution.

Would it be enough for your use case if they are exposed via Python bindings? at least in the first iteration. There are a few things I'd like to resolve with registries before exposing them via the jsonschema API. I.e. fewer allocations internally during construction + providing cheaper clones by wrapping the registry into Arc / moving it to hash array mapped trie for cheaper modifications (though I need to think more about this use case when such modifications are needed)

@rayokota
Copy link
Author

Thanks for looking into this @Stranger6667 , I really appreciate it.

Yes, I could use the ValidationOptions::with_resources API, but right now Registry does not have a way for me to get the resources out of it. Would it be possible to provide a method on Registry that returns an iterator of the resources so that I can pass the iterator into ValidationOptions::with_resources?

@Stranger6667
Copy link
Owner

Sure thing! I am happy to learn more about your use case!

To clarify, your main use case is still in Python, right?

If so, then it looks like referencing::Registry should also be available via Python bindings + Registry should provide a public method to iterate over resources. For example Registry::resources that returns impl Iterator<Item = (&Uri<String>, Arc<Resource>)> and corresponding Python method that will have a similar interface. I am not 100% sure about what Python API would be the most ergonomic here, but maybe if Registry ought to be exposed via bindings anyway, then maybe it would be better to add a registry argument to validate / validator_for / etc. And internally it will either iterate over those resources or maybe some ValidationOptions::with_registry could work too.

Could you share some code you are porting? It will help me to have a better overview of what could be done here to enable this use case for you.

@rayokota
Copy link
Author

rayokota commented Jan 31, 2025

Actually, my use case is I am porting Python code to pure Rust code. So the new Rust code will not rely on any of the old Python code.

More specifically, I am creating a pure Rust client that can talk to Confluent Schema Registry. I am porting a Python client that has these lines

In my Rust code, I have a Registry instance, and I want to call validate. So I either need a ValidationOptions::with_registry method (the preferred solution) or an iterator on the Registry to get all the resources, which I can then pass to ValidationOptions::with_resources.

So the use case is entirely in Rust (but mimics the Python code). I'm happy to answer any questions to clarify the use case. Thanks!

@rayokota
Copy link
Author

rayokota commented Jan 31, 2025

I was looking at the Python implementation, and I think that in order to support ValidationOptions::with_registry, this repo just needs to implement something like the following in Rust:

https://github.com/python-jsonschema/jsonschema/blob/e7bff41afab3e5a54f074e55124ae2421c345bea/jsonschema/validators.py#L284-L289

which relies on a Registry::combine method, which would also be useful in Rust

@Stranger6667 Stranger6667 linked a pull request Jan 31, 2025 that will close this issue
@Stranger6667
Copy link
Owner

Thank you for the context! I added a straightforward implementation in #685 which is quite inefficient but works. Let me know if that is sufficient

I'll add some changes to Python bindings too and will make a release. Later I plan to rework the internals of Registry and make it more performant (this is mostly around avoiding cloning subtrees for anchors via pinning). Hopefully the current implementation won't be too slow in your case

@rayokota
Copy link
Author

Yes, this works for me. Thanks again @Stranger6667 ! :)

@Stranger6667
Copy link
Owner

Stranger6667 commented Feb 1, 2025

I’ve spent some more time looking at it and made some progress with registry refactoring I wanted to do. It will take some more time but I’ll ship it together with the interface we discussed and some nice performance improvements for registries specifically (much fewer subtree clones during processing)

Also there will be some unsafe and I need to audit it better and add Miri to CI, however, the approach should work (it was suggested by jonhoo some time ago at a conference, so I have some more confidence that it is the most performant way to make it work)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants