Enhancement: Parameter to use JSON values as default values. #63

Open
zbalkan opened this issue Oct 7, 2022 · 7 comments

Comments

@zbalkan

zbalkan commented Oct 7, 2022

Summary

The JSON files used to generate a schema already include valid values. Accepting those values as defaults during conversion would make the schema easier to manage later. This should be optional, since treating them as defaults should be an intentional decision.

@wolverdude
Owner

That makes sense, though it isn't clear which value should be chosen as the default, or why. I suppose a collections.Counter object could be used to tally up the values and choose the most common one.
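The Counter idea could look roughly like the sketch below. It is not part of GenSON; `most_common_value` and the sample data are hypothetical names used only for illustration. Values are tallied by their canonical JSON encoding, on the assumption that some observed values (lists, objects) are not hashable.

```python
from collections import Counter
import json


def most_common_value(values):
    """Return the value seen most often; ties go to the first one seen."""
    # JSON values may be unhashable (lists, dicts), so tally a canonical
    # JSON string for each value rather than the value itself.
    tally = Counter(json.dumps(v, sort_keys=True) for v in values)
    encoded, _count = tally.most_common(1)[0]
    return json.loads(encoded)


# Hypothetical sample documents that would be fed to the schema generator.
samples = [{"port": 8080}, {"port": 8080}, {"port": 9090}]
default_port = most_common_value(s["port"] for s in samples)
```

This only picks the most frequent value; whether "most frequent" is a meaningful default is exactly the open question raised above.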

This touches on a similar feature, which might actually be better: examples. That would allow users to see not just what type of data shows up in the schema but the actual data itself.

There are some challenges inherent in this though, since you may want an example not only at leaf nodes but even some parent nodes (tuples, for example). GenSON is also structured in such a way as to make it easy to add a new type of node, but not easy to add a new type of keyword across all nodes. I'll have to think on how to make that a little easier before adding a feature like this.

@shaikmoeed

Hi @wolverdude, any update on this? It would be nice to have this option, and you mentioned exactly my requirement, i.e. examples, which is what brought me here.

@zbalkan
Author

zbalkan commented Sep 21, 2023

The default and examples fields provide two similar but semantically separate pieces of information. However, since neither is used in validation and both are only hints for the user, the end result would be the same. I can see value in implementing both, though it may be harder for a generator like this to handle the logic properly. Fingers crossed.

@shaikmoeed

@zbalkan I agree. Do you have any recommendations for getting an example field using other generators?

@zbalkan
Author

zbalkan commented Sep 21, 2023

No. I had to hack my way around it: parse the generated result, then add examples as a separate field in the schema. Since the schema is itself valid JSON, it is a relatively simple workaround.
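A workaround along these lines might look like the following sketch. It is not the code actually used in this thread: `add_examples`, the schema text, and the sample object are all hypothetical, and the walk assumes a simple schema where every node has a single string `type`.

```python
import json


def add_examples(schema, sample):
    """Return a copy of `schema` with an "examples" list on each scalar node."""
    annotated = dict(schema)
    node_type = schema.get("type")
    if node_type == "object":
        if isinstance(sample, dict):
            # Recurse only into properties that the sample actually contains.
            annotated["properties"] = {
                name: add_examples(sub, sample[name]) if name in sample else sub
                for name, sub in schema.get("properties", {}).items()
            }
    elif node_type == "array":
        items = schema.get("items")
        if isinstance(items, dict) and isinstance(sample, list) and sample:
            # Use the first element as the example for the item schema.
            annotated["items"] = add_examples(items, sample[0])
    elif isinstance(node_type, str):
        # Scalar node: record the observed value as its example.
        annotated["examples"] = [sample]
    return annotated


# Hypothetical generator output, parsed back from JSON text.
schema_text = """
{"type": "object",
 "properties": {"name": {"type": "string"},
                "tags": {"type": "array", "items": {"type": "string"}}},
 "required": ["name"]}
"""
schema = json.loads(schema_text)
sample = {"name": "demo", "tags": ["a", "b"]}
annotated = add_examples(schema, sample)
```

Nodes whose `type` is a list, or that use `anyOf`, are left untouched here; handling them would need extra rules.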

@wolverdude
Owner

Cool. It seems like there's decent demand for this. It depends on keyword extensions, and having those would also make it easy to add a whole lot of other features (i.e. every JSON-Schema keyword not currently supported), but it's a tricky problem. I've been pondering it on and off today.

At first, I thought keywords were a concern that cuts across SchemaStrategy, but they aren't: keywords are dependent on and somewhat logically entangled with SchemaStrategy. I eventually realized that SchemaStrategy is actually a special case of keywords, but modeling it that way in code would introduce more complexity than is needed, and it would break backwards compatibility.

The solution I'm settling on is to create a KeywordStrategy class that defines add_object(), add_schema(), and to_schema() methods, like SchemaStrategy does. Each such class can define one or more keywords. They can be attached to SchemaStrategys, and BaseSchemaStrategy will have join points to execute any attached KeywordStrategy code and integrate it into the resultant schema.
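To make the shape of that proposal concrete, here is a minimal, self-contained sketch. None of it is GenSON's actual API: `ExamplesKeyword`, `StringStrategy`, and the `keyword_strategies` argument are hypothetical stand-ins, and the real join points may end up looking quite different.

```python
class ExamplesKeyword:
    """Hypothetical keyword strategy: collects up to `limit` distinct examples."""

    keywords = ("examples",)

    def __init__(self, limit=3):
        self.limit = limit
        self.examples = []

    def add_object(self, obj):
        # Keep the first few distinct values seen.
        if obj not in self.examples and len(self.examples) < self.limit:
            self.examples.append(obj)

    def add_schema(self, schema):
        # Merge examples that already exist in an incoming schema.
        for example in schema.get("examples", []):
            self.add_object(example)

    def to_schema(self):
        return {"examples": list(self.examples)} if self.examples else {}


class StringStrategy:
    """Stand-in for a scalar schema strategy with join points for keywords."""

    def __init__(self, keyword_strategies=()):
        self.keyword_strategies = list(keyword_strategies)

    def add_object(self, obj):
        for keyword in self.keyword_strategies:  # join point
            keyword.add_object(obj)

    def add_schema(self, schema):
        for keyword in self.keyword_strategies:  # join point
            keyword.add_schema(schema)

    def to_schema(self):
        schema = {"type": "string"}
        for keyword in self.keyword_strategies:  # join point
            schema.update(keyword.to_schema())
        return schema


strategy = StringStrategy([ExamplesKeyword(limit=2)])
for value in ["a", "b", "a", "c"]:
    strategy.add_object(value)
result = strategy.to_schema()
```

In this sketch the keyword object holds its own state and the strategy only forwards calls, which is one way to keep keywords independent of any particular strategy.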

The nice thing about this is that it will modularize the existing code more and provide a way to extend for keywords specifically, independently of any SchemaStrategys. That said, extension in this model is a pain: one must subclass every existing SchemaStrategy in order to add a single keyword to them.

To make this easier, I'll create a way to conveniently apply a KeywordStrategy to all SchemaStrategys (and maybe another for all scalars). I haven't worked out exactly how that will work yet, but I figure this should cover most extension use-cases.

Then I could add example, but default would probably be a case for extension since there's no defined procedure to determine what the default value is.

Does that sound like it'll work for you?

@zbalkan
Author

zbalkan commented Sep 22, 2023

If I understood correctly, the alternative KeywordStrategy would add a lot of complexity. Since we have workarounds for now, there is no hurry.

When it comes to the default: if the source is a single JSON object with little complexity, whatever value it gives can be assumed to be the default. But if there's at least one array of objects, there are multiple candidates for default values. Inferring a default would then increase complexity without adding value.

I would suggest that finding a clear, simple, and working way of adding examples would be the better option. After that implementation, the pros and cons of adding a keyword would be clear, along with alternative ways to implement it.


3 participants