Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Deep nesting check #926

Closed
wants to merge 16 commits into from

Conversation

pjfanning
Copy link
Member

  • a proposal for how we might police the depth f input docs
  • if this approach is generally ok, I can add more javadoc and refactor the new test

@pjfanning pjfanning self-assigned this Feb 25, 2023
@pjfanning pjfanning marked this pull request as draft February 25, 2023 14:15
@@ -1547,4 +1549,19 @@ protected void loadMoreGuaranteed() throws IOException {

// Can't declare as deprecated, for now, but shouldn't be needed
protected void _finishString() throws IOException { }

protected final void createChildArrayContext(final int lineNr, final int colNr) {
_streamReadConstraints.validateDepth(++_depth);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to measure performance impact; if there is some, we could instead read max depth into local setting, inline check (that is, method local to parser).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes - I'll set up some benchmark testing

@cowtowncoder
Copy link
Member

Ah. So far I had been thinking of storing the depth layer in actual parsing context object (since it's always +1 of parent).
But it is true that we could instead just hold on to dynamic nesting in parser/generator, and avoid extra storage.
There is slightly higher risk of missed updates (everywhere where child context created/parent referenced we MUST update depth) but maybe it's worth it.

I think this approach makes sense; added some comments.

*
* @since 2.15
*/
public Builder maxDepth(final int maxDepth) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxDepth could work; another possibility maxNesting. Not sure what terminology is most commonly used with JSON parsers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this to maxNestingDepth

@pjfanning
Copy link
Member Author

@cowtowncoder I've done a little benchmark testing using https://github.com/pjfanning/jackson-nest-bench - the results are a little erratic. I don't really have dedicated hardware for doing accurate benchmark testing. I'll do some more runs.

One side issue is that I think I made mistake with using IllegalStateException in StreamReadConstraints. I used it in the validateString method and am using for consistency on the method for validating the nesting depth. Would it be ok to use a checked exception (one that does extend RuntimeException)? Maybe, JsonParseException. validateString is new in jackson 2.15. Changing the exception type will break some code in jackson-databind and the jackson-dataformat* jars but can be quickly fixed.

@cowtowncoder
Copy link
Member

@cowtowncoder I've done a little benchmark testing using https://github.com/pjfanning/jackson-nest-bench - the results are a little erratic. I don't really have dedicated hardware for doing accurate benchmark testing. I'll do some more runs.

Ok. I should probably try building and running on my end too, on my desktop, to get another datapoint.

One side issue is that I think I made mistake with using IllegalStateException in StreamReadConstraints. I used it in the validateString method and am using for consistency on the method for validating the nesting depth. Would it be ok to use a checked exception (one that does extend RuntimeException)? Maybe, JsonParseException. validateString is new in jackson 2.15. Changing the exception type will break some code in jackson-databind and the jackson-dataformat* jars but can be quickly fixed.

Ah yes, I have been thinking about this. I think that IllegalStateException is not optimal and we should throw a JsonParseException instead, yes.
For all violations.

@pjfanning
Copy link
Member Author

Looks like it would be difficult in Jackson v2 to use a checked exception, like JsonParseException, for the string length checks. This is because the exceptions are thrown in TextBuffer class and that class does not have declared exceptions - on methods like getContentsAsString() for instance. Maybe, we can change this in jackson-core v3?

@pjfanning
Copy link
Member Author

@cowtowncoder in https://github.com/pjfanning/jackson-nest-bench, there seems no benefit in moving the validate method so that the limit is stored locally (the V2 implemention in jackson-nest-bench). For some reason, there does appear to be a little bit of a slow down in the Reader based JsonParser but not much proportionally. The byte stream parsers seem to not slow down at all.

@cowtowncoder
Copy link
Member

Ok, one non-trivial concern: not all format implementations extend ParserBase -- some use MinimalParserBase.
For those, I think it's necessary to duplicate much of handling at layer above.
Not sure if there is any reasonable way around that, just noting the concern.

Good to know performance might not be affected: I should really try building and running jackson-benchmarks for 2.15 with and without modifications.

@pjfanning
Copy link
Member Author

The parsing context implementations seem quite different - so different that what I have put in ParserBase won't work for the use cases where ParserMinimalBase is used instead. I think I'll end up having to write specific implementations of the validations for each class that extends ParserMinimalBase.

I have moved the _nestingDepth to ParserMinimalBase.

@cowtowncoder
Copy link
Member

cowtowncoder commented Mar 5, 2023

@pjfanning Correct, different implementations are needed for various reasons, for better or worse.

One thing I managed to do is to run preliminary checks on jackson-benchmarks vs 2.14.2 and 2.15.0-SNAPSHOT (last full runs are from year ago for 2.13.3), and it does not look like performance of 2.15.0-SNAPSHOT before these changes is measurably slower for the benchmark(s) in question. For what that's worth.
But at least it is possible to isolate effects of changes; I can build from this branch/PR locally, see if I spot any immediate differences.

EDIT: no significant performance degradation with changes. Might have 2-3% overall reduction from 2.14 to 2.15 for json, but hard to say -- and definitely not due to nesting change, even if there is one.

@pjfanning
Copy link
Member Author

pjfanning commented Mar 5, 2023

close in favour of #943

@pjfanning pjfanning closed this Mar 5, 2023
@pjfanning pjfanning deleted the deep-nesting-test branch March 15, 2023 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants