JSON functions fail on large input due to Jackson's string length limit #17843

Closed
arghya18 opened this issue Jun 12, 2023 · 5 comments · Fixed by #17854
Labels: bug (Something isn't working), RELEASE-BLOCKER

Comments

@arghya18

As discussed in Slack:
https://trinodb.slack.com/archives/CGB0QHWSW/p1686050073739899
I am getting the error below from the json_extract_scalar function after upgrading from 407 to 419.

I also could not find a way to change the limit.

java.io.UncheckedIOException: com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (20054016) exceeds the maximum length (20000000)
	at io.trino.operator.scalar.JsonExtract.extract(JsonExtract.java:145)
	at io.trino.operator.scalar.JsonFunctions.varcharJsonExtractScalar(JsonFunctions.java:417)
	at io.trino.$gen.PageFilter_20230606_110342_81.filter(Unknown Source)
	at io.trino.$gen.PageFilter_20230606_110342_81.filter(Unknown Source)
	at io.trino.operator.project.PageProcessor.createWorkProcessor(PageProcessor.java:128)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.lambda$processPageSource$1(ScanFilterAndProjectOperator.java:291)
	at io.trino.operator.WorkProcessorUtils.lambda$flatMap$4(WorkProcessorUtils.java:286)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:360)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:347)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:347)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:262)
	at io.trino.operator.WorkProcessorUtils$BlockingProcess.process(WorkProcessorUtils.java:208)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.lambda$flatten$6(WorkProcessorUtils.java:318)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:360)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:347)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:262)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:241)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:262)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:256)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:146)
	at io.trino.operator.Driver.processInternal(Driver.java:402)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:305)
	at io.trino.operator.Driver.tryWithLock(Driver.java:701)
	at io.trino.operator.Driver.process(Driver.java:297)
	at io.trino.operator.Driver.processForDuration(Driver.java:268)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:556)
	at io.trino.$gen.Trino_419____20230606_104539_2.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (20054016) exceeds the maximum length (20000000)
	at com.fasterxml.jackson.core.StreamReadConstraints.validateStringLength(StreamReadConstraints.java:295)
	at com.fasterxml.jackson.core.util.ReadConstrainedTextBuffer.validateStringLength(ReadConstrainedTextBuffer.java:27)
	at com.fasterxml.jackson.core.util.TextBuffer.finishCurrentSegment(TextBuffer.java:939)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString2(ReaderBasedJsonParser.java:2240)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString(ReaderBasedJsonParser.java:2206)
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:323)
	at io.trino.operator.scalar.JsonExtract$ScalarValueJsonExtractor.extract(JsonExtract.java:278)
	at io.trino.operator.scalar.JsonExtract$ScalarValueJsonExtractor.extract(JsonExtract.java:264)
	at io.trino.operator.scalar.JsonExtract$ObjectFieldJsonExtractor.processJsonObject(JsonExtract.java:234)
	at io.trino.operator.scalar.JsonExtract$ObjectFieldJsonExtractor.extract(JsonExtract.java:208)
	at io.trino.operator.scalar.JsonExtract.extract(JsonExtract.java:137)
	... 39 more
@kokosing added the bug and RELEASE-BLOCKER labels Jun 12, 2023
@chenjian2664
Contributor

Related to FasterXML/jackson-core#863. In FasterXML/jackson-core#1014 the default value was increased to 20M.
What's our decision? Should we apply a limit on the supported JSON string length? @kokosing

@kokosing
Member

@wendigo @hashhar Are we going to follow the same approach as in airlift/airlift#1069?

@hashhar
Member

hashhar commented Jun 12, 2023

Yes, the solution would look similar. There are a lot of direct instantiations of JsonFactory in Trino. So we either need to add a provider for JsonFactory in Airlift as well, which we can use everywhere, or refactor Trino so that all JsonFactory creation passes through a single place, and then add a modernizer rule to disallow direct JsonFactory creation.
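
For illustration, the "single place" option could look roughly like the sketch below. This is only a sketch under assumptions: the class name `JsonFactories` and its methods are hypothetical, not the actual Trino or Airlift code, and it assumes Jackson 2.15 or later, where `StreamReadConstraints` exists. Raising the limit to `Integer.MAX_VALUE` is one possible choice, not a decision made in this thread.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonFactoryBuilder;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.StreamReadConstraints;

import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: the name and shape are illustrative only.
final class JsonFactories
{
    private JsonFactories() {}

    // Single creation point for JsonFactory, so the string length limit
    // is configured in one place instead of at every call site.
    public static JsonFactory createJsonFactory()
    {
        return new JsonFactoryBuilder()
                .streamReadConstraints(StreamReadConstraints.builder()
                        // Assumption: effectively lift Jackson's default limit
                        .maxStringLength(Integer.MAX_VALUE)
                        .build())
                .build();
    }

    // Reads a JSON document consisting of a single string value and
    // returns the length of that string.
    public static int readStringLength(JsonFactory factory, String json)
    {
        try (JsonParser parser = factory.createParser(json)) {
            parser.nextToken();
            return parser.getText().length();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a factory built this way, a string value of the size reported above (20054016 characters) should parse without hitting `StreamConstraintsException`, whereas a plain `new JsonFactory()` keeps Jackson's default limit.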

@hashhar changed the title from "Jackson String Length Error" to "JSON functions fail on large input due to Jackson's string length limit" Jun 12, 2023
@chenjian2664
Contributor

I'll take this

@chenjian2664 self-assigned this Jun 12, 2023
@kokosing
Member

Please coordinate with @wendigo who is also working on it.
