
Spark 3 - java.lang.ArrayStoreException: java.lang.invoke.SerializedLambda #84

Open
zHaytam opened this issue Apr 6, 2021 · 8 comments


zHaytam commented Apr 6, 2021

Hello,

We're trying to write a DataFrame to Redshift using Spark 3.0.1 (on EMR) and your connector, but we receive the following error (a sketch of the write call we're making follows the package list below):
WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, 10.80.139.254, executor 1): java.lang.ArrayStoreException: java.lang.invoke.SerializedLambda

Packages added:

  • com.amazon.redshift:redshift-jdbc42-no-awssdk:1.2.36.1060
  • io.github.spark-redshift-community:spark-redshift_2.12:4.2.0
  • org.apache.spark:spark-avro_2.12:3.0.1
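For reference, a minimal sketch of the kind of write we're doing; the connection details and data are placeholders, not our real values:

```scala
import org.apache.spark.sql.SparkSession

// Placeholder repro: assumes the connector and spark-avro jars from the
// package list above are on the classpath.
val spark = SparkSession.builder().appName("redshift-write-repro").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1), ("b", 2)).toDF("col1", "col2")

df.write
  .format("io.github.spark_redshift_community.spark.redshift")
  .option("url", "jdbc:redshift://example-cluster:5439/dev?user=USER&password=PASS")
  .option("dbtable", "public.example_table")
  .option("tempdir", "s3://example-bucket/redshift-scratch/")
  .option("forward_spark_s3_credentials", "true")
  .mode("append")
  .save()
```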
jsleight (Collaborator) commented Apr 6, 2021

(I also saw your Stack Overflow post, so reading a bit from there: you suspect it is crashing on the write to S3 with some pretty simple data.)

The write-to-S3 code is here; do you know which format you are writing with? That would help in building a narrower example.
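In case it helps, the intermediate format is controlled by the connector's `tempformat` option. A sketch of trying both, reusing `df` and the placeholder connection values from the snippet above:

```scala
// "AVRO" is the default tempformat; "CSV" (and "CSV GZIP") are the
// alternatives. Trying both shows whether the failure is format-specific.
val jdbcUrl = "jdbc:redshift://example-cluster:5439/dev?user=USER&password=PASS"
for (fmt <- Seq("AVRO", "CSV")) {
  df.write
    .format("io.github.spark_redshift_community.spark.redshift")
    .option("url", jdbcUrl)
    .option("dbtable", "public.example_table")
    .option("tempdir", s"s3://example-bucket/redshift-scratch/$fmt/")
    .option("tempformat", fmt)
    .option("forward_spark_s3_credentials", "true")
    .mode("append")
    .save()
}
```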

zHaytam (Author) commented Apr 6, 2021

Hello,

We tried with Avro (the default) and CSV; both throw the same exception. I also read that part of the source code, and I suspect it might be caused by either convertedRows or convertedSchema?

Thanks

jsleight (Collaborator) commented Apr 6, 2021

Could be, although the converters only handle decimal, date, and timestamp -- none of which are in your example. There is something about making the schema columns lowercase, though, which would affect your example -- could you see if all-lowercase column names help? Something like the sketch below.
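A one-liner sketch, assuming `df` is the frame being written:

```scala
// Rename every column to its lowercase form before the Redshift write.
val lowered = df.toDF(df.columns.map(_.toLowerCase): _*)
```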

Otherwise, I'd check that this isn't a case of Spark giving you a misleading error (e.g., via lazy execution the issue is actually somewhere else, and this was just the first Spark action). You could try swapping out the Redshift write for a plain S3 write to the same path.
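A sketch of that swap (path and format are illustrative):

```scala
// Bypass the connector entirely: write the same data to the same scratch
// path. If this also fails, the problem is upstream of spark-redshift.
df.write
  .format("avro") // or "csv"; mirrors the connector's tempformat choices
  .mode("overwrite")
  .save("s3://example-bucket/redshift-scratch/debug/")
```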

zHaytam (Author) commented Apr 6, 2021

All the columns in the dataframe we're trying to write are lowercase.
Also, we are able to write the dataframe to S3 at the same path (without the conversions).

jsleight (Collaborator) commented Apr 7, 2021

Do you have an example df? The example you linked on Stack Overflow has columns called ["ID", "TYPE", "CODE"], which are all uppercase. If you have decimal, date, or timestamp types in your df, then a bug in the converters seems more likely.

zHaytam (Author) commented Apr 7, 2021

The dataframe that we tried is this:

| name | id | type | count |
| ---- | -- | ---- | ----- |
| x    | 0  | cf   | 7     |

Nothing advanced.
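For completeness, a sketch that builds an equivalent frame (the column types here are whatever Spark infers; ours may differ):

```scala
// One-row frame matching the failing example above,
// assuming an active SparkSession named `spark`.
import spark.implicits._
val df = Seq(("x", 0, "cf", 7)).toDF("name", "id", "type", "count")
```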

jsleight (Collaborator) commented Apr 7, 2021

@88manpreet do you have any ideas? I don't see anything in the converters that should cause this error.

88manpreet (Collaborator) commented May 25, 2021

@zHaytam sorry for missing this and not getting back to it and prioritizing it earlier. Is this issue still happening?

I tried to reproduce it in the integration tests for both the Avro and CSV formats, which I think imitates the behavior above.
Diff: https://gist.github.com/88manpreet/8049611246ee306628dfc3e9df7eb2ad

I could see the temp files created in the scratch path for both formats.

I also didn't see anything obviously wrong with the converters; I will keep trying to reproduce this in different ways.
I also noticed that the redshift-jdbc42-no-awssdk driver you are using is the same one we use.

@zHaytam, in the meantime, is it possible to test this case with the latest version, v5.0.3?
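A sketch of the dependency bump in an sbt build (coordinates assumed to follow the same _2.12 artifact naming as 4.2.0):

```scala
// build.sbt sketch: bump the connector to 5.0.3 (assumed coordinates).
libraryDependencies ++= Seq(
  "io.github.spark-redshift-community" %% "spark-redshift" % "5.0.3",
  "org.apache.spark" %% "spark-avro" % "3.0.1"
)
```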

Would it also be possible for you to share a patch of the relevant code you are running that hits this scenario?
