iceberg's spark-runtime version 0.14 jar contains scala classes rather than 0.13 may cause ClassCastException #5732

Closed
KarlManong opened this issue Sep 9, 2022 · 11 comments · Fixed by #5754

Comments

@KarlManong
Contributor

Apache Iceberg version

0.14.0 (latest release)

Query engine

Other

Please describe the bug 🐞

When using kyuubi, I got an exception:

[Image: WeCom screenshot_7d656f74-2b0f-416d-b643-1816c8973e38]

After comparing versions 0.13.2 and 0.14.0, I found that the 0.14.x jar contains Scala classes, which cause some ClassCastExceptions.
image

@KarlManong changed the title from "iceberg's spark-runtime version 0.14 jar contains scala classes rather than 0.13 may cause ClassCastExceptions" to "iceberg's spark-runtime version 0.14 jar contains scala classes rather than 0.13 may cause ClassCastException" Sep 9, 2022
@Fokko
Contributor

Fokko commented Sep 11, 2022

@KarlManong do you have more information on the Spark version and the version of Iceberg JARs that you are using? Would be good to be able to replicate.

@ajantha-bhat
Member

image

I do see a scala folder in 0.14.0 when I compare spark-3.2 Iceberg runtime jars from 0.13.2 and 0.14.0
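
For anyone who wants to repeat this comparison without unpacking the jars by hand, here is a minimal sketch. A jar is a zip archive, so Python's standard `zipfile` module can list its entries. The function names are made up for illustration and are not part of Iceberg or Spark.

```python
import zipfile
from collections import Counter


def top_level_dirs(jar_path):
    """Count file entries per top-level directory inside a jar."""
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top = name.split("/", 1)[0] if "/" in name else "(root)"
            counts[top] += 1
    return counts


def scala_class_entries(jar_path):
    """Return the sorted .class entries stored under the unshaded scala/ package."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith("scala/") and name.endswith(".class")
        )
```

Calling `scala_class_entries(...)` on the locally downloaded 0.13.2 and 0.14.0 runtime jars (the paths depend on where you saved them) should show whether a `scala/` tree is present in one and absent in the other, which is what the screenshot above suggests.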

@jotarada

I hit a collision problem when building an assembly with Spark 3.2.1 and Scala 2.12.12 after moving from Iceberg runtime 0.13.2 to 0.14.0 or 0.14.1.

@ajantha-bhat
Member

@KarlManong : What are the steps to reproduce the issue?

@jotarada mentioned "in my case is doing sbt assembly, as it tries to build a fat jar with all the dependencies. Trying to add scala with existing scala in iceberg crashes"
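
The kind of collision described here can be checked before running the build. The sketch below is an illustration only: it assumes you have the candidate jars on disk and looks for `.class` paths that appear in more than one of them, which is the situation a fat-jar build has to resolve. `duplicate_classes` is a hypothetical helper name, not an sbt or Iceberg API.

```python
import zipfile
from collections import defaultdict


def duplicate_classes(jar_paths):
    """Map each .class entry found in more than one jar to the jars containing it."""
    owners = defaultdict(list)
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            # set() guards against the same entry being listed twice in one jar
            for name in set(jar.namelist()):
                if name.endswith(".class"):
                    owners[name].append(path)
    return {name: jars for name, jars in owners.items() if len(jars) > 1}
```

Passing the Iceberg runtime jar together with the Scala jars already on the build's classpath should list any overlapping `scala/...` classes, if the overlap is what triggers the assembly failure.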

@KarlManong
Contributor Author

The key point is setting "spark.executor.userClassPathFirst" to true.

@Fokko @ajantha-bhat I have uploaded files to https://gist.github.com/KarlManong/cb30a80624788f4f7804c232dc2f743c.

Steps:

  1. Change the metastore URI in the file 'spark-defaults.conf'
  2. Run docker build
  3. Run docker run -it <image-id> /bin/bash
  4. Execute ./bin/kyuubi start in the container
  5. Execute ./bin/beeline -u "jdbc:hive2://localhost:10009" in the container

Thank You!

@ajantha-bhat
Member

I finally tracked it down to #4009.

That change adds a dependency on scala.collection.compat, which brings Scala classes into the runtime jars (and also into the other Iceberg Spark jars).

@ajantha-bhat
Member

@KarlManong , @jotarada : Can you please build a runtime jar from this PR (#5754) and confirm whether the issue is resolved?

@KarlManong
Contributor Author

@ajantha-bhat Verified, thank you! Will this patch be backported to branch-0.14?

@ajantha-bhat
Member

> @ajantha-bhat Verified, thank you! Will this patch be backported to branch-0.14?

Even if we backport, it has to come via a release.
So, this fix will be available in the next release.

@Fokko
Contributor

Fokko commented Sep 15, 2022

@KarlManong You could give it a try using the development snapshot: apache/iceberg-docs#162

@Fokko
Contributor

Fokko commented Sep 16, 2022

I also added 0.14.2 as a discussion point for the upcoming sync: https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg/edit#
