I encountered an UnsupportedOperationException while running a Spark job on JDK 17 with Gluten's Velox backend on an ARM-based platform. The error occurs during execution and indicates that neither sun.misc.Unsafe nor the java.nio.DirectByteBuffer(long, int) constructor is available.
Error message:
org.apache.gluten.exception.GlutenException: Error during calling Java code from native code:
java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
Command used to run the Spark job:
spark-submit --class com.example.KMeansExample --properties-file spark-config.conf --jars /path/to/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar target/<spark-build.jar>
Full stack trace:
Caused by: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
at io.netty.util.internal.PlatformDependent.directBuffer(PlatformDependent.java:534)
at org.apache.gluten.vectorized.LowCopyFileSegmentJniByteInputStream.read(LowCopyFileSegmentJniByteInputStream.java:100)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.next0(ColumnarBatchOutIterator.java:62)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:51)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:53)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
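The failing frame is Netty's io.netty.util.internal.PlatformDependent.directBuffer(long, int), which throws this UnsupportedOperationException when Netty was unable to make the package-private java.nio.DirectByteBuffer(long, int) constructor accessible at JVM startup. On JDK 9+ that reflective access typically fails unless java.base/java.nio is opened to the unnamed module. A minimal diagnostic sketch, run with the same JVM options as the executors (the class name is illustrative; hasUnsafe() and hasDirectBufferNoCleanerConstructor() are public Netty APIs):

import io.netty.util.internal.PlatformDependent;

// Diagnostic sketch: prints whether Netty found sun.misc.Unsafe and whether
// it could make the DirectByteBuffer(long, int) constructor accessible.
// "false" on the second line is exactly the condition behind this exception.
public class NettyDirectBufferCheck {
    public static void main(String[] args) {
        System.out.println("hasUnsafe = " + PlatformDependent.hasUnsafe());
        System.out.println("hasDirectBufferNoCleanerConstructor = "
            + PlatformDependent.hasDirectBufferNoCleanerConstructor());
    }
}

If the second value is false under the configuration below but flips to true after adding --add-opens=java.base/java.nio=ALL-UNNAMED, the JVM options rather than Gluten itself are the likely culprit.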
Backend
VL (Velox)
Bug description
The job fails as described at the top of this report; see the error message, spark-submit command, and full stack trace there. Environment:
Gluten Version: 1.3.0-SNAPSHOT
Spark Version: 3.5.2
JDK Version: 17
Platform: ARM (AWS Graviton)
Backend: Velox
OS: Ubuntu 22.04
Spark version
Spark-3.5.x
Spark configurations
spark.executor.instances 1
spark.executor.cores 1
spark.task.cpus 1
spark.dynamicAllocation.enabled false
spark.cores.max 1
spark.executor.memory 56g
spark.driver.memory 4g
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 20g
spark.executor.memoryOverhead 1g
spark.driver.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.executor.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.plugins org.apache.gluten.GlutenPlugin
spark.gluten.sql.columnar.forceShuffledHashJoin true
spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager
spark.executor.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
spark.driver.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
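For reference, a hedged sketch of the extraJavaOptions that typically restore Netty's no-cleaner DirectByteBuffer path on JDK 17 (the exact flag set Gluten 1.3.0 needs is an assumption on my part, not verified against its docs). Note that --illegal-access=permit is obsolete on JDK 17 and ignored, so it can be dropped; the explicit --add-opens flags are what matter:

# Sketch, assuming the same spark-config.conf; flag set unverified for Gluten 1.3.0.
spark.driver.extraJavaOptions "-Dio.netty.tryReflectionSetAccessible=true --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
spark.executor.extraJavaOptions "-Dio.netty.tryReflectionSetAccessible=true --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"

As I understand it, -Dio.netty.tryReflectionSetAccessible=true only takes effect once java.base/java.nio is opened; without the --add-opens, setAccessible on the constructor throws InaccessibleObjectException and Netty falls back to the unsupported path seen in the trace.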
System information
No response
Relevant logs
See the full stack trace at the top of this report.