I encountered an UnsupportedOperationException while running a Spark job on JDK 17 with Gluten's Velox backend on an ARM-based platform. The error occurs during execution and indicates that neither sun.misc.Unsafe nor the java.nio.DirectByteBuffer(long, int) constructor is available.
Error message:
org.apache.gluten.exception.GlutenException: Error during calling Java code from native code:
java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
Command used to run the Spark job:
spark-submit --class com.example.KMeansExample --properties-file spark-config.conf --jars /path/to/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar target/<spark-build.jar>
Full stack trace:
Caused by: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: java.lang.UnsupportedOperationException: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available
at io.netty.util.internal.PlatformDependent.directBuffer(PlatformDependent.java:534)
at org.apache.gluten.vectorized.LowCopyFileSegmentJniByteInputStream.read(LowCopyFileSegmentJniByteInputStream.java:100)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeNext(Native Method)
at org.apache.gluten.vectorized.ColumnarBatchOutIterator.next0(ColumnarBatchOutIterator.java:62)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:51)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
at org.apache.gluten.iterator.ClosableIterator.next(ClosableIterator.java:53)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.liftedTree1$1(ColumnarBatchSerializer.scala:180)
at org.apache.gluten.vectorized.ColumnarBatchSerializerInstance$TaskDeserializationStream.readValue(ColumnarBatchSerializer.scala:179)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:188)
at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:185)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at scala.collection.AbstractIterator.isEmpty(Iterator.scala:1431)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:121)
at org.apache.gluten.execution.VeloxColumnarToRowExec.$anonfun$doExecuteInternal$1(VeloxColumnarToRowExec.scala:77)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:858)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:858)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
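The failing frame is Netty's io.netty.util.internal.PlatformDependent.directBuffer(long, int), which throws this UnsupportedOperationException when Netty was unable to make the package-private java.nio.DirectByteBuffer(long, int) constructor accessible at JVM startup. On JDK 9+ that reflective access typically fails unless java.base/java.nio is opened to the unnamed module. A minimal diagnostic sketch, run with the same JVM options as the executors (the class name is illustrative; hasUnsafe() and hasDirectBufferNoCleanerConstructor() are public Netty APIs):

import io.netty.util.internal.PlatformDependent;

// Diagnostic sketch: prints whether Netty found sun.misc.Unsafe and whether
// it could make the DirectByteBuffer(long, int) constructor accessible.
// "false" on the second line is exactly the condition behind this exception.
public class NettyDirectBufferCheck {
    public static void main(String[] args) {
        System.out.println("hasUnsafe = " + PlatformDependent.hasUnsafe());
        System.out.println("hasDirectBufferNoCleanerConstructor = "
            + PlatformDependent.hasDirectBufferNoCleanerConstructor());
    }
}

If the second value is false under the configuration below but flips to true after adding --add-opens=java.base/java.nio=ALL-UNNAMED, the JVM options rather than Gluten itself are the likely culprit.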
Backend
VL (Velox)
Bug description
The job fails as described at the top of this report; see the error message, spark-submit command, and full stack trace there. Environment:
Gluten Version: 1.3.0-SNAPSHOT
Spark Version: 3.5.2
JDK Version: 17
Platform: ARM (AWS Graviton)
Backend: Velox
OS: Ubuntu 22.04
Spark version
Spark-3.5.x
Spark configurations
spark.executor.instances 1
spark.executor.cores 1
spark.task.cpus 1
spark.dynamicAllocation.enabled false
spark.cores.max 1
spark.executor.memory 56g
spark.driver.memory 4g
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 20g
spark.executor.memoryOverhead 1g
spark.driver.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.executor.extraJavaOptions "--illegal-access=permit -Dio.netty.tryReflectionSetAccessible=true --add-opens java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED"
spark.plugins org.apache.gluten.GlutenPlugin
spark.gluten.sql.columnar.forceShuffledHashJoin true
spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager
spark.executor.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
spark.driver.extraClassPath '/pathto/gluten-velox-bundle-spark3.5_2.12-ubuntu_22.04_aarch_64-1.3.0-SNAPSHOT.jar'
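For reference, a hedged sketch of the extraJavaOptions that typically restore Netty's no-cleaner DirectByteBuffer path on JDK 17 (the exact flag set Gluten 1.3.0 needs is an assumption on my part, not verified against its docs). Note that --illegal-access=permit is obsolete on JDK 17 and ignored, so it can be dropped; the explicit --add-opens flags are what matter:

# Sketch, assuming the same spark-config.conf; flag set unverified for Gluten 1.3.0.
spark.driver.extraJavaOptions "-Dio.netty.tryReflectionSetAccessible=true --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"
spark.executor.extraJavaOptions "-Dio.netty.tryReflectionSetAccessible=true --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED"

As I understand it, -Dio.netty.tryReflectionSetAccessible=true only takes effect once java.base/java.nio is opened; without the --add-opens, setAccessible on the constructor throws InaccessibleObjectException and Netty falls back to the unsupported path seen in the trace.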
System information
No response
Relevant logs
See the full stack trace at the top of this report.