Allocation Error in PPML Image in Kubernetes: Initial Job Has Not Accepted Any Resource #5178
Comments
Hi @vi0eros BTW,
Thank you for your response. I have verified that I am indeed running on an SGX-enabled platform. However, my EPC memory is somewhat limited. To address this, I have set the following parameter:

    --conf spark.kubernetes.sgx.enabled=false

Specifically, I am wondering if the PPML image can run without SGX enabled, given that my EPC memory is quite limited. I have also removed the resources section from the executor pod template:

    apiVersion: v1
    kind: Pod
    spec:
      containers:
      - name: spark-executor
        env:
          - name: ATTESTATION
            value: false
          - name: ATTESTATION_URL
            value: your_attestation_url
          - name: MALLOC_ARENA_MAX
            value: 4
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
          - name: aesm-socket
            mountPath: /var/run/aesmd/aesm.socket
          - name: nfs-storage
            mountPath: /ppml/data
        # Removed resources section
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: aesm-socket
          hostPath:
            path: /var/run/aesmd/aesm.socket
        - name: nfs-storage
          persistentVolumeClaim:
            claimName: nfsvolumeclaim

Is it possible to run the PPML image in a non-SGX mode, and if so, are there any additional configurations or adjustments needed to ensure that it functions correctly under these conditions? Thank you for your assistance.
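As a quick check before resubmitting, the following sketch lists the lines of a pod template that still refer to SGX-related pieces after the resources section has been removed. The helper name `find_sgx_refs`, the file name `executor-template.yaml`, and the search pattern are assumptions chosen to match the names in the template above; they are not part of the PPML scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (with line numbers) every line of a pod
# template that mentions one of the SGX-related names seen in the template above.
find_sgx_refs() {
  local template="$1"
  # -n: show line numbers, -i: ignore case, -E: extended regular expression
  grep -niE 'sgx|aesm|device-plugin' "$template"
}

# Example (the file name is an assumption):
#   find_sgx_refs executor-template.yaml
```

Any line it prints, such as the `aesm-socket` or `device-plugin` mounts, is a place where the template may still depend on SGX components being present on the node.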
Hi @vi0eros The example and image you chose are designed to run Apache Spark within Intel SGX.
Given this information, I have decided to switch to using an Apache Spark image for my needs. This resolves my issue.
You are welcome. :)
Description:
When executing the script ppml/trusted-bigdata/scripts/start-pyspark-pi-on-k8s-client-sgx.sh using the $RUNTIME_K8S_SPARK_IMAGE image, I encountered a resource allocation error. The error log shows the following message repeatedly:
However, when using the apache/spark:v3.1.3 image, the job runs successfully and completes as expected.
Script Details:
The script ppml/trusted-bigdata/scripts/start-pyspark-pi-on-k8s-client-sgx.sh is as follows:
Logs with PPML Image:
Script with Apache Spark Image:
Logs with Apache Spark Image:
Additional Information:
Building the PPML Image:
The PPML image is built using the following script, ppml/trusted-bigdata/custom-image/build-custom-image.sh:
Starting the PPML Image:
The PPML image is started using the following commands:
I am a beginner, please help me.
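For anyone hitting a similar allocation error, one way to narrow it down is to check whether the executor pods are stuck in `Pending` and what the scheduler reports for them. The sketch below is only an illustration: the function names and the `default` namespace are assumptions, and it relies on the standard column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/usr/bin/env bash
# Print the names of pods whose STATUS column is "Pending".
# Reads `kubectl get pods --no-headers` style text on stdin.
filter_pending() {
  awk '$3 == "Pending" { print $1 }'
}

# List pending pods in a namespace (defaults to "default"; adjust as needed).
list_pending_pods() {
  local ns="${1:-default}"
  kubectl get pods -n "$ns" --no-headers | filter_pending
}

# Print the Events section of one pod's description, where the scheduler
# explains why it could not place the pod (for example, an unsatisfied
# resource request).
show_pod_events() {
  local ns="$1" pod="$2"
  kubectl describe pod -n "$ns" "$pod" | sed -n '/^Events:/,$p'
}
```

A typical use would be to run `list_pending_pods default` and then `show_pod_events default <pod-name>` for each executor pod it prints. If the events mention an insufficient SGX resource, the pod is requesting more than the node can offer.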