how to check Federated Learning Job is working? #419
Comments
@MooreZheng @JoeyHwong-gk please give me some suggestions.
Hi there. In the provided example, the federated learning job simulates independent training on two separate edge nodes, each using its own training data. The federated learning process then combines the weights of these independently trained models on the cloud, which satisfies the requirements of federated learning. After job completion, you can find the merged model's weights in the
Hope this helps! Feel free to ask if you have further questions.
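To make the "combine weights on the cloud" step concrete, here is a minimal sketch of sample-weighted averaging (often called FedAvg). The thread does not say which aggregation rule the example job uses, so this is an illustration of the general idea rather than Sedna's implementation. The function name `federated_average` and the flat dict-of-lists weight layout are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: each model is a mapping from layer name to a flat list of weights.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      sample_counts: Sequence[int]) -> Weights:
    """Average per-layer weights, weighting each client by its sample count."""
    if not client_weights:
        raise ValueError("need at least one client model")
    if len(client_weights) != len(sample_counts):
        raise ValueError("one sample count is required per client model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    merged: Weights = {}
    for name, reference in client_weights[0].items():
        merged[name] = [
            # Weighted sum over clients for the i-th weight of this layer.
            sum(w[name][i] * n for w, n in zip(client_weights, sample_counts)) / total
            for i in range(len(reference))
        ]
    return merged


if __name__ == "__main__":
    edge1 = {"layer": [1.0, 2.0]}  # weights trained on the first edge node
    edge2 = {"layer": [3.0, 4.0]}  # weights trained on the second edge node
    print(federated_average([edge1, edge2], [1, 3]))
```

With two edge nodes, a node that trained on three times as many samples pulls the merged weights three times as strongly toward its own values.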
Thanks for your reply, @JoeyHwong-gk. I have two more questions. Q1: Which version of the train and aggregation images is recommended? I use isula instead of docker. After directly
For Q1: For Q2: Feel free to review the logs for insights into the job's status and any potential issues. If you encounter specific errors in the logs, please provide those details for further assistance.
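Besides the logs, the job object itself can be inspected. `kubectl get federatedlearningjob surface-defect-detection -o json` prints the full resource, including any `status` section the controller has written. The sketch below summarizes that JSON. It assumes the status follows the common Kubernetes `conditions` convention (`type`, `status`, `message`); the thread does not confirm which fields Sedna actually reports, so the helper `summarize_job_status` and the sample field values are hypothetical.

```python
import json
from typing import Any, Dict, List


def summarize_job_status(job_json: str) -> List[str]:
    """Turn the JSON of a job resource into short, readable status lines."""
    job: Dict[str, Any] = json.loads(job_json)
    name = job.get("metadata", {}).get("name", "<unknown>")
    status = job.get("status") or {}
    if not status:
        # The controller has not written anything yet (or is not running).
        return [f"{name}: no status reported yet"]

    lines = []
    # Assumption: conditions follow the usual Kubernetes shape.
    for cond in status.get("conditions", []):
        line = f"{name}: {cond.get('type', '?')}={cond.get('status', '?')} {cond.get('message', '')}"
        lines.append(line.rstrip())
    if not lines:
        # Fall back to dumping whatever the status contains.
        lines.append(f"{name}: status without conditions: {json.dumps(status, sort_keys=True)}")
    return lines


if __name__ == "__main__":
    # Hypothetical sample; real output comes from `kubectl get ... -o json`.
    sample = json.dumps({
        "metadata": {"name": "surface-defect-detection"},
        "status": {"conditions": [{"type": "Running", "status": "True"}]},
    })
    for line in summarize_job_status(sample):
        print(line)
```

An empty or missing `status` is itself informative: it suggests the controller has not picked the job up, which is a good moment to check the controller and worker logs.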
/assign @jaypume
Thanks for your reply, @JoeyHwong-gk @jaypume. For Q1: The relevant information is as follows. On EDGE1_NODE,
and
I apologize for the trouble you had pulling the images. Network issues may have caused the problem. As an alternative, I recommend building the images locally with the build_image.sh script, which avoids pulling them over the network.
What happened:
I followed "Using Federated Learning Job in Surface Defect Detection Scenario". The last step says: "After the job completed, you will find the model generated on the directory /model in $EDGE1_NODE and $EDGE2_NODE."
How can I check whether the job has completed or is still working? Running
kubectl get federatedlearningjob surface-defect-detection
only shows NAME and AGE.
Environment:
openEuler 22.03 LTS
kubernetes v1.21.1
kubeedge v1.14.2
edgemesh v1.14.0
sedna v0.6.0
Sedna Version
Kubernetes Version
KubeEdge Version
CloudSide Environment:
Hardware configuration
OS
Kernel
Others
EdgeSide Environment:
Hardware configuration
OS
Kernel
Others