v0.6.0 release
Highlights
- RayService
  - RayService now supports the Ray Serve multi-app API (#1136, #1156)
  - RayService stability improvements (#1231, #1207, #1173)
  - RayService observability (#1230)
  - RayService examples
    - [RayService] Stable Diffusion example (#1181, @kevin85421)
    - MobileNet example (#1175, @kevin85421)
  - RayService troubleshooting handbook (#1221)
- RayJob refactoring (#1177)
RayService
- [RayService][Observability] Add more logging for RayService troubleshooting (#1230, @kevin85421)
- [Bug] Long image pull time will trigger blue-green upgrade after the head is ready (#1231, @kevin85421)
- [RayService] Stable Diffusion example (#1181, @kevin85421)
- [RayService] Update docs to use multi-app (#1179, @zcin)
- [RayService] Change runtime env for e2e autoscaling test (#1178, @zcin)
- [RayService] Add e2e tests (#1167, @zcin)
- [RayService][docs] Improve explanation for config file and in-place updates (#1229, @zcin)
- [RayService][Doc] RayService troubleshooting handbook (#1221, @kevin85421)
- [Doc] Improve RayService doc (#1235, @kevin85421)
- [Doc] Improve FAQ page and RayService troubleshooting guide (#1225, @kevin85421)
- [RayService] Add RayService alb ingress CR (#1169, @sihanwang41)
- [RayService] Add support for multi-app config in yaml-string format (#1156, @zcin)
- [rayservice] Add support for getting multi-app status (#1136, @zcin)
- [Refactor] Remove Dashboard Agent service (#1207, @kevin85421)
- [Bug] KubeRay operator fails to get serve deployment status due to 500 Internal Server Error (#1173, @kevin85421)
- MobileNet example (#1175, @kevin85421)
- [Bug] fix RayActorOptionSpec.items.spec.serveConfig.deployments.rayActorOptions.memory int32 data type (#1220, @kevin85421)
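
To make the multi-app yaml-string support (#1156) more concrete, here is a minimal sketch of what such a RayService manifest might look like. The field name `serveConfigV2`, the `ray.io/v1alpha1` API version, and all application names and import paths are assumptions for illustration and are not taken from these notes; consult the RayService samples for the authoritative form.

```yaml
# Sketch only: a partial RayService manifest, not a complete or official sample.
apiVersion: ray.io/v1alpha1
kind: RayService
metadata:
  name: example-rayservice            # hypothetical name
spec:
  # Assumption: the multi-app Serve config is passed as a YAML string in this field.
  serveConfigV2: |
    applications:
      - name: app1                    # hypothetical application
        import_path: my_module.app1   # hypothetical import path
        route_prefix: /app1
      - name: app2                    # hypothetical application
        import_path: my_module.app2   # hypothetical import path
        route_prefix: /app2
  # rayClusterConfig is omitted for brevity; a real manifest needs it.
```

Each entry under `applications` describes one Serve application with its own route prefix, which is what distinguishes the multi-app form from a single-application config.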
RayJob
- [RayJob] Submit job using K8s job instead of checking Status and using DashboardHTTPClient (#1177, @architkulkarni)
- [Doc] [RayJob] Add documentation for submitterPodTemplate (#1228, @architkulkarni)
Autoscaler
- [release blocker][Feature] Only Autoscaler can make decisions to delete Pods (#1253, @kevin85421)
- [release blocker][Autoscaler] Randomly delete Pods when scaling down the cluster (#1251, @kevin85421)
Helm
- [Helm][RBAC] Introduce the option crNamespacedRbacEnable to enable or disable the creation of Role/RoleBinding for RayCluster preparation (#1162, @kevin85421)
- [Bug] Allow zero replica for workers for Helm (#968, @ducviet00)
- [Bug] KubeRay tries to create ClusterRoleBinding when singleNamespaceInstall and rbacEnable are set to true (#1190, @kevin85421)
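
As a rough illustration of the Helm options named above, a values override might look like the following. The option names come from the entries above; their placement as top-level keys and the values chosen are assumptions, so check the chart's values.yaml for your version.

```yaml
# Hypothetical values override for the KubeRay operator Helm chart.
# Assumption: all three options are top-level keys in values.yaml.

# Create RBAC resources for the operator.
rbacEnable: true

# Install the operator for a single namespace. With the fix in #1190, this
# combination should no longer attempt to create a ClusterRoleBinding.
singleNamespaceInstall: true

# Toggle creation of the Role/RoleBinding used for RayCluster preparation (#1162).
crNamespacedRbacEnable: true
```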
KubeRay API Server
- Add support for openshift routes (#1183, @blublinsky)
- Adding API server support for service account (#1148, @blublinsky)
Documentation
- [release v0.6.0] Update tags and versions (#1270, @kevin85421)
- [release v0.6.0-rc.1] Update tags and versions (#1264, @kevin85421)
- [release v0.6.0-rc.0] Update tags and versions (#1237, @kevin85421)
- [Doc] Develop Ray Serve Python script on KubeRay (#1250, @kevin85421)
- [Doc] Fix the order of comments in sample Job YAML file (#1242, @architkulkarni)
- [Doc] Upload a screenshot for the Serve page in Ray dashboard (#1236, @kevin85421)
- [Doc] GKE GPU cluster setup (#1223, @kevin85421)
- [Doc][Website] Add complete document link (#1224, @yuxiaoba)
- Add FAQ page (#1150, @Yicheng-Lu-llll)
- [Doc] Add gofumpt lint instructions (#1180, @architkulkarni)
- [Doc] Add `helm update` command to chart validation step in release process (#1165, @architkulkarni)
- [Doc] Add git fetch --tags command to release instructions (#1164, @architkulkarni)
- Add KubeRay related blogs (#1147, @tedhtchang)
- [2.5.0 Release] Change version numbers 2.4.0 -> 2.5.0 (#1151, @ArturNiederfahrenhorst)
- [Sample YAML] Bump ray version in pod security YAML to 2.4.0 (#1160, @architkulkarni)
- Add instruction to skip unit tests in DEVELOPMENT.md (#1171, @architkulkarni)
- Fix typo (#1241, @mmourafiq)
- Fix typo (#1232, @mmourafiq)
CI
- [CI] Add `kind`-in-Docker test to Buildkite CI (#1243, @architkulkarni)
- [CI] Remove unnecessary release.yaml workflow (#1168, @architkulkarni)
Others
- Pin operator version in single namespace installation (#1193) (#1210, @wjzhou)
- RayCluster updates status frequently (#1211, @kevin85421)
- Improve the observability of the init container (#1149, @Yicheng-Lu-llll)
- [Ray Observability] Disk usage in Dashboard (#1152, @kevin85421)