Releases: kubeflow/arena

v0.4.0

12 May 04:03
829b0e9
  1. Add GPU support for PS
  2. Support Kubernetes 1.18 and above
  3. Fix a bug in deploying Prometheus

Please follow the Get started Guide to install.

v0.3.3

16 Mar 02:14
dfc8706
  1. Support non-root installation
  2. Add train init framework
  3. Fix a bug when using Estimator

Please follow the Get started Guide to install.

v0.3.2

15 Feb 09:46
f80d615
  1. Fix evaluator & chief validation
  2. Fix the incorrect CPU resource variable; it should be psCPU
  3. Set the exit code to 2 when deleting a job fails

Please follow the Get started Guide to install.

v0.3.1

25 Dec 14:03
b96e1ac
  1. Upgrade Deployment version from extensions/v1beta1 to apps/v1
  2. Fix the incorrect number of allocated GPUs
  3. Upgrade Helm to v2.14.1

Please follow the Get started Guide to install.

v0.3.1-beta

02 Dec 11:40
Pre-release

Some bugs are fixed.

Please follow the Get started Guide to install.

v0.3.1-alpha

23 Aug 03:14
Pre-release

New Features:

Please follow the Get started Guide to install.

v0.3.0

19 Aug 03:34

New Features:

  • Add Priority class support for MPIJob and TFJob
  • Display Unhealthy GPU devices
  • Support custom serving
  • Add tarball installation for Linux and Mac

Please follow the Get started Guide to install.

v0.2.0

15 Jul 12:51
  • Support Spark and Volcano jobs
  • Support multiple users and add PodSecurityContext for training jobs
  • Support TensorRT
  • Add priorities and preemption for MPIJob
  • Refactor code to remove the dependency on helm create
  • Enhance cluster management

v0.1.0

21 Jan 02:45
  • Support TFJob and MPIJob for training
  • Support GPU and RDMA
  • Support serving
  • Manage the training job life cycle, including job status, logs, and GPU usage
  • Manage GPU resources in the cluster

v0.1.0-rc.3

19 Jan 03:42
Pre-release
  • Support mpi_operator
  • Monitor GPU usage during training
  • Support serving