An opaque token associated with an entity object used to implement optimistic concurrency control.
Field | Type | Label | Description |
---|---|---|---|
value | string |
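
The sketch below illustrates how an opaque version token can support optimistic concurrency control. The `Store` class, its `read` and `write` methods, and the `v<N>` token format are hypothetical and exist only for illustration; they are not part of the Peloton API, and real clients should never parse the token.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when the caller's token no longer matches the stored one."""


@dataclass
class Store:
    """Hypothetical in-memory entity store guarded by a version token."""
    spec: str = ""
    _counter: int = 0

    @property
    def token(self) -> str:
        # The server chooses the token format; clients treat it as opaque.
        return f"v{self._counter}"

    def read(self):
        return self.spec, self.token

    def write(self, new_spec: str, expected_token: str) -> str:
        # Reject the write if another writer updated the entity in between.
        if expected_token != self.token:
            raise VersionConflict(f"expected {expected_token}, have {self.token}")
        self.spec = new_spec
        self._counter += 1
        return self.token


store = Store(spec="a")
_, token = store.read()
new_token = store.write("b", token)   # succeeds and advances the token
try:
    store.write("c", token)           # stale token, rejected
    conflict = False
except VersionConflict:
    conflict = True
```

On a conflict, a client would typically re-read the entity, reapply its change, and retry with the fresh token.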
A unique ID assigned to offers from a host.
Field | Type | Label | Description |
---|---|---|---|
value | string |
A unique ID assigned to a Job. This is a UUID in RFC4122 format.
Field | Type | Label | Description |
---|---|---|---|
value | string |
Key, value pair used to store free form user-data.
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | string |
Opaque data passed to Peloton from the client. Passing an empty string in the structure will unset the existing data.
Field | Type | Label | Description |
---|---|---|---|
data | string |
A unique ID assigned to a pod. It should be treated as an opaque token.
Field | Type | Label | Description |
---|---|---|---|
value | string |
A unique name assigned to a pod. By default, the pod name is in the format of JobID-<InstanceID>.
Field | Type | Label | Description |
---|---|---|---|
value | string |
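
As a rough illustration of the default naming scheme, the helper below splits a pod name into its job ID and instance ID. The function name `split_pod_name` is hypothetical, and the sketch assumes the pod name follows the default `JobID-<InstanceID>` format with an RFC4122 job ID; names that deviate from the default cannot be parsed this way.

```python
import uuid


def split_pod_name(pod_name: str):
    """Split 'JobID-<InstanceID>' into (job_id, instance_id)."""
    # The job ID itself contains hyphens, so split on the last one only.
    job_id, sep, instance = pod_name.rpartition("-")
    if not sep or not instance.isdecimal():
        raise ValueError(f"not in JobID-<InstanceID> form: {pod_name!r}")
    uuid.UUID(job_id)  # raises ValueError if the job ID is not a UUID
    return job_id, int(instance)


job_id, instance_id = split_pod_name("91b1a9e3-9a3c-4b1f-8a6d-2f4c7e5d1a00-3")
```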
A unique ID assigned to a Resource Pool. This is a UUID in RFC4122 format.
Field | Type | Label | Description |
---|---|---|---|
value | string |
Revision of an entity info, such as JobSpec etc.
Field | Type | Label | Description |
---|---|---|---|
version | uint64 | Version number of the entity info which is monotonically increasing. Clients can use this to guard against race conditions using MVCC. | |
created_at | uint64 | The timestamp when the entity info is created | |
updated_at | uint64 | The timestamp when the entity info is updated | |
updated_by | string | The user or service that updated the entity info |
Secret is used to store secrets per job. It contains the ID, the absolute container mount path, and the base64-encoded secret data.
Field | Type | Label | Description |
---|---|---|---|
secret_id | SecretID | UUID of the secret | |
path | string | Path at which the secret file will be mounted in the container | |
value | Secret.Value | Secret value |
Field | Type | Label | Description |
---|---|---|---|
data | bytes | Secret data as byte array |
A unique ID assigned to a Secret. This is a UUID in RFC4122 format.
Field | Type | Label | Description |
---|---|---|---|
value | string |
Time range specified by min and max timestamps. Time range is left closed and right open: [min, max)
Field | Type | Label | Description |
---|---|---|---|
min | .google.protobuf.Timestamp | ||
max | .google.protobuf.Timestamp |
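
The left-closed, right-open semantics can be sketched as a simple membership test. The function name `in_time_range` is hypothetical, and plain `datetime` values stand in for `google.protobuf.Timestamp`.

```python
from datetime import datetime, timezone


def in_time_range(ts, range_min, range_max):
    """Return True if ts lies in the half-open interval [range_min, range_max)."""
    return range_min <= ts < range_max


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 2, tzinfo=timezone.utc)
```

With these values, `start` itself is inside the range while `end` is not, so adjacent ranges never overlap.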
A unique ID assigned to a Volume. This is a UUID in RFC4122 format.
Field | Type | Label | Description |
---|---|---|---|
value | string |
Order by clause of a query
Field | Type | Label | Description |
---|---|---|---|
order | OrderBy.Order | ||
property | PropertyPath |
Generic pagination for a list of records to be returned by a query
Field | Type | Label | Description |
---|---|---|---|
offset | uint32 | Offset of the pagination for a query result | |
limit | uint32 | Limit of the pagination for a query result | |
total | uint32 | Total number of records for a query result |
Pagination query spec used as an argument to queries that return a Pagination result.
Field | Type | Label | Description |
---|---|---|---|
offset | uint32 | Offset of the query for pagination | |
limit | uint32 | Limit per page of the query for pagination | |
order_by | OrderBy | repeated | List of fields to order by, in sequence |
max_limit | uint32 | Max limit of the pagination result. |
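
The following sketch shows one way the offset, limit, and order_by fields could interact, and how a Pagination result might be derived from them. The function `run_query` and the dictionary-based records are hypothetical; `max_limit` is deliberately not modeled because its exact semantics are not stated here.

```python
def run_query(records, offset, limit, order_by=()):
    """Sort records by the given (property, order) pairs, then cut one page."""
    rows = list(records)
    # Python's sort is stable, so applying the keys from last to first
    # makes the first key in order_by the most significant one.
    for prop, order in reversed(order_by):
        rows.sort(key=lambda r: r[prop], reverse=(order == "ORDER_BY_DESC"))
    page = rows[offset:offset + limit]
    pagination = {"offset": offset, "limit": limit, "total": len(rows)}
    return page, pagination


records = [
    {"name": "b", "cpu": 1},
    {"name": "a", "cpu": 2},
    {"name": "c", "cpu": 2},
]
page, pagination = run_query(
    records,
    offset=1,
    limit=2,
    order_by=[("cpu", "ORDER_BY_DESC"), ("name", "ORDER_BY_ASC")],
)
```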
A dot-separated path to an object property, such as config.name or runtime.creationTime for a job object.
Field | Type | Label | Description |
---|---|---|---|
value | string |
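
A minimal sketch of how a dot-separated property path could be resolved against a nested object. The function `resolve_property` and the sample job dictionary are hypothetical stand-ins for the real job object.

```python
def resolve_property(obj, path):
    """Walk a nested dict along a dot-separated property path."""
    current = obj
    for part in path.split("."):
        current = current[part]  # raises KeyError for an unknown property
    return current


job = {
    "config": {"name": "nightly-report"},
    "runtime": {"creationTime": "2020-01-01T00:00:00Z"},
}
```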
Name | Number | Description |
---|---|---|
ORDER_BY_INVALID | 0 | |
ORDER_BY_ASC | 1 | |
ORDER_BY_DESC | 2 |
AndConstraint represents a logical 'and' of constraints.
Field | Type | Label | Description |
---|---|---|---|
constraints | Constraint | repeated |
Constraint represents a host label constraint or a related pods label constraint. This is used to require that a host have certain label constraints or to require that the pods already running on the host have certain label constraints.
Field | Type | Label | Description |
---|---|---|---|
type | Constraint.Type | ||
label_constraint | LabelConstraint | ||
and_constraint | AndConstraint | ||
or_constraint | OrConstraint |
A single application container running inside a pod
Field | Type | Label | Description |
---|---|---|---|
name | string | Name of the container. Each container in a pod must have a unique name. Cannot be updated. | |
resource | ResourceSpec | Resource config of the container | |
container | .mesos.v1.ContainerInfo | Container config of the container | |
command | .mesos.v1.CommandInfo | Command line config of the container | |
executor | .mesos.v1.ExecutorInfo | Custom executor config of the task. | |
liveness_check | HealthCheckSpec | Liveness health check config of the container | |
readiness_check | HealthCheckSpec | Readiness health check config of the container. This is currently not supported. | |
ports | PortSpec | repeated | List of network ports to be allocated for the pod |
Runtime status of a container in a pod
Field | Type | Label | Description |
---|---|---|---|
name | string | Name of the container | |
state | ContainerState | Runtime state of the container | |
ports | ContainerStatus.PortsEntry | repeated | Dynamic ports reserved on the host while this container is running |
message | string | The message that explains the current state of a container, such as why the container failed. Only the latest message is tracked if the container has been retried and failed multiple times. | |
reason | string | The reason that explains the current state of a container. Only the latest reason is tracked if the container has been retried and failed multiple times. | |
failure_count | uint32 | The number of times the container has failed after retries. | |
healthy | HealthStatus | The result of the health check | |
image | string | The image the container is running | |
start_time | string | The time when the container starts to run. Will be unset if the pod hasn't started running yet. The time is represented in RFC3339 form with UTC timezone. | |
completion_time | string | The time when the container terminated. Will be unset if the pod hasn't completed yet. The time is represented in RFC3339 form with UTC timezone. | |
terminationStatus | TerminationStatus | Termination status of the task. Set only if the task is in a non-successful terminal state such as CONTAINER_STATE_FAILED or CONTAINER_STATE_KILLED. |
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | uint32 |
Health check configuration for a container
Field | Type | Label | Description |
---|---|---|---|
enabled | bool | Whether the health check is enabled. | |
initial_interval_secs | uint32 | Start time wait in seconds. Zero or empty value would use default value of 15 from Mesos. | |
interval_secs | uint32 | Interval in seconds between two health checks. Zero or empty value would use default value of 10 from Mesos. | |
max_consecutive_failures | uint32 | Max number of consecutive failures before failing health check. Zero or empty value would use default value of 3 from Mesos. | |
timeout_secs | uint32 | Health check command timeout in seconds. Zero or empty value would use default value of 20 from Mesos. | |
type | HealthCheckSpec.HealthCheckType | ||
command_check | HealthCheckSpec.CommandCheck | Only applicable when type is COMMAND . |
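
The defaulting rules in the table above can be sketched as follows. The function `effective_health_check` and the dictionary representation of the spec are hypothetical; the default values are the ones listed in the field descriptions.

```python
# Defaults taken from the field descriptions above.
MESOS_DEFAULTS = {
    "initial_interval_secs": 15,
    "interval_secs": 10,
    "max_consecutive_failures": 3,
    "timeout_secs": 20,
}


def effective_health_check(spec: dict) -> dict:
    """Replace zero or missing timing fields with their default values."""
    return {
        field: spec.get(field) or default
        for field, default in MESOS_DEFAULTS.items()
    }


effective = effective_health_check({"interval_secs": 30, "timeout_secs": 0})
```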
Field | Type | Label | Description |
---|---|---|---|
command | string | Health check command to be executed. Note that this command by default inherits all environment variables from the container it's monitoring, unless unshare_environments is set to true. | |
unshare_environments | bool | If set, this check will not share the environment variables of the container. |
The result of the health check
Field | Type | Label | Description |
---|---|---|---|
state | HealthState | The health check state | |
output | string | The output of the health check run |
Pod InstanceID range [from, to)
Field | Type | Label | Description |
---|---|---|---|
from | uint32 | ||
to | uint32 |
LabelConstraint represents a constraint on the number of occurrences of a given label from the set of host labels or pod labels present on the host.
Field | Type | Label | Description |
---|---|---|---|
kind | LabelConstraint.Kind | Determines which labels the constraint should apply to. | |
condition | LabelConstraint.Condition | Determines which constraint there should be on the number of occurrences of the label. | |
label | .peloton.api.v1alpha.peloton.Label | The label which this defines a constraint on. For Kind == HOST, each attribute on the Mesos agent is transformed to a label, with hostname as a special label which is always inferred from the agent hostname and set. | |
requirement | uint32 | A limit on the number of occurrences of the label. |
OrConstraint represents a logical 'or' of constraints.
Field | Type | Label | Description |
---|---|---|---|
constraints | Constraint | repeated |
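
To make the constraint model concrete, the sketch below evaluates a constraint tree against the labels of one host. It is an illustration only: the function `satisfies`, the dictionary encoding of the messages, and the use of `(key, value)` tuples for labels are assumptions, not the scheduler's actual implementation.

```python
import operator

# Maps each condition to a comparison of (occurrences, requirement).
CONDITIONS = {
    "LABEL_CONSTRAINT_CONDITION_LESS_THAN": operator.lt,
    "LABEL_CONSTRAINT_CONDITION_EQUAL": operator.eq,
    "LABEL_CONSTRAINT_CONDITION_GREATER_THAN": operator.gt,
}


def satisfies(constraint, host_labels, pod_labels):
    """Return True if the host (and the pods on it) satisfy the constraint."""
    ctype = constraint["type"]
    if ctype == "CONSTRAINT_TYPE_LABEL":
        lc = constraint["label_constraint"]
        if lc["kind"] == "LABEL_CONSTRAINT_KIND_HOST":
            pool = host_labels
        elif lc["kind"] == "LABEL_CONSTRAINT_KIND_POD":
            pool = pod_labels
        else:
            raise ValueError("invalid label constraint kind")
        occurrences = pool.count(lc["label"])
        return CONDITIONS[lc["condition"]](occurrences, lc["requirement"])
    if ctype == "CONSTRAINT_TYPE_AND":
        children = constraint["and_constraint"]["constraints"]
        return all(satisfies(c, host_labels, pod_labels) for c in children)
    if ctype == "CONSTRAINT_TYPE_OR":
        children = constraint["or_constraint"]["constraints"]
        return any(satisfies(c, host_labels, pod_labels) for c in children)
    raise ValueError("invalid constraint type")


# Require zone=a on the host.
in_zone_a = {"type": "CONSTRAINT_TYPE_LABEL", "label_constraint": {
    "kind": "LABEL_CONSTRAINT_KIND_HOST",
    "condition": "LABEL_CONSTRAINT_CONDITION_EQUAL",
    "label": ("zone", "a"), "requirement": 1}}
# Anti-affinity: fewer than one pod labeled app=web already on the host.
no_web_pod = {"type": "CONSTRAINT_TYPE_LABEL", "label_constraint": {
    "kind": "LABEL_CONSTRAINT_KIND_POD",
    "condition": "LABEL_CONSTRAINT_CONDITION_LESS_THAN",
    "label": ("app", "web"), "requirement": 1}}
both = {"type": "CONSTRAINT_TYPE_AND",
        "and_constraint": {"constraints": [in_zone_a, no_web_pod]}}
either = {"type": "CONSTRAINT_TYPE_OR",
          "or_constraint": {"constraints": [in_zone_a, no_web_pod]}}

host_labels = [("zone", "a")]
pod_labels = [("app", "web"), ("app", "web")]
```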
Persistent volume configuration for a pod.
Field | Type | Label | Description |
---|---|---|---|
container_path | string | Volume mount path inside container. | |
size_mb | uint32 | Volume size in MB. |
Pod events of a particular run of a job instance.
Field | Type | Label | Description |
---|---|---|---|
pod_id | .peloton.api.v1alpha.peloton.PodID | The current pod ID | |
actual_state | string | Actual state of a pod | |
desired_state | string | Goal State of a pod | |
timestamp | string | The time when the event was created. The time is represented in RFC3339 form with UTC timezone. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The entity version currently used by the pod. | |
desired_version | .peloton.api.v1alpha.peloton.EntityVersion | The desired entity version that should be used by the pod. | |
agent_id | string | The agentID for the pod | |
hostname | string | The host on which the pod is running | |
message | string | Short human friendly message explaining state. | |
reason | string | The short reason for the pod event | |
prev_pod_id | .peloton.api.v1alpha.peloton.PodID | The previous pod ID | |
healthy | string | The health check result of the pod | |
desired_pod_id | .peloton.api.v1alpha.peloton.PodID | The desired pod ID |
Info of a pod in a Job
Field | Type | Label | Description |
---|---|---|---|
spec | PodSpec | Configuration of the pod | |
status | PodStatus | Runtime status of the pod |
Pod configuration for a given job instance. Note: only add string/slice/ptr types to PodConfig directly, due to a limitation of Go reflection in the pod-specific config logic.
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | Name of the pod | |
labels | .peloton.api.v1alpha.peloton.Label | repeated | List of user-defined labels for the pod |
init_containers | ContainerSpec | repeated | List of initialization containers belonging to the pod. These containers are assumed to run to completion and are executed in order prior to containers being started. If any init container fails, the pod is considered to have failed. Init containers cannot be configured to have readiness or liveness health checks. |
containers | ContainerSpec | repeated | List of containers belonging to the pod. These will be started in parallel after init containers terminate. There must be at least one container in a pod. |
constraint | Constraint | Constraint on the attributes of the host or labels on pods on the host that this pod should run on. Use AndConstraint/OrConstraint to compose multiple constraints if necessary. | |
restart_policy | RestartPolicy | Pod restart policy on failures | |
volume | PersistentVolumeSpec | Persistent volume config of the pod. | |
preemption_policy | PreemptionPolicy | Preemption policy of the pod | |
controller | bool | Whether this is a controller pod. A controller is a special batch pod which controls other pods inside a job. For example, the Spark driver pod in a Spark job is a controller pod. | |
kill_grace_period_seconds | uint32 | This is used to set the amount of time between when the executor sends the SIGTERM message to gracefully terminate a pod and when it kills it by sending SIGKILL. If you do not set the grace period duration the default is 30 seconds. | |
revocable | bool | Indicates whether the pod uses slack (revocable) resources rather than physical resources. |
Runtime status of a pod instance in a Job
Field | Type | Label | Description |
---|---|---|---|
state | PodState | Runtime state of the pod | |
pod_id | .peloton.api.v1alpha.peloton.PodID | The current pod ID for this pod | |
start_time | string | The time when the pod starts to run. Will be unset if the pod hasn't started running yet. The time is represented in RFC3339 form with UTC timezone. | |
completion_time | string | The time when the pod is completed. Will be unset if the pod hasn't completed yet. The time is represented in RFC3339 form with UTC timezone. | |
host | string | The name of the host where the pod is running | |
init_containers_status | ContainerStatus | repeated | Status of the init containers. |
containers_status | ContainerStatus | repeated | Status of the containers. |
desired_state | PodState | The desired state of the pod which should be eventually reached by the system. | |
message | string | The message that explains the current state of a pod. | |
reason | string | The reason that explains the current state of a pod. See Mesos TaskStatus.Reason for more details. | |
failure_count | uint32 | The number of times the pod has failed after retries. | |
volume_id | .peloton.api.v1alpha.peloton.VolumeID | persistent volume id | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The entity version currently used by the pod. TODO Avoid leaking job abstractions into public pod APIs. Remove after internal protobuf structures are defined. | |
desired_version | .peloton.api.v1alpha.peloton.EntityVersion | The desired entity version that should be used by the pod. TODO Avoid leaking job abstractions into public pod APIs. Remove after internal protobuf structures are defined. | |
agent_id | .mesos.v1.AgentID | The ID of the Mesos agent on the host where the pod is to be launched. | |
revision | .peloton.api.v1alpha.peloton.Revision | Revision of the current pod status. | |
prev_pod_id | .peloton.api.v1alpha.peloton.PodID | The pod id of the previous pod. | |
resource_usage | PodStatus.ResourceUsageEntry | repeated | The resource usage for this pod. The map key is each resource kind in string format and the map value is the number of unit-seconds of that resource used by the job. Example: if a pod uses 1 CPU and finishes in 10 seconds, this map will contain <"cpu":10> |
desired_pod_id | .peloton.api.v1alpha.peloton.PodID | The desired pod ID for this pod | |
desiredHost | string | The name of the host where the pod should be running on upon restart. It is used for best effort in-place update/restart. |
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | double |
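
The unit-seconds arithmetic from the resource_usage description can be sketched as below. The function `resource_usage` is hypothetical, and the sketch assumes usage is computed as the allocated amount multiplied by the run time, which is consistent with the 1 CPU for 10 seconds example but is not confirmed here as the exact accounting rule.

```python
def resource_usage(allocated: dict, runtime_seconds: float) -> dict:
    """Return unit-seconds per resource kind (assumed: amount * run time)."""
    return {kind: amount * runtime_seconds for kind, amount in allocated.items()}


usage = resource_usage({"cpu": 1.0, "memory": 512.0}, runtime_seconds=10)
```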
Summary information about a pod
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | Name of the pod | |
status | PodStatus | Runtime status of the pod |
Network port configuration for a container
Field | Type | Label | Description |
---|---|---|---|
name | string | Name of the network port, e.g. http, tchannel. Required field. | |
value | uint32 | Static port number if any. If unset, will be dynamically allocated by the scheduler | |
env_name | string | Environment variable name to be exported when running a container for this port. Required field for dynamic port. |
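
The required-field rules for a port can be expressed as a small validation routine. This is a sketch only; `validate_port` and the dictionary representation are hypothetical, and an unset or zero `value` is taken to mean a dynamically allocated port.

```python
def validate_port(port: dict) -> list:
    """Return a list of validation errors for one port configuration."""
    errors = []
    if not port.get("name"):
        errors.append("name is required")
    # A port without a static number is dynamic and needs an env_name.
    if not port.get("value") and not port.get("env_name"):
        errors.append("env_name is required for a dynamic port")
    return errors


static_port = {"name": "http", "value": 8080}
dynamic_port = {"name": "tchannel", "env_name": "TCHANNEL_PORT"}
bad_port = {"value": 0}
```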
Preemption policy for a pod
Field | Type | Label | Description |
---|---|---|---|
kill_on_preempt | bool | This policy defines if the pod should be restarted after it is preempted. If set to true the pod will not be rescheduled after it is preempted. If set to false the pod will be rescheduled. Defaults to false |
QuerySpec specifies the list of query criteria for pods. All indexed fields should be part of this message, and all fields in this message must be indexed.
Field | Type | Label | Description |
---|---|---|---|
pagination | .peloton.api.v1alpha.query.PaginationSpec | The spec of how to do pagination for the query results. | |
pod_states | PodState | repeated | List of pod states to query the pods. Will match all pods if the list is empty. |
names | .peloton.api.v1alpha.peloton.PodName | repeated | List of pod names to query the pods. Will match all names if the list is empty. |
hosts | string | repeated | List of hosts to query the pods. Will match all hosts if the list is empty. |
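
The "empty list matches everything" rule for the filter fields can be sketched as follows. The function `matches` and the pod dictionaries are hypothetical, and the sketch assumes that the individual criteria are combined with a logical AND.

```python
def matches(pod: dict, pod_states=(), names=(), hosts=()) -> bool:
    """Return True if the pod passes every non-empty filter list."""
    return (
        (not pod_states or pod["state"] in pod_states)
        and (not names or pod["name"] in names)
        and (not hosts or pod["host"] in hosts)
    )


pods = [
    {"name": "job-0", "state": "POD_STATE_RUNNING", "host": "host1"},
    {"name": "job-1", "state": "POD_STATE_FAILED", "host": "host2"},
]
running = [p["name"] for p in pods if matches(p, pod_states=["POD_STATE_RUNNING"])]
everything = [p["name"] for p in pods if matches(p)]
```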
Resource configuration for a container.
Field | Type | Label | Description |
---|---|---|---|
cpu_limit | double | CPU limit in number of CPU cores | |
mem_limit_mb | double | Memory limit in MB | |
disk_limit_mb | double | Disk limit in MB | |
fd_limit | uint32 | File descriptor limit | |
gpu_limit | double | GPU limit in number of GPUs |
Restart policy for a pod.
Field | Type | Label | Description |
---|---|---|---|
max_failures | uint32 | Max number of pod failures that can occur before giving up on scheduling retries. There is no backoff for now. The default of 0 means no retry on failures. |
TerminationStatus contains details about termination of a task. It mainly contains Peloton-specific reasons for termination.
Field | Type | Label | Description |
---|---|---|---|
reason | TerminationStatus.Reason | Reason for termination. | |
exit_code | uint32 | If non-zero, exit status when the container terminated. | |
signal | string | Name of signal received by the container when it terminated. |
Name | Number | Description |
---|---|---|
CONSTRAINT_TYPE_INVALID | 0 | Reserved for compatibility. |
CONSTRAINT_TYPE_LABEL | 1 | |
CONSTRAINT_TYPE_AND | 2 | |
CONSTRAINT_TYPE_OR | 3 |
Runtime states of a container in a pod
Name | Number | Description |
---|---|---|
CONTAINER_STATE_INVALID | 0 | Invalid state. |
CONTAINER_STATE_PENDING | 1 | The container has not been created yet |
CONTAINER_STATE_LAUNCHED | 2 | The container has been launched |
CONTAINER_STATE_STARTING | 3 | The container is being started on a host |
CONTAINER_STATE_RUNNING | 4 | The container is running on a host |
CONTAINER_STATE_SUCCEEDED | 5 | The container terminated with an exit code of zero |
CONTAINER_STATE_FAILED | 6 | The container terminated with a non-zero exit code |
CONTAINER_STATE_KILLING | 7 | The container is being killed |
CONTAINER_STATE_KILLED | 8 | Execution of the container was terminated by the system |
Name | Number | Description |
---|---|---|
HEALTH_CHECK_TYPE_UNKNOWN | 0 | Reserved for future compatibility of new types. |
HEALTH_CHECK_TYPE_COMMAND | 1 | Command line based health check |
HEALTH_CHECK_TYPE_HTTP | 2 | HTTP endpoint based health check |
HEALTH_CHECK_TYPE_GRPC | 3 | gRPC based health check |
HealthState is the health check state of a container
Name | Number | Description |
---|---|---|
HEALTH_STATE_INVALID | 0 | Default value. |
HEALTH_STATE_DISABLED | 1 | If the health check config is not enabled in the container config, then the health state is DISABLED. |
HEALTH_STATE_UNKNOWN | 2 | If the health check config is enabled in the container config, but the container has not reported the output of the health check yet, then the health state is UNKNOWN. |
HEALTH_STATE_HEALTHY | 3 | In a Mesos event, if the healthy field is true and the reason field is REASON_TASK_HEALTH_CHECK_STATUS_UPDATED, the health state of the container is HEALTHY. |
HEALTH_STATE_UNHEALTHY | 4 | In a Mesos event, if the healthy field is false and the reason field is REASON_TASK_HEALTH_CHECK_STATUS_UPDATED, the health state of the container is UNHEALTHY. |
Condition represents a constraint on the number of occurrences of the label.
Name | Number | Description |
---|---|---|
LABEL_CONSTRAINT_CONDITION_INVALID | 0 | |
LABEL_CONSTRAINT_CONDITION_LESS_THAN | 1 | |
LABEL_CONSTRAINT_CONDITION_EQUAL | 2 | |
LABEL_CONSTRAINT_CONDITION_GREATER_THAN | 3 |
Kind represents whether the constraint applies to the labels on the host or to the labels of the pods that are located on the host.
Name | Number | Description |
---|---|---|
LABEL_CONSTRAINT_KIND_INVALID | 0 | |
LABEL_CONSTRAINT_KIND_POD | 1 | |
LABEL_CONSTRAINT_KIND_HOST | 2 |
Runtime states of a pod instance
Name | Number | Description |
---|---|---|
POD_STATE_INVALID | 0 | Invalid state. |
POD_STATE_INITIALIZED | 1 | The pod is being initialized |
POD_STATE_PENDING | 2 | The pod is pending and waiting for resources |
POD_STATE_READY | 3 | The pod has been allocated resources and is ready for placement |
POD_STATE_PLACING | 4 | The pod is being placed on a host based on its resource requirements and constraints |
POD_STATE_PLACED | 5 | The pod has been assigned to a host matching the resource requirements and constraints |
POD_STATE_LAUNCHING | 6 | The pod is taken from resmgr to be launched |
POD_STATE_LAUNCHED | 7 | The pod is being launched in Job manager |
POD_STATE_STARTING | 8 | Either init containers are starting/running or the main containers in the pod are being started by Mesos agent |
POD_STATE_RUNNING | 9 | All containers in the pod are running |
POD_STATE_SUCCEEDED | 10 | All containers in the pod terminated with an exit code of zero |
POD_STATE_FAILED | 11 | At least one container in the pod terminated with a non-zero exit code |
POD_STATE_LOST | 12 | The pod is lost |
POD_STATE_KILLING | 13 | The pod is being killed |
POD_STATE_KILLED | 14 | At least one of the containers in the pod was terminated by the system |
POD_STATE_PREEMPTING | 15 | The pod is being preempted by another one on the node |
POD_STATE_DELETED | 16 | The pod is to be deleted after termination |
Reason lists various causes for a task termination
Name | Number | Description |
---|---|---|
TERMINATION_STATUS_REASON_INVALID | 0 | Default value. |
TERMINATION_STATUS_REASON_KILLED_ON_REQUEST | 1 | Task was killed because a stop request was received from a client. |
TERMINATION_STATUS_REASON_FAILED | 2 | Task failed. See also TerminationStatus.exit_code, TerminationStatus.signal and ContainerStatus.message. |
TERMINATION_STATUS_REASON_KILLED_HOST_MAINTENANCE | 3 | Task was killed to put the host into maintenance. |
TERMINATION_STATUS_REASON_PREEMPTED_RESOURCES | 4 | Task was killed to reclaim resources allocated to it. |
Field | Type | Label | Description |
---|---|---|---|
hostname | string | The hostname of the host | |
ip | string | The IP address of the host | |
state | HostState | The current state of the host |
Name | Number | Description |
---|---|---|
HOST_STATE_INVALID | 0 | |
HOST_STATE_UNKNOWN | 1 | Reserved for future compatibility of new states. |
HOST_STATE_UP | 2 | The host is healthy |
HOST_STATE_DRAINING | 3 | The tasks running on the host are being rescheduled. There will be no further placement of tasks on the host |
HOST_STATE_DRAINED | 4 | There are no tasks running on the host and it is ready to be put into maintenance. |
HOST_STATE_DOWN | 5 | The host is in maintenance. |
The max limit of resources that CONTROLLER (see TaskType) tasks can use in this resource pool. This is defined as a percentage of the resource pool's reservation. If undefined, there is no maximum limit for controller tasks, i.e. controller tasks will not be treated differently. For example, if the resource pool's reservation is defined as:
cpu:100 mem:1000 disk:1000 gpu:10
and ControllerLimit = 10, then the maximum resources the controller tasks can use is 10% of the reservation, i.e.
cpu:10 mem:100 disk:100 gpu:1
Field | Type | Label | Description |
---|---|---|---|
max_percent | double |
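
The percentage arithmetic from the ControllerLimit description can be checked with a short sketch. The function `controller_cap` is hypothetical; `None` is used here to represent an undefined limit.

```python
def controller_cap(reservation: dict, max_percent=None):
    """Return the per-resource cap for controller tasks, or None if unlimited."""
    if max_percent is None:
        return None  # undefined limit: controller tasks are not capped
    return {kind: amount * max_percent / 100 for kind, amount in reservation.items()}


reservation = {"cpu": 100, "mem": 1000, "disk": 1000, "gpu": 10}
cap = controller_cap(reservation, max_percent=10)
```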
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | Resource Pool Id | |
spec | ResourcePoolSpec | ResourcePool spec | |
parent | .peloton.api.v1alpha.peloton.ResourcePoolID | Resource Pool's parent TODO: parent duplicated from ResourcePoolConfig | |
children | .peloton.api.v1alpha.peloton.ResourcePoolID | repeated | Resource Pool's children |
usages | ResourceUsage | repeated | Resource usage for each resource kind |
path | ResourcePoolPath | Resource Pool Path |
A fully qualified path to a resource pool in a resource pool hierarchy. The path to a resource pool can be defined as an absolute path, starting from the root node and separated by a slash.
The resource hierarchy is anchored at a node called the root, designated by a slash "/".
For the resource hierarchy below, the "compute" resource pool would be designated by the path /infrastructure/compute:

    root
    ├─ infrastructure
    │  └─ compute
    └─ marketplace

Field | Type | Label | Description |
---|---|---|---|
value | string |
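
A minimal sketch of how such a path could be split and resolved against a hierarchy. The functions `split_pool_path` and `find_pool`, and the nested-dictionary model of the hierarchy, are hypothetical illustrations.

```python
def split_pool_path(path: str) -> list:
    """Split an absolute resource pool path into its pool names."""
    if not path.startswith("/"):
        raise ValueError("resource pool paths must be absolute")
    return [part for part in path.split("/") if part]


def find_pool(tree: dict, path: str) -> dict:
    """Walk the hierarchy from the root; the root itself is '/'."""
    node = tree
    for name in split_pool_path(path):
        node = node[name]  # raises KeyError for an unknown pool
    return node


# The example hierarchy from the description above.
hierarchy = {"infrastructure": {"compute": {}}, "marketplace": {}}
```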
Resource Pool configuration
Field | Type | Label | Description |
---|---|---|---|
revision | .peloton.api.v1alpha.peloton.Revision | Revision of the Resource Pool config | |
name | string | Name of the resource pool | |
owning_team | string | Owning team of the pool | |
ldap_groups | string | repeated | LDAP groups of the pool |
description | string | Description of the resource pool | |
resources | ResourceSpec | repeated | Resource config of the Resource Pool |
parent | .peloton.api.v1alpha.peloton.ResourcePoolID | Resource Pool's parent | |
policy | SchedulingPolicy | Task Scheduling policy | |
controller_limit | ControllerLimit | The controller limit for this resource pool | |
slack_limit | SlackLimit | Cap on the max non-slack resources (mem, disk), as a percentage, that can be used by revocable tasks. |
Resource configuration for a resource
Field | Type | Label | Description |
---|---|---|---|
kind | string | Type of the resource | |
reservation | double | Reservation/min of the resource | |
limit | double | Limit of the resource | |
share | double | Share on the resource pool | |
type | ReservationType | ReservationType indicates the type of reservation. There are two kinds of reservation: 1. ELASTIC 2. STATIC |
Field | Type | Label | Description |
---|---|---|---|
kind | string | Type of the resource | |
allocation | double | Allocation of the resource | |
slack | double | Slack is the resource which is allocated but not used; Mesos will give those resources as revocable offers. |
The max limit of resources that REVOCABLE (see TaskType) tasks can use in this resource pool. This is defined as a percentage of the resource pool's reservation. If undefined, there is no maximum limit for revocable tasks, i.e. revocable tasks will not be treated differently. For example, if the resource pool's reservation is defined as:
cpu:100 mem:1000 disk:1000
and SlackLimit = 10, then the maximum resources the revocable tasks can use is 10% of the reservation, i.e.
mem:100 disk:100
For cpu, it will use revocable resources.
Field | Type | Label | Description |
---|---|---|---|
maxPercent | double |
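
The SlackLimit example differs from a plain percentage in that only mem and disk are capped, while cpu comes from revocable resources. A sketch under that reading, with the hypothetical function name `revocable_cap`:

```python
# Resource kinds capped by the slack limit, per the example above.
CAPPED_KINDS = ("mem", "disk")


def revocable_cap(reservation: dict, max_percent: float) -> dict:
    """Return the cap for revocable tasks on the capped resource kinds only."""
    return {
        kind: reservation[kind] * max_percent / 100
        for kind in CAPPED_KINDS
        if kind in reservation
    }


slack_cap = revocable_cap({"cpu": 100, "mem": 1000, "disk": 1000}, max_percent=10)
```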
ReservationType indicates the reservation type for the resource pool.
Name | Number | Description |
---|---|---|
RESERVATION_TYPE_INVALID | 0 | |
RESERVATION_TYPE_ELASTIC | 1 | ELASTIC reservation enables a resource pool to be elastic in its reservation, which means other resource pools can take resources from this resource pool, and this resource pool can also take resources from any other resource pool. This is the default behavior for a resource pool. |
RESERVATION_TYPE_STATIC | 2 | STATIC reservation enables a resource pool to be static in its reservation, which means that irrespective of demand this resource pool will have at least its reservation as its entitlement value. No other resource pool can take resources from this resource pool. If demand for this resource pool is high, it can take resources from other resource pools. The default reservation type is ELASTIC. |
Scheduling policy for Resource Pool.
Name | Number | Description |
---|---|---|
SCHEDULING_POLICY_INVALID | 0 | |
SCHEDULING_POLICY_PRIORITY_FIFO | 1 | This scheduling policy returns the item with the highest priority, in FIFO order |
PodFilter specifies the pod(s) to watch. Watch on pods is restricted to a single job.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The JobID of the pods that will be monitored. Mandatory. | |
pod_names | .peloton.api.v1alpha.peloton.PodName | repeated | Names of the pods to watch. If empty, all pods in the job will be monitored. |
StatelessJobFilter specifies the job(s) to watch.
Field | Type | Label | Description |
---|---|---|---|
job_ids | .peloton.api.v1alpha.peloton.JobID | repeated | The IDs of the jobs to watch. If unset, all jobs will be monitored. |
Persistent volume information.
Field | Type | Label | Description |
---|---|---|---|
volume_id | .peloton.api.v1alpha.peloton.VolumeID | ID of the persistent volume. | |
pod_name | .peloton.api.v1alpha.peloton.PodName | ID of the pod that owns the volume. | |
hostname | string | Hostname of the persisted volume. | |
state | VolumeState | Current state of the volume. | |
desired_state | VolumeState | Goal state of the volume. | |
size_mb | uint32 | Volume size in MB. | |
container_path | string | Volume mount path inside container. | |
create_time | string | Volume creation time. | |
update_time | string | Volume info last update time. |
States of a persistent volume
Name | Number | Description |
---|---|---|
VOLUME_STATE_INVALID | 0 | Reserved for future compatibility of new states. |
VOLUME_STATE_INITIALIZED | 1 | The persistent volume is being initialized. |
VOLUME_STATE_CREATED | 2 | The persistent volume is created successfully. |
VOLUME_STATE_DELETED | 3 | The persistent volume is deleted. |
Request message for PodService.BrowsePodSandbox method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. | |
pod_id | .peloton.api.v1alpha.peloton.PodID | Get the sandbox path of a particular pod identified using the pod identifier. If not provided, the sandbox path for the latest pod is returned. |
Response message for PodService.BrowsePodSandbox method Return errors: NOT_FOUND: if the pod is not found. ABORT: if the pod has not been run.
Field | Type | Label | Description |
---|---|---|---|
hostname | string | The hostname of the sandbox. | |
port | string | The port of the sandbox. | |
paths | string | repeated | The list of sandbox file paths. TODO: distinguish files and directories in the sandbox |
mesos_master_hostname | string | Mesos Master hostname and port. | |
mesos_master_port | string |
Request message for PodService.DeletePodEvents method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. | |
pod_id | .peloton.api.v1alpha.peloton.PodID | Delete the events of a particular pod identified using the pod identifier. |
Response message for PodService.DeletePodEvents method Return errors: NOT_FOUND: if the pod is not found.
Request message for PodService.GetPodCache method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. |
Response message for PodService.GetPodCache method Return errors: NOT_FOUND: if the pod is not found.
Field | Type | Label | Description |
---|---|---|---|
status | .peloton.api.v1alpha.pod.PodStatus | The runtime status of the pod. |
Request message for PodService.GetPodEvents method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. | |
pod_id | .peloton.api.v1alpha.peloton.PodID | Get the events of a particular pod identified using the pod identifier. If not provided, events for the latest pod are returned. |
Response message for PodService.GetPodEvents method Return errors: NOT_FOUND: if the pod is not found.
Field | Type | Label | Description |
---|---|---|---|
events | .peloton.api.v1alpha.pod.PodEvent | repeated |
Request message for PodService.GetPod method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. | |
status_only | bool | If set to true, only return the pod status and not the configuration. |
Response message for PodService.GetPod method Return errors: NOT_FOUND: if the pod is not found.
Field | Type | Label | Description |
---|---|---|---|
current | .peloton.api.v1alpha.pod.PodInfo | Returns the status and configuration (if requested) for the current run of the pod. | |
previous | .peloton.api.v1alpha.pod.PodInfo | repeated | Returns the status and configuration (if requested) for previous runs of the pod. |
Request message for PodService.RefreshPod method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. |
Response message for PodService.RefreshPod method Return errors: NOT_FOUND: if the pod is not found.
Request message for PodService.RestartPod method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. |
Response message for PodService.RestartPod method Return errors: NOT_FOUND: if the pod is not found.
Request message for PodService.StartPod method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. |
Response message for PodService.StartPod method Return errors: NOT_FOUND: if the pod is not found.
Request message for PodService.StopPod method
Field | Type | Label | Description |
---|---|---|---|
pod_name | .peloton.api.v1alpha.peloton.PodName | The pod name. |
Response message for PodService.StopPod method Return errors: NOT_FOUND: if the pod is not found.
Pod service defines the pod related methods.
Methods which mutate the state of the pod.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
StartPod | StartPodRequest | StartPodResponse | Start the pod. Will be a no-op for a pod that is currently running. The pod is started asynchronously after the API call returns. |
StopPod | StopPodRequest | StopPodResponse | Stop the pod. Will be a no-op for a pod that is currently stopped. The pod is stopped asynchronously after the API call returns. |
RestartPod | RestartPodRequest | RestartPodResponse | Restart the pod. Will start a pod that is currently stopped. Will first stop a pod that is currently running and then start it again. This is an asynchronous call. |
GetPod | GetPodRequest | GetPodResponse | Get the info of a pod in a job. Return the current run as well as the terminal state of previous runs. |
GetPodEvents | GetPodEventsRequest | GetPodEventsResponse | Get the state transitions for a pod (pod events) for a given run of the pod. |
BrowsePodSandbox | BrowsePodSandboxRequest | BrowsePodSandboxResponse | Return the list of file paths inside the sandbox for a given run of a pod. The client can use the Mesos Agent HTTP endpoints to read and download the files. http://mesos.apache.org/documentation/latest/endpoints |
RefreshPod | RefreshPodRequest | RefreshPodResponse | Allows user to load pod runtime state from DB and re-execute the action associated with current state. |
GetPodCache | GetPodCacheRequest | GetPodCacheResponse | Get the cache of a pod stored in Peloton. |
DeletePodEvents | DeletePodEventsRequest | DeletePodEventsResponse | Delete the events of a given run of a pod. This is used to prevent the events for a given pod from growing without bounds. |
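The no-op and restart semantics described for StartPod, StopPod and RestartPod can be illustrated with a toy model. This is only a sketch: `PodModel` is a hypothetical in-memory stand-in, not a Peloton type, and it ignores the asynchronous execution that the real service performs after the call returns.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PodModel:
    """Hypothetical stand-in that tracks only whether the pod should run."""
    running: bool = False

    def start(self) -> bool:
        """Return True if the call changed anything, False for a no-op."""
        if self.running:
            return False
        self.running = True
        return True

    def stop(self) -> bool:
        """Return True if the call changed anything, False for a no-op."""
        if not self.running:
            return False
        self.running = False
        return True

    def restart(self) -> list[str]:
        """Stop first if running, then start; return the steps taken."""
        steps = []
        if self.stop():
            steps.append("stop")
        self.start()
        steps.append("start")
        return steps


pod = PodModel()
first_start = pod.start()      # changes state
second_start = pod.start()     # no-op, pod is already running
restart_running = pod.restart()
restart_stopped = PodModel().restart()
```

Restarting a stopped pod in this model takes only the start step, which matches the table's description that RestartPod will start a pod that is currently stopped.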
Request message for HostService.CompleteMaintenance method.
Field | Type | Label | Description |
---|---|---|---|
hostnames | string | repeated | List of hosts to be brought back up |
Response message for HostService.CompleteMaintenance method. Return errors: NOT_FOUND: if the hosts are not found.
Request message for HostService.QueryHosts method.
Field | Type | Label | Description |
---|---|---|---|
host_states | .peloton.api.v1alpha.host.HostState | repeated | List of host states to query the hosts. Will return all hosts if the list is empty. |
Response message for HostService.QueryHosts method. Return errors:
Field | Type | Label | Description |
---|---|---|---|
host_infos | .peloton.api.v1alpha.host.HostInfo | repeated | List of hosts that match the host query criteria. |
Request message for HostService.StartMaintenance method.
Field | Type | Label | Description |
---|---|---|---|
hostnames | string | repeated | List of hosts to be put into maintenance |
Response message for HostService.StartMaintenance method. Return errors: NOT_FOUND: if the hosts are not found.
HostService defines the host related methods such as query hosts, start maintenance, complete maintenance etc.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
QueryHosts | QueryHostsRequest | QueryHostsResponse | Get hosts which are in one of the specified states |
StartMaintenance | StartMaintenanceRequest | StartMaintenanceResponse | Start maintenance on the specified hosts |
CompleteMaintenance | CompleteMaintenanceRequest | CompleteMaintenanceResponse | Complete maintenance on the specified hosts |
Request message for ResourcePoolService.CreateResourcePool method.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The unique resource pool UUID specified by the client. This can be used by the client to re-create a failed resource pool without the side-effect of creating duplicated resource pool. If unset, the server will create a new UUID for the resource pool. | |
spec | ResourcePoolSpec | The detailed configuration of the resource pool to be created. |
Response message for ResourcePoolService.CreateResourcePool method. Return errors: ALREADY_EXISTS: if the resource pool already exists. INVALID_ARGUMENT: if the resource pool config is invalid.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The ID of the newly created resource pool. |
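A client-supplied `respool_id` makes creation safe to retry. The sketch below shows the idea against an in-memory stand-in; `FakeRespoolServer`, `AlreadyExists` and `create_once` are hypothetical names invented for illustration and are not part of the Peloton client.

```python
import uuid


class AlreadyExists(Exception):
    """Stands in for the ALREADY_EXISTS error of CreateResourcePool."""


class FakeRespoolServer:
    """Hypothetical in-memory server used only for this illustration."""

    def __init__(self):
        self.pools = {}

    def create(self, spec, respool_id=None):
        # If the ID is unset, the server creates a new UUID.
        respool_id = respool_id or str(uuid.uuid4())
        if respool_id in self.pools:
            raise AlreadyExists(respool_id)
        self.pools[respool_id] = spec
        return respool_id


def create_once(server, spec, respool_id):
    """Retry-safe create: repeating the call never makes a second pool."""
    try:
        return server.create(spec, respool_id)
    except AlreadyExists:
        # An earlier attempt with the same ID already succeeded.
        return respool_id


server = FakeRespoolServer()
pool_id = str(uuid.uuid4())
first = create_once(server, {"name": "batch"}, pool_id)
second = create_once(server, {"name": "batch"}, pool_id)
```

Without a client-chosen ID, a retry after a lost response would create a duplicate pool under a fresh server-generated UUID.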
Request message for ResourcePoolService.DeleteResourcePool method.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The ID of the resource pool to be deleted. |
Response message for ResourcePoolService.DeleteResourcePool method. Return errors: NOT_FOUND: if the resource pool is not found. INVALID_ARGUMENT: if the resource pool is not a leaf node. FAILED_PRECONDITION: if the resource pool is busy. INTERNAL: if the resource pool fails to delete due to internal errors.
Request message for ResourcePoolService.GetResourcePool method.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The ID of the resource pool to get the detailed information. | |
include_child_pools | bool | Whether or not to include the resource pool info of the direct children |
Response message for ResourcePoolService.GetResourcePool method. Return errors: NOT_FOUND: if the resource pool is not found.
Field | Type | Label | Description |
---|---|---|---|
respool | ResourcePoolInfo | The detailed information of the resource pool. | |
child_respools | ResourcePoolInfo | repeated | The list of child resource pools. |
Request message for ResourcePoolService.LookupResourcePoolID method.
Field | Type | Label | Description |
---|---|---|---|
path | ResourcePoolPath | The resource pool path to look up the resource pool ID. |
Response message for ResourcePoolService.LookupResourcePoolID method. Return errors: NOT_FOUND: if the resource pool is not found. INVALID_ARGUMENT: if the resource pool path is invalid.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The resource pool ID for the given resource pool path. |
Request message for ResourcePoolService.QueryResourcePools method.
TODO Filters
Response message for ResourcePoolService.QueryResourcePools method. Return errors:
Field | Type | Label | Description |
---|---|---|---|
respools | ResourcePoolInfo | repeated |
Request message for ResourcePoolService.UpdateResourcePool method.
Field | Type | Label | Description |
---|---|---|---|
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | The ID of the resource pool to update the configuration. | |
spec | ResourcePoolSpec | The configuration of the resource pool to be updated. |
Response message for ResourcePoolService.UpdateResourcePool method. Return errors: NOT_FOUND: if the resource pool is not found.
ResourcePoolService defines the resource pool related methods such as create, get, delete and update resource pools.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
CreateResourcePool | CreateResourcePoolRequest | CreateResourcePoolResponse | Create a resource pool entity for a given config |
GetResourcePool | GetResourcePoolRequest | GetResourcePoolResponse | Get the resource pool entity |
DeleteResourcePool | DeleteResourcePoolRequest | DeleteResourcePoolResponse | Delete a resource pool entity |
UpdateResourcePool | UpdateResourcePoolRequest | UpdateResourcePoolResponse | Modify a resource pool entity |
LookupResourcePoolID | LookupResourcePoolIDRequest | LookupResourcePoolIDResponse | Lookup the resource pool ID for a given resource pool path |
QueryResourcePools | QueryResourcePoolsRequest | QueryResourcePoolsResponse | Query the resource pools. |
Configuration of a job creation.
Field | Type | Label | Description |
---|---|---|---|
batch_size | uint32 | Batch size for the creation which controls how many instances may be created at the same time. | |
max_instance_retries | uint32 | Maximum number of times a failing instance will be retried during the creation. If the value is 0, the instance can be retried an unlimited number of times. | |
max_tolerable_instance_failures | uint32 | Maximum number of instance failures before the creation is declared to be failed. If the value is 0, there is no limit for max failure instances and the creation is marked successful even if all of the instances fail. | |
start_paused | bool | If set to true, indicates that the creation should start in the paused state, requiring an explicit resume to roll forward. |
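The special meaning of 0 in `max_instance_retries` and `max_tolerable_instance_failures` is easy to get wrong, so here is a minimal sketch of how the two limits could be interpreted. The helper names are hypothetical, and treating the limit as reached when the failure count equals the threshold is an assumption; the table does not say whether the comparison is inclusive.

```python
def may_retry(retries_so_far: int, max_instance_retries: int) -> bool:
    """A value of 0 means the instance can be retried without limit."""
    return max_instance_retries == 0 or retries_so_far < max_instance_retries


def creation_failed(failed_instances: int,
                    max_tolerable_instance_failures: int) -> bool:
    """A value of 0 means no limit on failed instances."""
    if max_tolerable_instance_failures == 0:
        return False
    # Assumption: reaching the threshold counts as failure.
    return failed_instances >= max_tolerable_instance_failures


unlimited_retry = may_retry(retries_so_far=100, max_instance_retries=0)
exhausted_retry = may_retry(retries_so_far=3, max_instance_retries=3)
never_fails = creation_failed(failed_instances=10,
                              max_tolerable_instance_failures=0)
```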
Information of a job, such as job spec and status
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | Job ID | |
spec | JobSpec | Job configuration | |
status | JobStatus | Job runtime status |
Stateless job configuration.
Field | Type | Label | Description |
---|---|---|---|
revision | .peloton.api.v1alpha.peloton.Revision | Revision of the job config | |
name | string | Name of the job | |
owner | string | Owner of the job | |
owning_team | string | Owning team of the job | |
ldap_groups | string | repeated | LDAP groups of the job |
description | string | Description of the job | |
labels | .peloton.api.v1alpha.peloton.Label | repeated | List of user-defined labels for the job |
instance_count | uint32 | Number of instances of the job | |
sla | SlaSpec | SLA config of the job | |
default_spec | .peloton.api.v1alpha.pod.PodSpec | Default pod configuration of the job | |
instance_spec | JobSpec.InstanceSpecEntry | repeated | Instance specific pod config which overwrites the default one |
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | Resource Pool ID where this job belongs to |
Field | Type | Label | Description |
---|---|---|---|
key | uint32 | ||
value | .peloton.api.v1alpha.pod.PodSpec |
The current runtime status of a Job.
Field | Type | Label | Description |
---|---|---|---|
revision | .peloton.api.v1alpha.peloton.Revision | Revision of the current job status. Version in the revision is incremented every time job status changes. Thus, it can be used to order the different job status updates. | |
state | JobState | State of the job | |
creation_time | string | The time when the job was created. The time is represented in RFC3339 form with UTC timezone. | |
pod_stats | JobStatus.PodStatsEntry | repeated | The number of pods grouped by each pod state. The map key is the pod.PodState in string format and the map value is the number of tasks in the particular state. |
desired_state | JobState | Goal state of the job. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control for all job write APIs. The current job configuration can be fetched based on the current resource version. | |
workflow_status | WorkflowStatus | Status of ongoing update/restart workflow. | |
pod_configuration_version_stats | JobStatus.PodConfigurationVersionStatsEntry | repeated | The number of tasks grouped by which configuration version they are on. The map key is the job configuration version and the map value is the number of tasks using that particular job configuration version. The job configuration version in the map key can be fed as the value of the entity version in the GetJobRequest to fetch the job configuration. |
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | uint32 |
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | uint32 |
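Both map fields of JobStatus are plain string-to-count maps, so they can be summarized with ordinary dictionary code. The sketch below is illustrative only: the helper names, the state keys and the version strings are made up, and real keys follow the pod.PodState and entity version formats used by the service.

```python
def pod_state_fractions(pod_stats):
    """Convert per-state pod counts into fractions of the total."""
    total = sum(pod_stats.values())
    if total == 0:
        return {}
    return {state: count / total for state, count in pod_stats.items()}


def single_config_version(version_stats):
    """Return the version if every counted pod is on it, else None."""
    in_use = [version for version, count in version_stats.items() if count > 0]
    return in_use[0] if len(in_use) == 1 else None


# Hypothetical sample data; the key strings are placeholders.
fractions = pod_state_fractions({"RUNNING": 3, "PENDING": 1})
converged = single_config_version({"v2": 4})
mixed = single_config_version({"v1": 1, "v2": 3})
```

A check like `single_config_version` is one way a client might tell whether an update has moved every pod onto the same configuration version.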
Summary of job spec and status. The summary will be returned by List or Query API calls. These calls will return a large number of jobs, so the content in the job summary has to be kept minimal.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | Job ID | |
name | string | Name of the job | |
owner | string | Owner of the job | |
owning_team | string | Owning team of the job | |
labels | .peloton.api.v1alpha.peloton.Label | repeated | List of user-defined labels for the job |
instance_count | uint32 | Number of instances of the job | |
respool_id | .peloton.api.v1alpha.peloton.ResourcePoolID | Resource Pool ID where this job belongs to | |
status | JobStatus | Job runtime status |
QuerySpec specifies the list of query criteria for jobs. All indexed fields should be part of this message. And all fields in this message have to be indexed too.
Field | Type | Label | Description |
---|---|---|---|
pagination | .peloton.api.v1alpha.query.PaginationSpec | The spec of how to do pagination for the query results. | |
labels | .peloton.api.v1alpha.peloton.Label | repeated | List of labels to query the jobs. Will match all jobs if the list is empty. |
keywords | string | repeated | List of keywords to query the jobs. Will match all jobs if the list is empty. When set, will do a wildcard match on owner, name, labels, description. |
job_states | JobState | repeated | List of job states to query the jobs. Will match all jobs if the list is empty. |
respool | .peloton.api.v1alpha.respool.ResourcePoolPath | The resource pool to query the jobs. Will match jobs from all resource pools if unset. | |
owner | string | Query jobs by owner. This is case sensitive and will look for jobs with owner matching the exact owner string. Will match all jobs if owner is unset. | |
name | string | Query jobs by name. This is case sensitive and will look for jobs with name matching the name string. Will support partial name match. Will match all jobs if name is unset. | |
creation_time_range | .peloton.api.v1alpha.peloton.TimeRange | Query jobs by creation time range. This will look for all jobs that were created within a specified time range. This search will operate based on job creation time. | |
completion_time_range | .peloton.api.v1alpha.peloton.TimeRange | Query jobs by completion time range. This will look for all jobs that were completed within a specified time range. This search will operate based on job completion time. |
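The matching rules for `owner`, `name` and `job_states` can be sketched as a client-side predicate. This is an assumption-laden illustration: `Job`, `Query` and `matches` are hypothetical, the partial name match is modeled as a substring test, and the other QuerySpec fields are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical, reduced view of a job for this illustration."""
    name: str
    owner: str
    state: str


@dataclass
class Query:
    """Subset of QuerySpec; empty values mean "match all jobs"."""
    owner: str = ""
    name: str = ""
    job_states: list = field(default_factory=list)


def matches(job: Job, query: Query) -> bool:
    # Owner: case sensitive, exact match.
    if query.owner and job.owner != query.owner:
        return False
    # Name: case sensitive, partial match (modeled here as a substring).
    if query.name and query.name not in job.name:
        return False
    # States: the job must be in one of the listed states.
    if query.job_states and job.state not in query.job_states:
        return False
    return True


job = Job(name="web-frontend", owner="alice", state="JOB_STATE_RUNNING")
match_all = matches(job, Query())
partial_name = matches(job, Query(name="front"))
wrong_case_owner = matches(job, Query(owner="Alice"))
```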
SLA configuration for a stateless job
Field | Type | Label | Description |
---|---|---|---|
priority | uint32 | Priority of a job. Higher value takes priority over lower value when making scheduling decisions as well as preemption decisions. | |
preemptible | bool | Whether all the job instances are preemptible. If so, they might be scheduled using elastic resources from other resource pools and are subject to preemption when the demands of other resource pools increase. For stateless jobs, this field will overrule preemptible configuration in the pod spec. | |
revocable | bool | Whether all the job instances are revocable. If so, it might be scheduled using revocable resources and subject to preemption when there is resource contention on the host. For stateless jobs, this field will overrule revocable configuration in the pod spec. | |
maximum_unavailable_instances | uint32 | Maximum number of job instances which can be unavailable at a given time. |
Configuration of a job update.
Field | Type | Label | Description |
---|---|---|---|
batch_size | uint32 | Batch size for the update which controls how many instances may be updated at the same time. | |
rollback_on_failure | bool | If configured, the update will be automatically rolled back to the previous job configuration on failure. | |
max_instance_retries | uint32 | Maximum number of times a failing instance will be retried during the update. If the value is 0, the instance can be retried an unlimited number of times. | |
max_tolerable_instance_failures | uint32 | Maximum number of instance failures before the update is declared to be failed. If the value is 0, there is no limit for max failure instances and the update is marked successful even if all of the instances fail. | |
start_paused | bool | If set to true, indicates that the update should start in the paused state, requiring an explicit resume to roll forward. | |
in_place | bool | If set to true, Peloton will try to place the restarted or updated task on the host it previously ran on. This is best effort, with no guarantee of success. |
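To show what `batch_size` controls, the sketch below splits a job's instance IDs into the groups that would be updated at the same time. The `batches` helper is hypothetical, and it rejects non-positive sizes because the table does not say how a `batch_size` of 0 is treated.

```python
def batches(instance_ids, batch_size):
    """Yield lists of at most batch_size instance IDs, in order."""
    if batch_size <= 0:
        # The treatment of 0 is not specified here, so refuse to guess.
        raise ValueError("batch_size must be positive in this sketch")
    ids = list(instance_ids)
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


plan = list(batches(range(5), batch_size=2))
```

With five instances and a batch size of two, the plan has three groups, the last of which holds the single remaining instance.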
WorkflowEvents are workflow state change events for a job or pod on workflow operations
Field | Type | Label | Description |
---|---|---|---|
type | WorkflowType | Workflow type. | |
timestamp | string | Timestamp of the event represented in RFC3339 form with UTC timezone. | |
state | WorkflowState | Current runtime state of the workflow. |
Information about a workflow including its status and specification
Field | Type | Label | Description |
---|---|---|---|
status | WorkflowStatus | Workflow status | |
update_spec | UpdateSpec | Update specification for update workflow | |
restart_batch_size | uint32 | Batch size provided for restart workflow | |
restart_ranges | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instance ranges provided for restart workflow |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client | |
events | WorkflowEvent | repeated | Job workflow events representing update state changes |
instances_added | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances added by update workflow |
instances_removed | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances removed by update workflow |
instances_updated | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances updated by update workflow |
Runtime status of a job workflow.
Field | Type | Label | Description |
---|---|---|---|
type | WorkflowType | Workflow type. | |
state | WorkflowState | Current runtime state of the workflow. | |
num_instances_completed | uint32 | Number of instances completed. | |
num_instances_remaining | uint32 | Number of instances remaining. | |
num_instances_failed | uint32 | Number of instances which failed to come up after the workflow. | |
instances_current | uint32 | repeated | Current instances being operated on. |
version | .peloton.api.v1alpha.peloton.EntityVersion | Job version the workflow moved the job object to. | |
prev_version | .peloton.api.v1alpha.peloton.EntityVersion | Previous job version of the job object. | |
creation_time | string | The time when the workflow was created. The time is represented in RFC3339 form with UTC timezone. | |
update_time | string | The time when the workflow was last updated. The time is represented in RFC3339 form with UTC timezone. | |
prev_state | WorkflowState | Previous runtime state of the workflow. |
Runtime states of a Job.
Name | Number | Description |
---|---|---|
JOB_STATE_INVALID | 0 | Invalid job state. |
JOB_STATE_INITIALIZED | 1 | The job has been initialized and persisted in DB. |
JOB_STATE_PENDING | 2 | All tasks have been created and persisted in DB, but no task is RUNNING yet. |
JOB_STATE_RUNNING | 3 | Any of the tasks in the job is in RUNNING state. |
JOB_STATE_SUCCEEDED | 4 | All tasks in the job are in SUCCEEDED state. |
JOB_STATE_FAILED | 5 | All tasks in the job are in terminated state and one or more tasks is in FAILED state. |
JOB_STATE_KILLED | 6 | All tasks in the job are in terminated state and one or more tasks in the job is killed by the user. |
JOB_STATE_KILLING | 7 | All tasks in the job have been requested to be killed by the user. |
JOB_STATE_UNINITIALIZED | 8 | The job is partially created and is not ready to be scheduled |
JOB_STATE_DELETED | 9 | The job has been deleted. |
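For client code that receives numeric states, the table above can be mirrored as an enumeration. The sketch is not generated from the proto definitions, and the `TERMINAL_STATES` set is an assumption based on the descriptions that mention all tasks having finished.

```python
from enum import IntEnum


class JobState(IntEnum):
    """Names and numbers copied from the JobState table."""
    JOB_STATE_INVALID = 0
    JOB_STATE_INITIALIZED = 1
    JOB_STATE_PENDING = 2
    JOB_STATE_RUNNING = 3
    JOB_STATE_SUCCEEDED = 4
    JOB_STATE_FAILED = 5
    JOB_STATE_KILLED = 6
    JOB_STATE_KILLING = 7
    JOB_STATE_UNINITIALIZED = 8
    JOB_STATE_DELETED = 9


# Assumption: states whose descriptions say all tasks have finished.
TERMINAL_STATES = frozenset({
    JobState.JOB_STATE_SUCCEEDED,
    JobState.JOB_STATE_FAILED,
    JobState.JOB_STATE_KILLED,
})

running_name = JobState(3).name
deleted_number = JobState["JOB_STATE_DELETED"].value
```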
Runtime state of a job workflow.
Name | Number | Description |
---|---|---|
WORKFLOW_STATE_INVALID | 0 | Invalid protobuf value |
WORKFLOW_STATE_INITIALIZED | 1 | The operation has been created but not started yet. |
WORKFLOW_STATE_ROLLING_FORWARD | 2 | The workflow is rolling forward |
WORKFLOW_STATE_PAUSED | 3 | The workflow has been paused |
WORKFLOW_STATE_SUCCEEDED | 4 | The workflow has completed successfully |
WORKFLOW_STATE_ABORTED | 5 | The update was aborted/cancelled |
WORKFLOW_STATE_FAILED | 6 | The workflow has failed to complete. |
WORKFLOW_STATE_ROLLING_BACKWARD | 7 | The update is rolling backward |
WORKFLOW_STATE_ROLLED_BACK | 8 | The update was rolled back due to failure |
The different types of job rolling workflows supported.
Name | Number | Description |
---|---|---|
WORKFLOW_TYPE_INVALID | 0 | Invalid protobuf value. |
WORKFLOW_TYPE_UPDATE | 1 | Job update workflow. |
WORKFLOW_TYPE_RESTART | 2 | Restart pods in a job. |
CancelRequest is the request for method WatchService.Cancel
Field | Type | Label | Description |
---|---|---|---|
watch_id | uint64 | ID of the watch session to cancel. |
CancelResponse is the response for method WatchService.Cancel Return errors: NOT_FOUND: Watch ID not found
WatchRequest is request for method WatchService.Watch. It specifies the objects that should be monitored for changes.
Field | Type | Label | Description |
---|---|---|---|
start_revision | uint64 | The revision from which to start getting changes. If unspecified, the server will return changes after the current revision. The server may choose to maintain only a limited number of historical revisions; a start revision older than the oldest revision available at the server will result in an error and the watch stream will be closed. Note: Initial implementations will not support historical revisions, so if the client sets a value for this field, it will receive an OUT_OF_RANGE error immediately. | |
stateless_job_filter | .peloton.api.v1alpha.watch.StatelessJobFilter | Criteria to select the stateless jobs to watch. If unset, no jobs will be watched. | |
pod_filter | .peloton.api.v1alpha.watch.PodFilter | Criteria to select the pods to watch. If unset, no pods will be watched. |
WatchResponse is the response for method WatchService.Watch. It contains the objects that have changed. Return errors: OUT_OF_RANGE: Requested start-revision is too old. INVALID_ARGUMENT: Requested start-revision is newer than server revision. RESOURCE_EXHAUSTED: Number of concurrent watches exceeded. CANCELLED: Watch cancelled.
Field | Type | Label | Description |
---|---|---|---|
watch_id | uint64 | Unique identifier for the watch session | |
revision | uint64 | Server revision when the response results were created | |
stateless_jobs | .peloton.api.v1alpha.job.stateless.JobSummary | repeated | Stateless jobs that have changed. |
stateless_jobs_not_found | .peloton.api.v1alpha.peloton.JobID | repeated | Stateless job IDs that were not found. |
pods | .peloton.api.v1alpha.pod.PodSummary | repeated | Pods that have changed. |
pods_not_found | .peloton.api.v1alpha.peloton.PodName | repeated | Names of pods that were not found. |
Watch service defines the methods for getting notifications on changes to Peloton objects. A watch is a long-running request where a client specifies the kind of objects that it is interested in as well as a revision, either current or historical. The server continuously streams back changes from that revision until the client cancels the watch (or the connection is lost). The server may support only a limited number of historical revisions to keep the load on the server reasonable. Historical revisions are mainly provided for clients to recover from transient errors without having to rebuild a snapshot of the system (which can be expensive for both sides). Also, implementations may limit the number of concurrent watch requests that can be serviced so that the server is not overloaded.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
Watch | WatchRequest | WatchResponse | Create a watch to get notified about changes to Peloton objects. Changed objects are streamed back to the caller till the watch is cancelled. |
Cancel | CancelRequest | CancelResponse | Cancel a watch. The watch stream will get an error indicating watch was cancelled and the stream will be closed. |
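The recovery pattern that historical revisions are meant to support can be sketched with an in-memory stand-in for the server. Everything here is hypothetical (`FakeWatchServer`, `OutOfRange`, `resume`), the stream is modeled as a finite list rather than a long-running call, and only the OUT_OF_RANGE case is modeled.

```python
class OutOfRange(Exception):
    """Stands in for the OUT_OF_RANGE error of WatchService.Watch."""


class FakeWatchServer:
    """Hypothetical server that keeps a bounded history of changes."""

    def __init__(self, changes, oldest_kept):
        self.changes = dict(changes)    # revision -> changed object
        self.oldest_kept = oldest_kept  # older revisions were discarded

    def watch(self, start_revision):
        """Return (revision, change) pairs from start_revision onward."""
        if start_revision < self.oldest_kept:
            raise OutOfRange(start_revision)
        return [(rev, self.changes[rev])
                for rev in sorted(self.changes) if rev >= start_revision]


def resume(server, last_seen_revision):
    """Continue after a dropped connection, or signal that a full
    snapshot rebuild is needed by returning None."""
    try:
        return server.watch(last_seen_revision + 1)
    except OutOfRange:
        return None


server = FakeWatchServer({5: "a", 6: "b", 7: "c"}, oldest_kept=5)
caught_up = resume(server, last_seen_revision=5)
too_old = resume(server, last_seen_revision=2)
```

Note that, per the WatchRequest description, initial implementations return OUT_OF_RANGE for any client-supplied start revision, so a client should always be prepared for the snapshot-rebuild path.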
Request message for VolumeService.DeleteVolume method.
Field | Type | Label | Description |
---|---|---|---|
volume_id | .peloton.api.v1alpha.peloton.VolumeID | volume id for the delete request. |
Response message for VolumeService.DeleteVolume method. Return errors: NOT_FOUND: if the volume is not found.
Request message for VolumeService.GetVolume method.
Field | Type | Label | Description |
---|---|---|---|
volume_id | .peloton.api.v1alpha.peloton.VolumeID | the volume id. |
Response message for VolumeService.GetVolume method. Return errors: NOT_FOUND: if the volume is not found.
Field | Type | Label | Description |
---|---|---|---|
result | .peloton.api.v1alpha.volume.PersistentVolumeInfo | volume info result. |
Request message for VolumeService.ListVolumes method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | job ID for the volumes. |
Response message for VolumeService.ListVolumes method. Return errors: NOT_FOUND: if the job is not found.
Field | Type | Label | Description |
---|---|---|---|
volumes | ListVolumesResponse.VolumesEntry | repeated | volumes result map from volume uuid to volume info. |
Field | Type | Label | Description |
---|---|---|---|
key | string | ||
value | .peloton.api.v1alpha.volume.PersistentVolumeInfo |
Volume Manager service interface
Method Name | Request Type | Response Type | Description |
---|---|---|---|
ListVolumes | ListVolumesRequest | ListVolumesResponse | List associated volumes for given job. |
GetVolume | GetVolumeRequest | GetVolumeResponse | Get volume data. |
DeleteVolume | DeleteVolumeRequest | DeleteVolumeResponse | Delete a persistent volume. |
Request message for JobService.AbortJobWorkflow method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.AbortJobWorkflow method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
Request message for JobService.CreateJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The unique job UUID specified by the client. This can be used by the client to re-create a deleted job. If unset, the server will create a new UUID for the job for each invocation. | |
spec | .peloton.api.v1alpha.job.stateless.JobSpec | The configuration of the job to be created. | |
secrets | .peloton.api.v1alpha.peloton.Secret | repeated | Experimental: This is a batch feature. The implementation is subject to change (or removal) from stateless. The list of secrets for this job |
create_spec | .peloton.api.v1alpha.job.stateless.CreateSpec | The creation SLA specification. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.CreateJob method. Return errors: ALREADY_EXISTS: if the job ID already exists INVALID_ARGUMENT: if the job ID or job config is invalid. NOT_FOUND: if the resource pool is not found.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID of the newly created job. Will be the same as the one in CreateJobRequest if provided. Otherwise, a new job ID will be generated by the server. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. |
Request message for JobService.DeleteJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job to be deleted. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. | |
force | bool | If set to true, it will force a delete of the job even if it is running. The job will first be stopped and then deleted. This step cannot be undone, and the job cannot be re-created (with the same UUID) until the delete is complete. So, it is recommended not to set force to true. |
Response message for JobService.DeleteJob method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid or job is still running. FAILED_PRECONDITION: if the job has not been stopped before delete.
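The `version` field carried by the job write requests implements optimistic concurrency control: a write succeeds only if the caller's version is still current, and fails with ABORTED otherwise. The sketch below models that with an in-memory store. `FakeJobStore`, `Aborted` and `update_with_retry` are hypothetical, and the version is modeled as an integer even though EntityVersion is an opaque token.

```python
class Aborted(Exception):
    """Stands in for the ABORTED error returned on a version mismatch."""


class FakeJobStore:
    """Hypothetical in-memory job with a version that changes on write."""

    def __init__(self):
        self.version = 1
        self.spec = {"instance_count": 1}

    def get(self):
        return self.version, dict(self.spec)

    def replace(self, expected_version, new_spec):
        # Reject writes that were computed from a stale read.
        if expected_version != self.version:
            raise Aborted("job version is invalid")
        self.spec = dict(new_spec)
        self.version += 1
        return self.version


def update_with_retry(store, mutate, attempts=3):
    """Read, modify and write; re-read and retry if another writer won."""
    for _ in range(attempts):
        version, spec = store.get()
        try:
            return store.replace(version, mutate(spec))
        except Aborted:
            continue
    raise Aborted("gave up after %d attempts" % attempts)


store = FakeJobStore()
new_version = update_with_retry(store, lambda s: {**s, "instance_count": 3})

try:
    store.replace(1, {"instance_count": 9})  # version 1 is now stale
    stale_rejected = False
except Aborted:
    stale_rejected = True
```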
Request message for JobService.GetJobCache method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to look up the job. |
Response message for JobService.GetJobCache method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
spec | .peloton.api.v1alpha.job.stateless.JobSpec | The job configuration in cache of the matching job. | |
status | .peloton.api.v1alpha.job.stateless.JobStatus | The job runtime in cache of the matching job. |
Request message for JobService.GetJobIDFromJobName method.
Field | Type | Label | Description |
---|---|---|---|
job_name | string | Job name to look up the job UUID. |
Response message for JobService.GetJobIDFromJobName method. Return errors: NOT_FOUND: if the job name is not found.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | repeated | The job UUIDs for the job name. Job UUIDs are sorted by descending create timestamp. |
Request message for JobService.GetJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to look up the job. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The version of the job object to fetch. If not provided, then the latest job configuration specification and runtime status are returned. If provided, only the job configuration specification (and no runtime) at a given version is returned. | |
summary_only | bool | If set to true, only return the job summary. |
Response message for JobService.GetJob method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
job_info | .peloton.api.v1alpha.job.stateless.JobInfo | The configuration specification and runtime status of the job. | |
summary | .peloton.api.v1alpha.job.stateless.JobSummary | The job summary. | |
secrets | .peloton.api.v1alpha.peloton.Secret | repeated | The list of secrets for this job, secret.Value will be empty. SecretID and path will be populated, so that caller can identify which secret is associated with this job. |
workflow_info | .peloton.api.v1alpha.job.stateless.WorkflowInfo | Information about the current/last completed workflow including its state and specification. |
Request message for JobService.GetReplaceJobDiff method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to be updated. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. | |
spec | .peloton.api.v1alpha.job.stateless.JobSpec | The new job configuration to be applied. |
Response message for JobService.GetReplaceJobDiff method. Return errors: INVALID_ARGUMENT: if the job ID or job config is invalid. NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
instances_added | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances which are being added |
instances_removed | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances which are being removed |
instances_updated | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances which are being updated |
instances_unchanged | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | Instances which are unchanged |
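
As a rough illustration of how a caller might consume this response, the sketch below counts the instances in each category before deciding whether to call ReplaceJob. It uses plain Python stand-ins rather than the generated Peloton client. The `from_`/`to` field names and the half-open `[from_, to)` interpretation of `InstanceIDRange` are assumptions; check the pod API definition before relying on them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InstanceIDRange:
    """Hypothetical stand-in for pod.InstanceIDRange (assumed half-open: [from_, to))."""
    from_: int
    to: int


@dataclass
class ReplaceJobDiff:
    """Hypothetical stand-in mirroring the four repeated fields of the response."""
    instances_added: List[InstanceIDRange] = field(default_factory=list)
    instances_removed: List[InstanceIDRange] = field(default_factory=list)
    instances_updated: List[InstanceIDRange] = field(default_factory=list)
    instances_unchanged: List[InstanceIDRange] = field(default_factory=list)


def count_instances(ranges: List[InstanceIDRange]) -> int:
    """Total number of instance IDs covered by a list of ranges."""
    return sum(max(0, r.to - r.from_) for r in ranges)


def summarize_diff(diff: ReplaceJobDiff) -> Dict[str, int]:
    """Return the number of instances in each diff category."""
    return {
        "added": count_instances(diff.instances_added),
        "removed": count_instances(diff.instances_removed),
        "updated": count_instances(diff.instances_updated),
        "unchanged": count_instances(diff.instances_unchanged),
    }
```

A caller could, for example, refuse to proceed when `summarize_diff(diff)["removed"]` is unexpectedly large.
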
Request message for JobService.GetWorkflowEvents method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to look up the job. | |
instance_id | uint32 | The instance to get workflow events. |
Response message for JobService.GetWorkflowEvents method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
events | .peloton.api.v1alpha.job.stateless.WorkflowEvent | repeated | Workflow events for the given workflow |
Request message for JobService.ListJobUpdates method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier. |
Response message for JobService.ListJobUpdates method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
workflow_infos | .peloton.api.v1alpha.job.stateless.WorkflowInfo | repeated |
Request message for JobService.ListJobs method.
Response message for JobService.ListJobs method.
Field | Type | Label | Description |
---|---|---|---|
jobs | .peloton.api.v1alpha.job.stateless.JobSummary | repeated | List of all jobs. |
Request message for JobService.ListPods method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier of the pods to list. | |
range | .peloton.api.v1alpha.pod.InstanceIDRange | The instance ID range of the pods to list. If unset, all pods in the job will be returned. |
Response message for JobService.ListPods method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
pods | .peloton.api.v1alpha.pod.PodSummary | repeated | Pod summary for all matching pods. |
Request message for JobService.PatchJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to be updated. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. | |
spec | .peloton.api.v1alpha.job.stateless.JobSpec | The new job configuration to be patched. | |
secrets | .peloton.api.v1alpha.peloton.Secret | repeated | The list of secrets for this job |
update_spec | .peloton.api.v1alpha.job.stateless.UpdateSpec | The update SLA specification. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.PatchJob method. Return errors: INVALID_ARGUMENT: if the job ID or job config is invalid. NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
Request message for JobService.PauseJobWorkflow method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.PauseJobWorkflow method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
Request message for JobService.QueryJobs method.
Field | Type | Label | Description |
---|---|---|---|
spec | .peloton.api.v1alpha.job.stateless.QuerySpec | The spec of query criteria for the jobs. |
Response message for JobService.QueryJobs method. Return errors: INVALID_ARGUMENT: if the resource pool path or job states are invalid.
Field | Type | Label | Description |
---|---|---|---|
records | .peloton.api.v1alpha.job.stateless.JobSummary | repeated | List of jobs that match the job query criteria. |
pagination | .peloton.api.v1alpha.query.Pagination | Pagination result of the job query. | |
spec | .peloton.api.v1alpha.job.stateless.QuerySpec | Return the spec of query criteria from the request. |
Request message for JobService.QueryPods method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier of the pods to query. | |
spec | .peloton.api.v1alpha.pod.QuerySpec | The spec of query criteria for the pods. | |
pagination | .peloton.api.v1alpha.query.PaginationSpec | The spec of how to do pagination for the query results. | |
summary_only | bool | If set to true, only return the pod status and not the configuration. |
Response message for JobService.QueryPods method. Return errors: NOT_FOUND: if the job ID is not found.
Field | Type | Label | Description |
---|---|---|---|
pods | .peloton.api.v1alpha.pod.PodInfo | repeated | List of pods that match the pod query criteria. |
pagination | .peloton.api.v1alpha.query.Pagination | Pagination result of the pod query. |
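
The request and response pagination fields suggest an offset-based loop. The sketch below shows one way such a loop could look. The `offset`, `limit` and `total` field names are assumptions about `PaginationSpec` and `Pagination`, and `make_fake_query_pods` is a hypothetical in-memory stand-in for the real RPC.

```python
from typing import Callable, Dict, Iterator, List


def make_fake_query_pods(all_pods: List[str]) -> Callable[[int, int], Dict]:
    """Build an in-memory stand-in for a QueryPods call (hypothetical helper)."""
    def query_pods(offset: int, limit: int) -> Dict:
        return {
            "pods": all_pods[offset:offset + limit],
            # Assumed Pagination fields: offset, limit, total.
            "pagination": {"offset": offset, "limit": limit, "total": len(all_pods)},
        }
    return query_pods


def iter_all_pods(query_pods: Callable[[int, int], Dict],
                  page_size: int = 100) -> Iterator[str]:
    """Yield every pod by advancing the offset until the reported total is reached."""
    offset = 0
    while True:
        page = query_pods(offset, page_size)
        pods = page["pods"]
        yield from pods
        offset += len(pods)
        # Stop on an empty page as well, so a shrinking result set cannot loop forever.
        if not pods or offset >= page["pagination"]["total"]:
            return
```
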
Request message for JobService.RefreshJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to look up the job. |
Response message for JobService.RefreshJob method. Return errors: NOT_FOUND: if the job ID is not found.
Request message for JobService.ReplaceJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job ID to be updated. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. | |
spec | .peloton.api.v1alpha.job.stateless.JobSpec | The new job configuration to be applied. | |
secrets | .peloton.api.v1alpha.peloton.Secret | repeated | The list of secrets for this job |
update_spec | .peloton.api.v1alpha.job.stateless.UpdateSpec | The update SLA specification. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.ReplaceJob method. Return errors: INVALID_ARGUMENT: if the job ID or job config is invalid. NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
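
The `version` field and the ABORTED error together imply a read-modify-write loop on the client side. The following is a minimal sketch of that pattern, assuming an in-memory `FakeJobService` in place of the real client. The class, its method names, and the use of a counter as the version are all hypothetical; a real `EntityVersion` should be treated as an opaque token.

```python
class AbortedError(Exception):
    """Stands in for the ABORTED error returned on a version mismatch."""


class FakeJobService:
    """In-memory stand-in for JobService; not the real Peloton client."""

    def __init__(self):
        self.version = "1"  # a counter is used here only for illustration
        self.spec = {}

    def get_job(self):
        """Return the current version token and a copy of the spec."""
        return self.version, dict(self.spec)

    def replace_job(self, version, spec):
        """Apply the spec only if the caller holds the current version."""
        if version != self.version:
            raise AbortedError("job version is invalid")
        self.spec = dict(spec)
        self.version = str(int(self.version) + 1)
        return self.version


def replace_with_retry(service, mutate, max_attempts=3):
    """Re-read the job and retry when another writer changed it first."""
    for _ in range(max_attempts):
        version, spec = service.get_job()
        new_spec = mutate(spec)
        try:
            return service.replace_job(version, new_spec)
        except AbortedError:
            continue  # stale version: fetch the latest state and try again
    raise AbortedError("gave up after %d attempts" % max_attempts)
```

Re-reading the job on each attempt matters: retrying with the same stale version would fail every time.
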
Request message for JobService.RestartJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job to restart. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. | |
batch_size | uint32 | Batch size for the restart request which controls how many instances may be restarted at the same time. | |
ranges | .peloton.api.v1alpha.pod.InstanceIDRange | repeated | The pods to restart, default to all. |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.RestartJob method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
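
To make the effect of `batch_size` concrete, the sketch below splits a set of instance IDs into groups that could be restarted together. It only illustrates the bound on concurrency; the order in which Peloton actually picks instances is not specified here, and treating a `batch_size` of 0 as "no limit" is an assumption.

```python
from typing import Iterable, List


def restart_batches(instance_ids: Iterable[int], batch_size: int) -> List[List[int]]:
    """Group instance IDs so that at most batch_size are restarted at once."""
    ids = list(instance_ids)
    if batch_size <= 0:
        # Assumption: an unset or zero batch size means all instances at once.
        return [ids] if ids else []
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
```
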
Request message for JobService.ResumeJobWorkflow method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job identifier. | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. | |
opaque_data | .peloton.api.v1alpha.peloton.OpaqueData | Opaque data supplied by the client |
Response message for JobService.ResumeJobWorkflow method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
Request message for JobService.StartJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job to start | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. |
Response message for JobService.StartJob method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
Request message for JobService.StopJob method.
Field | Type | Label | Description |
---|---|---|---|
job_id | .peloton.api.v1alpha.peloton.JobID | The job to stop | |
version | .peloton.api.v1alpha.peloton.EntityVersion | The current version of the job. It is used to implement optimistic concurrency control. |
Response message for JobService.StopJob method. Return errors: NOT_FOUND: if the job ID is not found. ABORTED: if the job version is invalid.
Field | Type | Label | Description |
---|---|---|---|
version | .peloton.api.v1alpha.peloton.EntityVersion | The new version of the job. |
The job service defines job-related methods, such as those to create, get, query and kill jobs.
Methods which mutate the state of the job.
Method Name | Request Type | Response Type | Description |
---|---|---|---|
CreateJob | CreateJobRequest | CreateJobResponse | Create a new job with the given configuration. |
ReplaceJob | ReplaceJobRequest | ReplaceJobResponse | Replace the configuration of an existing job with the new configuration. The caller is expected to provide the entire job configuration including the fields which are unchanged. |
PatchJob | PatchJobRequest | PatchJobResponse | Patch the configuration of an existing job. The caller is not expected to provide all the configuration fields and can provide only a subset (e.g. only the fields which have changed). This is not supported yet. |
RestartJob | RestartJobRequest | RestartJobResponse | Restart the pods specified in the request. |
PauseJobWorkflow | PauseJobWorkflowRequest | PauseJobWorkflowResponse | Pause the current running workflow. If there is no current running workflow, or the current workflow is already paused, then the method is a no-op. |
ResumeJobWorkflow | ResumeJobWorkflowRequest | ResumeJobWorkflowResponse | Resume the current running workflow. If there is no current running workflow, or the current workflow is not paused, then the method is a no-op. |
AbortJobWorkflow | AbortJobWorkflowRequest | AbortJobWorkflowResponse | Abort the current running workflow. If there is no current running workflow, then the method is a no-op. |
StartJob | StartJobRequest | StartJobResponse | Start the pods specified in the request. |
StopJob | StopJobRequest | StopJobResponse | Stop the pods specified in the request. |
DeleteJob | DeleteJobRequest | DeleteJobResponse | Delete a job and all related state. |
GetJob | GetJobRequest | GetJobResponse | Get the configuration and runtime status of a job. |
GetJobIDFromJobName | GetJobIDFromJobNameRequest | GetJobIDFromJobNameResponse | Get the job UUID from job name. |
GetWorkflowEvents | GetWorkflowEventsRequest | GetWorkflowEventsResponse | Get the events of the current / last completed workflow of a job |
ListPods | ListPodsRequest | ListPodsResponse | List all pods in a job for a given range of pod IDs. |
QueryPods | QueryPodsRequest | QueryPodsResponse | Query pod info in a job using a set of filters. |
QueryJobs | QueryJobsRequest | QueryJobsResponse | Query the jobs using a set of filters. TODO find the appropriate service to put this method in. |
ListJobs | ListJobsRequest | ListJobsResponse | Get summary for all jobs. Results are streamed back to the caller in batches and the stream is closed once all results have been sent. |
ListJobUpdates | ListJobUpdatesRequest | ListJobUpdatesResponse | List all updates (including current and previously completed) for a given job. |
GetReplaceJobDiff | GetReplaceJobDiffRequest | GetReplaceJobDiffResponse | Get the list of instances which will be added/removed/updated if the given job specification is applied via the ReplaceJob API. |
RefreshJob | RefreshJobRequest | RefreshJobResponse | Allows the user to load the job runtime status from the database and re-execute the action associated with the current state. |
GetJobCache | GetJobCacheRequest | GetJobCacheResponse | Get the job state in the cache. |
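
The no-op rules for PauseJobWorkflow, ResumeJobWorkflow and AbortJobWorkflow can be summarized as a small state machine. The sketch below is only a model of the behavior described in the table: the `WorkflowState` names are hypothetical, `None` stands for "no current workflow", and treating a paused workflow as abortable is an assumption.

```python
from enum import Enum
from typing import Optional


class WorkflowState(Enum):
    """Hypothetical workflow states used only for this illustration."""
    RUNNING = "running"
    PAUSED = "paused"
    ABORTED = "aborted"


def pause(state: Optional[WorkflowState]) -> Optional[WorkflowState]:
    """No-op unless there is a workflow that is currently running."""
    return WorkflowState.PAUSED if state is WorkflowState.RUNNING else state


def resume(state: Optional[WorkflowState]) -> Optional[WorkflowState]:
    """No-op unless there is a workflow that is currently paused."""
    return WorkflowState.RUNNING if state is WorkflowState.PAUSED else state


def abort(state: Optional[WorkflowState]) -> Optional[WorkflowState]:
    """No-op when there is no current workflow (assumes paused ones can be aborted)."""
    if state in (WorkflowState.RUNNING, WorkflowState.PAUSED):
        return WorkflowState.ABORTED
    return state
```
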
.proto Type | Notes | C++ Type | Java Type | Python Type |
---|---|---|---|---|
double | | double | double | float |
float | | float | float | float |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long |
uint32 | Uses variable-length encoding. | uint32 | int | int/long |
uint64 | Uses variable-length encoding. | uint64 | long | int/long |
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int |
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long |
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int |
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long |
sfixed32 | Always four bytes. | int32 | int | int |
sfixed64 | Always eight bytes. | int64 | long | int/long |
bool | | bool | boolean | boolean |
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode |
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str |
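
The notes on variable-length encoding in the table above can be checked with a few lines of code. The sketch below implements base-128 varint encoding and the ZigZag mapping used by `sint32`; the function names are illustrative and are not part of any protobuf library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


def zigzag32(n: int) -> int:
    """Map a signed 32-bit value to unsigned so small magnitudes stay small."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def int32_as_varint(n: int) -> bytes:
    """Encode an int32 the way the int32 type does: sign-extended to 64 bits."""
    return encode_varint(n & 0xFFFFFFFFFFFFFFFF)
```

With these helpers, `len(int32_as_varint(-1))` is 10 bytes while `len(encode_varint(zigzag32(-1)))` is 1 byte, which is why `sint32` is preferred for negative values. Likewise, `encode_varint(2**28)` needs 5 bytes, one more than the 4 bytes `fixed32` always uses.
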