This repository has been archived by the owner on Jun 7, 2022. It is now read-only.
(Required) Maximum amount of time in seconds that the job's task is allowed to run. The timer is started once the task transitions to the 'RUNNING' state. If a task terminates with an error and is restarted, the timer starts again from 0.
true when the task should be restarted after being terminated due to runtime limit.
Capacity
This data structure is associated with a service job and specifies the
number of tasks to run (desired).
At any point in time, the condition min <= desired <= max must hold true. The
desired state may be changed by a user, but also may be changed as a result
of an auto-scaling action.
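
The invariant above can be checked before a capacity update is submitted. The following is a minimal sketch; the `Capacity` dataclass and `is_valid_capacity` helper are hypothetical stand-ins for illustration, not part of the Titus API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    """Hypothetical stand-in for the Capacity data structure."""
    min: int
    desired: int
    max: int


def is_valid_capacity(capacity: Capacity) -> bool:
    # The documented invariant: min <= desired <= max.
    return capacity.min <= capacity.desired <= capacity.max
```

For example, `is_valid_capacity(Capacity(min=1, desired=5, max=3))` is false, because the desired size exceeds the maximum.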
Not supported yet. (Optional) An expression combining multiple constraints. For example 'zoneBalance AND serverGroup == "mySG"'. Available operators: <, <=, ==, >, >=, in, like, AND, OR
To clear (unset) the entrypoint of the image, pass a single empty string value: [""] |
| command | string | repeated | (Optional) Additional parameters for the entrypoint defined either here or provided in the container image. To clear (unset) the command of the image, pass a single empty string value: [""] |
| env | Container.EnvEntry | repeated | (Optional) A collection of system environment variables passed to the container. |
| softConstraints | Constraints | | (Optional) Constraints that Titus will prefer to fulfill but are not required. These constraints apply to the whole task. |
| hardConstraints | Constraints | | (Optional) Constraints that have to be met for a task to be scheduled on an agent. These constraints apply to the whole task. |
| experimental | google.protobuf.Any | | (Optional) Experimental features |
| volumeMounts | VolumeMount | repeated | (Optional) An array of VolumeMounts. These VolumeMounts will be mounted in the container, and must reference one of the volumes declared for the Job. See the k8s docs https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#volumemount-v1-core for more technical details. |
(Optional) Size of shared memory /dev/shm. If not set, a default value will be provided. A provided value must be less than or equal to the amount of memory allocated.
To reference an image, a user has to provide an image name and a version. A
user may specify a version either with
a tag value (for example 'latest') or a digest. When submitting a job, a user
should provide either a tag or a digest value only (not both of them).
For example, docker images can be referenced by {name=titus-examples,
tag=latest}. A user could also choose to provide only the digest without a
tag. In this case, the tag value would be empty.
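
The tag-or-digest rule can be expressed as a small client-side check. This is only a sketch: `validate_image_reference` is a hypothetical helper, and the server may apply additional validation that is not described here.

```python
from typing import Optional


def validate_image_reference(name: str,
                             tag: Optional[str] = None,
                             digest: Optional[str] = None) -> None:
    """Raise ValueError unless a name and exactly one of tag/digest is given."""
    if not name:
        raise ValueError("image name is required")
    has_tag = bool(tag)        # an empty tag counts as "not provided"
    has_digest = bool(digest)
    if has_tag == has_digest:
        raise ValueError("provide either a tag or a digest, but not both")
```

With this sketch, `validate_image_reference("titus-examples", tag="latest")` passes, while supplying both a tag and a digest (or neither) raises `ValueError`.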
(Optional) Desired number of tasks to run (min <= desired <= max)
JobChangeNotification
Job event stream consists of two phases. In the first phase, a snapshot of
the current state (a job and its tasks) is
streamed, and it is followed by the SnapshotEnd notification marker. In the
second phase, job/task state updates are sent. When a job is terminated, the
stream completes.
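
A consumer of this stream typically needs to tell the snapshot phase apart from the update phase. The sketch below assumes events are represented as plain strings and that the marker is the literal string "SnapshotEnd"; both are simplifications for illustration, not the actual wire format.

```python
from typing import Iterable, List, Tuple

SNAPSHOT_END = "SnapshotEnd"  # assumed representation of the marker


def split_stream(events: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split an event stream into (snapshot events, update events)."""
    snapshot: List[str] = []
    updates: List[str] = []
    in_snapshot = True
    for event in events:
        if in_snapshot and event == SNAPSHOT_END:
            # The marker itself closes phase one and is not kept.
            in_snapshot = False
            continue
        (snapshot if in_snapshot else updates).append(event)
    return snapshot, updates
```

Everything received before the marker describes the current state, and everything after it is an incremental change.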
movedFromAnotherJob will be true on the first event for the target Job after a task is moved between jobs. task.jobId will be the destination job, and it will include a 'task.movedFromJob' entry in its taskContext map with the source jobId.
(Optional) Mostly relevant for service jobs, but applicable to batch jobs as well. Allows a user to specify their own unique identifier for a job (see JobGroupInfo for more information).
(Optional) Extra containers can be specified to run alongside the main container in a "pod" (similar to k8s pods). Additional containers can be specified in this field, and they will be launched together with the main container, sharing its resources (network/ram/cpu/gpu/etc). Startup ordering happens in the following way: 1. Titus System Services 2. Platform Sidecars (configured below) 3A. extraContainers (this field) 3B. The main container (container field)
(Optional) Array of platform sidecars to launch alongside the task. These platform sidecars are always ordered after Titus System Services, and before any user container (main or extraContainers).
Percentage of containers that can be relocated within a time interval. The
number of containers is determined
during each evaluation, and the number is based on the current desired job
size. If the job size changes, the number of containers that may be relocated changes
accordingly. For example, setting the interval to 60000 (1 minute) and
ratePercentagePerInterval to 5 (5%) would allow only for up to 5% of all
containers to be relocated every minute, given the other criteria are met.
For a job with a desired size of 100, 5 container relocations per minute
would be allowed. If the desired job size changes to 200, the relocation
rate increases to 10 containers per minute.
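
The arithmetic in this example can be written out as follows. This is a sketch only: `relocations_per_interval` is a hypothetical helper, and rounding down for fractional results is an assumption, since the text does not say how fractions are handled.

```python
import math


def relocations_per_interval(desired_size: int, rate_percentage: float) -> int:
    """Containers that may be relocated per interval for a given job size."""
    # Assumption: fractional results are rounded down.
    return math.floor(desired_size * rate_percentage / 100)
```

With a rate of 5%, a desired size of 100 gives 5 relocations per interval, and a desired size of 200 gives 10, matching the example above.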
Self managed task relocation policy for users that would like to
orchestrate custom termination logic.
If the containers are not terminated within the configured amount of time,
the system default migration policy is assumed instead.
Additional information for building a supplementary job identifier, as the
'applicationName' can be shared by
many jobs running at the same time in Titus. By setting 'JobGroupInfo', a
user may create a job id that is guaranteed to be unique across all
currently running Titus jobs. The uniqueness is checked if any of the
attributes in this record is a non-empty string. The full name is built as:
'<application_name>-<stack>-<detail>-<sequence>'.
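
The naming pattern can be illustrated with a short sketch. The `job_group_name` helper is hypothetical, and it assumes all four parts are present; how empty parts are rendered is not specified here, so the sketch simply joins whatever it is given.

```python
def job_group_name(application_name: str, stack: str,
                   detail: str, sequence: str) -> str:
    """Join the parts as <application_name>-<stack>-<detail>-<sequence>."""
    # Assumption: all parts are non-empty; empty-part handling is unspecified.
    return "-".join([application_name, stack, detail, sequence])
```

For example, `job_group_name("myapp", "prod", "api", "v001")` yields `myapp-prod-api-v001`.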
(Optional) Collection of fields and their values for a filter. Available query criteria:
* jobIds - list of comma separated job ids
* taskIds - list of comma separated task ids
* owner - job owner
* applicationName - job application name
* imageName - image name
* imageTag - image tag
* capacityGroup - job assigned capacity group
* jobGroupStack - job group stack
* jobGroupDetail - job group details
* jobGroupSequence - job group sequence
* jobType - job type (batch or service)
* attributes - comma separated job attribute key/value pairs (for example "key1,key2:value2;k3:value3")
* attributes.op - logical 'and' or 'or' operators, which should be applied to multiple attributes specified in the query
* jobState - job state (one)
* taskStates - task states (multiple, comma separated). Empty value is the same as no value set.
* taskStateReasons - reasons associated with task states (multiple, comma separated)
* needsMigration - if set to true, return only jobs with tasks that require migration
(Optional) If set, only field values explicitly specified in this parameter will be returned. This does not include certain attributes like 'jobId', 'appName' which are always returned. If a nested field value is provided, only the explicitly listed nested fields will be returned. For example, the 'tasks.taskId' rule results in only this value being included when encoding the Task entity.
(Optional) An identifier of an event that caused a transition to this state. Each job manager can introduce its own set of reason codes. As of now, there are no common reason codes defined for jobs.
URL address to a container log service. When a container is running, its
stdout/stderr or any other file in the
'/logs' folder can be accessed via this endpoint. The endpoint becomes
unavailable when the container terminates.
A user should provide the 'f' query parameter to specify a file to
download. If the 'f' query parameter is not set, it defaults to 'stdout'.
The file path must be relative to the '/logs' folder.
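
Building a download URL with the 'f' parameter might look like the sketch below. The `log_file_url` helper and the example host are hypothetical; only the 'f' parameter, its 'stdout' default, and the relative-path rule come from the description above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def log_file_url(base_url: str, file_path: str = "stdout") -> str:
    """Append the 'f' query parameter to a container log service URL."""
    if file_path.startswith("/"):
        raise ValueError("file path must be relative to the '/logs' folder")
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("f", file_path))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Passing 'stdout' explicitly, as this sketch does by default, is equivalent to omitting the parameter according to the description above.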
The deadline time that the owner must migrate their task by or the system will automatically do it. This value is irrelevant if 'needsMigration' is set to false and will default to the value '0'.
Sets the overall network mode for all containers for a Task launched by this job
ObserveJobsQuery
The filtering criteria are applied to both Job and Task events. If a criterion
applies to task fields, the stream will include both task events matching it,
and events for jobs with tasks that match it. The opposite is also true,
e.g.: a criterion on applicationName (a job field) will include both job
events matching it, and events for tasks belonging to a job that matches it.
(Optional) Collection of fields and their values for a filter. Available query criteria:
* jobIds - list of comma separated job ids
* taskIds - list of comma separated task ids
* owner - job owner
* applicationName - job application name
* imageName - image name
* imageTag - image tag
* capacityGroup - job assigned capacity group
* jobGroupStack - job group stack
* jobGroupDetail - job group details
* jobGroupSequence - job group sequence
* jobType - job type (batch or service)
* attributes - comma separated job attribute key/value pairs. The same key may occur multiple times, with different values (any value matches the filter). A value may be omitted, in which case if the key occurs only once, only presence of the key is checked, without value comparison (otherwise the value is an empty string). Example filters:
  * 'key1' - matches, if the key is present
  * 'key2:value2' - matches if the attributes contain key 'key2' with value 'value2'
  * 'key3,key3:value3a,key3:value3b' - matches if the attributes contain key 'key3' with value '' or 'value3a' or 'value3b'

  All the above can be passed together as 'key1,key2:value2,key3,key3:value3a,key3:value3b'
* attributes.op - logical 'and' or 'or' operators, which should be applied to multiple attributes specified in the query
* jobState - job state (one)
* taskStates - task states (multiple, comma separated). Empty value is the same as no value set.
* taskStateReasons - reasons associated with task states (multiple, comma separated)
* needsMigration - if set to true, return only jobs with tasks that require migration
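
The matching rules for the 'attributes' criterion are easier to follow in code. The sketch below is one reading of the description above, not the server implementation: `parse_attribute_filter` and `matches` are hypothetical helpers, and the 'and'/'or' operator is passed explicitly because its default is not stated here.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def parse_attribute_filter(expr: str) -> Dict[str, List[Optional[str]]]:
    """Group filter items by key; None marks an item with no value."""
    parsed: Dict[str, List[Optional[str]]] = defaultdict(list)
    for item in expr.split(","):
        if not item:
            continue
        key, sep, value = item.partition(":")
        parsed[key].append(value if sep else None)
    return dict(parsed)


def matches(attributes: Dict[str, str], expr: str, op: str) -> bool:
    """Evaluate the filter against job attributes using 'and' or 'or'."""
    results = []
    for key, values in parse_attribute_filter(expr).items():
        if key not in attributes:
            results.append(False)
        elif values == [None]:
            # Key given once without a value: presence check only.
            results.append(True)
        else:
            # A bare key among several entries stands for the empty string.
            accepted = {"" if v is None else v for v in values}
            results.append(attributes[key] in accepted)
    return all(results) if op == "and" else any(results)
```

Under this reading, the filter 'key3,key3:value3a,key3:value3b' accepts a job whose 'key3' attribute is '', 'value3a', or 'value3b', and rejects any other value.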
(Required) Includes: * agent execution environment: 'agent.region', 'agent.zone', 'agent.host', 'agent.instanceId' * job type specific information: 'task.index', 'task.resubmitOf' (id of task which this task is replacing), 'task.originalId' (id of task which this task is a replacement)
(Optional) If set to true, and this is a terminate and shrink request ('shrink' set to true), reject the kill request if it would cause the job size to go below the current minimum size. Otherwise, the job's minimum size is decremented by 1.
(Optional) Collection of fields and their values for a filter. Available query criteria:
* jobIds - list of comma separated job ids
* taskIds - list of comma separated task ids
* owner - job owner
* applicationName - job application name
* imageName - image name
* imageTag - image tag
* capacityGroup - job assigned capacity group
* jobGroupStack - job group stack
* jobGroupDetail - job group details
* jobGroupSequence - job group sequence
* jobType - job type (batch or service)
* attributes - comma separated job attribute key/value pairs. The same key may occur multiple times, with different values (any value matches the filter). A value may be omitted, in which case if the key occurs only once, only presence of the key is checked, without value comparison (otherwise the value is an empty string). Example filters:
  * 'key1' - matches, if the key is present
  * 'key2:value2' - matches if the attributes contain key 'key2' with value 'value2'
  * 'key3,key3:value3a,key3:value3b' - matches if the attributes contain key 'key3' with value '' or 'value3a' or 'value3b'

  All the above can be passed together as 'key1,key2:value2,key3,key3:value3a,key3:value3b'
* attributes.op - logical 'and' or 'or' operators, which should be applied to multiple attributes specified in the query
* jobState - job state (one)
* taskStates - task states (multiple, comma separated). Empty value is the same as no value set.
* taskStateReasons - reasons associated with task states (multiple, comma separated)
* needsMigration - if set to true, return only tasks that require migration
* skipSystemFailures - a filter for finished tasks only (does not affect non-finished tasks). If set to true, a finished task that failed due to a system error is filtered out. System error codes are specified in the TaskStatus type definition. These are container failures due to Titus internal issues.
(Optional) An identifier of an event that caused a transition to this state. Each job manager can introduce its own set of reason codes. Below is the predefined (common) set of reason codes associated with task state 'Finished':
* 'normal' - task completed with the exit code 0
* 'failed' - task completed with a non zero error code
* 'killed' - task was explicitly terminated by a user
* 'scaledDown' - task was terminated as a result of job scaling down
* 'stuckInState' - task was terminated, as it did not progress to the next state in the expected amount of time
* 'runtimeLimitExceeded' - task was terminated, as its runtime limit was exceeded
* 'lost' - task was lost, and its final status is unknown
* 'invalidRequest' - invalid container definition (security group, image name, etc)
* 'crashed' - container crashed due to some internal system error
* 'transientSystemError' - transient error, not agent specific (for example AWS rate limiting)
* 'localSystemError' - an error scoped to an agent instance on which a container was run. The agent should be quarantined or terminated.
* 'unknownSystemError' - unknown error which cannot be classified either as local/non-local or transient. If there are multiple occurrences of this error, the agent should be quarantined or terminated.
Struct containing image information about the container
JobStatus.JobState
State information associated with a job.

| Name | Number | Description |
| ---- | ------ | ----------- |
| Accepted | 0 | A job is persisted in Titus and is ready to be scheduled. |
| KillInitiated | 1 | A job still has running tasks that were requested to be terminated. No more tasks for this job are deployed. Job policy update operations are not allowed. |
| Finished | 2 | A job has no running tasks, and new tasks cannot be created. Job policy update operations are not allowed. |

NetworkConfiguration.NetworkMode

| Name | Number | Description |
| ---- | ------ | ----------- |
| UnknownNetworkMode | 0 | Unknown, the backend will have to choose a sane default based on other inputs |
| Ipv4Only | 1 | IPv4 only means the task will not get an IPv6 address, and will only get a unique IPv4 address. |
| Ipv6AndIpv4 | 2 | IPv6 And IPv4 (True Dual Stack), each task gets a unique IPv6 and IPv4 address. |
| Ipv6AndIpv4Fallback | 3 | IPv6 and IPv4 Fallback uses the Titus IPv4 "transition mechanism" to give IPv4 connectivity transparently without providing every container its own IPv4 address. From a spinnaker/task perspective, only an IPv6 address is allocated to the task. |
| Ipv6Only | 4 | IPv6 Only is for true believers, no IPv4 connectivity is provided. |
| HighScale | 5 | HighScale is a special mode, which applies opinionated network settings to the workload for maximum network scalability. Enabling this mode removes the option for the user to select which subnets or security groups are used by the workload. Instead, special HighScale subnets and security groups are chosen. |

TaskStatus.ContainerState.ContainerHealth

| Name | Number | Description |
| ---- | ------ | ----------- |
| Unset | 0 | Unset means we haven't gotten any signal yet about healthiness |
| Unhealthy | 1 | Unhealthy means the container is no longer passing its healthcheck |
| Healthy | 2 | Healthy means the container is passing its healthcheck |

TaskStatus.TaskState
State information associated with a task.

| Name | Number | Description |
| ---- | ------ | ----------- |
| Accepted | 0 | A task was passed to the scheduler but has no resources allocated yet. |
| Launched | 1 | A task had resources allocated and was passed to Mesos. |
| StartInitiated | 2 | An executor provisioned resources for a task. |
| Started | 3 | The container was started. |
| KillInitiated | 4 | A user requested the task to be terminated. An executor is stopping the task and releasing its allocated resources. |
| Disconnected | 5 | No connectivity between Mesos and an agent running a task. The task's state cannot be determined until the connection is established again. |
| Finished | 6 | A task completed or was forced by a user to be terminated. All resources previously assigned to this task are released. |

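
Client code that handles task states may find it convenient to mirror the enum locally. The sketch below copies the names and numbers from the TaskStatus.TaskState values listed above into a Python `IntEnum`; the `is_finished` helper is a hypothetical addition, and generated protobuf bindings would normally be used in place of a hand-written enum.

```python
from enum import IntEnum


class TaskState(IntEnum):
    """Local mirror of TaskStatus.TaskState (names and numbers as listed)."""
    Accepted = 0
    Launched = 1
    StartInitiated = 2
    Started = 3
    KillInitiated = 4
    Disconnected = 5
    Finished = 6


def is_finished(state: TaskState) -> bool:
    # Finished is the only state in which all task resources are released.
    return state is TaskState.Finished
```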
Return a collection of jobs matching the given criteria. The query result is limited to the active data set. Finished jobs/tasks are not evaluated when the query is executed.
On subscription, sends the complete job (definition and active tasks). Next, sends distinct job definition or task state change notifications. The stream is closed by the server only when the job is finished, which happens after the 'JobFinished' notification is delivered.
Return a collection of tasks specified in the 'TaskQuery' request matching the given criteria. The query result is limited to the active data set. Finished jobs/tasks are not evaluated when the query is executed.
Move a task from one service job to another. Source and destination jobs must be service jobs, and compatible. Jobs are compatible when their JobDescriptors are identical, ignoring the following values:
* owner
* applicationName
* jobGroupInfo (stack, details, sequence)
* disruptionBudget
* Any attributes not prefixed with titus. or titusParameter.
* Any container.attributes not prefixed with titus. or titusParameter.
* All information specific to service jobs (JobSpec): Capacity, RetryPolicy, MigrationPolicy, etc
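
The compatibility rule can be sketched as "strip the ignored values, then compare". The code below models job descriptors as plain dictionaries. The key names ('attributes', 'container', 'service') and both helpers are assumptions for illustration only, and the real check is performed by Titus.

```python
import copy

KEPT_PREFIXES = ("titus.", "titusParameter.")
# Hypothetical top-level keys; 'service' stands for service-job-specific data.
IGNORED_FIELDS = ("owner", "applicationName", "jobGroupInfo",
                  "disruptionBudget", "service")


def _keep_prefixed(attrs: dict) -> dict:
    # Only attributes with a titus. or titusParameter. prefix are compared.
    return {k: v for k, v in attrs.items() if k.startswith(KEPT_PREFIXES)}


def comparable_view(descriptor: dict) -> dict:
    """Return a copy of the descriptor with all ignored values removed."""
    view = copy.deepcopy(descriptor)
    for field in IGNORED_FIELDS:
        view.pop(field, None)
    view["attributes"] = _keep_prefixed(view.get("attributes", {}))
    container = view.setdefault("container", {})
    container["attributes"] = _keep_prefixed(container.get("attributes", {}))
    return view


def are_compatible(source: dict, destination: dict) -> bool:
    return comparable_view(source) == comparable_view(destination)
```

Two descriptors that differ only in owner or in an unprefixed attribute compare as compatible in this sketch, while a difference in any other value, such as the container image, makes them incompatible.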