What should we do if we want to start a process that performs a small task, such as taking a backup of a database, and then shuts itself down? We could use a regular Pod object for this, but there is one problem with that approach: Kubernetes assumes regular Pods are created to stay alive for a long time. If the Pod completes its task and shuts down, Kubernetes believes something went wrong and might even try to restart the Pod for us [1].
Instead of using a regular Pod we should use a Job. A Job is an abstraction on top of a Pod. It creates a Pod that performs a given task and then shuts down. If we want to perform a given task with a Job at a regular interval, e.g. once per day, we can use a further abstraction called a CronJob. A CronJob creates Jobs according to a schedule set with a cron-expression.
Typical tasks you would perform with Jobs and CronJobs include taking periodic backups, generating reports, exporting data from a source to some destination, or anything else that makes sense to do as a one-off task or a repeated task according to a schedule.
Jobs
A Job is a workload resource similar to a Deployment, but it serves a different purpose. As with any Kubernetes object we can define a Job using a YAML manifest. The manifest for a simple Job that sleeps for 15 seconds, prints the current date and time, and echoes a short message looks like this:
# job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: simple-job
spec:
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command:
        - /bin/sh
        - -c
        - sleep 15; date; echo Hello from Job!
      restartPolicy: Never
The .apiVersion is batch/v1, which indicates that this workload resource belongs to the batch API group. The manifest also has a .kind with the value Job, a .metadata.name that gives this Job a name we can refer to, and a .spec. The .spec.template part contains the specification of a regular Pod. We can also see .spec.template.spec.restartPolicy with a value of Never. This controls what should happen if a container in the Pod fails during execution. For a Job, the valid values for the restart policy are Never and OnFailure.
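To make the difference between the two restart policies more concrete, here is a minimal sketch of the same Job with the restart policy set to OnFailure. With OnFailure a failed container is restarted inside the same Pod, while with Never the Job starts a new Pod for each retry. The file name, the Job name, and the retry count are assumptions made for illustration. The .spec.backoffLimit field caps the number of retries before the Job is marked as failed, and defaults to 6.

```yaml
# job-onfailure.yaml (hypothetical file name, for illustration only)
apiVersion: batch/v1
kind: Job
metadata:
  name: retrying-job          # hypothetical name
spec:
  backoffLimit: 4             # give up after 4 retries (the default is 6)
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command:
        - /bin/sh
        - -c
        - sleep 15; date; echo Hello from Job!
      restartPolicy: OnFailure   # restart the failed container in the same Pod
```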
With my manifest stored in job.yaml I can create the Job object with kubectl apply:
$ kubectl apply -f job.yaml
job.batch/simple-job created
I can list my Jobs with kubectl get jobs:
$ kubectl get jobs
NAME         COMPLETIONS   DURATION   AGE
simple-job   0/1           15s        15s
If I have many Jobs and I only want to see my Job named simple-job I can provide the name to the previous command:
$ kubectl get job simple-job
NAME         COMPLETIONS   DURATION   AGE
simple-job   1/1           20s        26s
Comparing the outputs from the two previous commands shows that the Job completed after 20 seconds, and that the COMPLETIONS counter increased from 0 to 1. The COMPLETIONS counter suggests that we could run many copies of the Job, and that is true. We can specify the number of copies that should run to completion with .spec.completions, and we can control how many copies run in parallel with .spec.parallelism. If we add these settings to our manifest we get this:
# job2.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: less-simple-job
spec:
  completions: 9
  parallelism: 3
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command:
        - /bin/sh
        - -c
        - sleep 15; date; echo Hello from Job!
      restartPolicy: Never
I delete my old Job with kubectl delete:
$ kubectl delete -f job.yaml
job.batch "simple-job" deleted
Then I create my new Job (stored in job2.yaml) with kubectl apply:
$ kubectl apply -f job2.yaml
job.batch/less-simple-job created
I wait for about 20 seconds and then I list my Jobs:
$ kubectl get jobs
NAME              COMPLETIONS   DURATION   AGE
less-simple-job   3/9           23s        23s
Now I can see that three out of nine executions have completed. Three run at the same time because I set .spec.parallelism to 3, and a total of nine executions will complete because that is what I set in .spec.completions.
If I list my Pods I can see that each execution started up a new Pod:
$ kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
less-simple-job-4rskz   0/1     Completed   0          2m35s
less-simple-job-8w9bc   0/1     Completed   0          3m18s
less-simple-job-b7rhb   0/1     Completed   0          3m18s
less-simple-job-bbx4w   0/1     Completed   0          2m55s
less-simple-job-hbtdp   0/1     Completed   0          2m33s
less-simple-job-jz7sw   0/1     Completed   0          2m57s
less-simple-job-mdnp9   0/1     Completed   0          2m37s
less-simple-job-nnzl8   0/1     Completed   0          3m18s
less-simple-job-v8dqb   0/1     Completed   0          2m55s
To make sure that each Pod actually performed the task it was supposed to do, we can check the logs for one of the Pods. I can do this using kubectl logs:
$ kubectl logs less-simple-job-4rskz
Mon Jan 2 18:30:39 UTC 2023
Hello from Job!
The Pods have a STATUS of Completed. If I delete my Job object then the corresponding Pods will be deleted as well.
CronJobs
Jobs are limited in the sense that they run once (or however many times we specify in .spec.completions), but after that they stop and we need to recreate the Job if we want it to run again. For Jobs that should be repeated according to some schedule we can instead use the CronJob resource.
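A CronJob follows a schedule that we define using a cron-expression [2]. A cron-expression is a string with five fields. The following image illustrates the components of a cron-expression [3]:
[Figure: the five fields of a cron-expression, from left to right: minute, hour, day of month, month, day of week]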
You can also use a few convenient expressions such as:
@yearly - run once a year at midnight on January 1 (equivalent to 0 0 1 1 *)
@monthly - run once a month at midnight on the first day of the month (equivalent to 0 0 1 * *)
@weekly - run once a week at midnight on Sunday (equivalent to 0 0 * * 0)
@daily - run once a day at midnight (equivalent to 0 0 * * *)
@hourly - run once every hour at the start of the hour (equivalent to 0 * * * *)
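As a small sketch of how one of these shorthand expressions could be used, the manifest below schedules a CronJob with @daily. The file name and the CronJob name are hypothetical, and the commented-out alternatives are only there to show the five-field form next to the shorthand.

```yaml
# daily-cronjob.yaml (hypothetical file name, for illustration only)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-cronjob         # hypothetical name
spec:
  # Field order: minute, hour, day of month, month, day of week
  schedule: "@daily"          # shorthand for "0 0 * * *"
  # schedule: "0 3 * * 1-5"   # alternative: 03:00, Monday through Friday
  # schedule: "5 14 * * 4"    # alternative: 14:05 every Thursday
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the daily CronJob!
          restartPolicy: OnFailure
```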
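Let’s say we want to define a cron-expression for a CronJob that should run at 14:05 every day in every other month (January, March, May, …). Reading the five fields from left to right gives 5 14 * 1/2 *: minute 5, hour 14, any day of the month, every second month starting with January, and any day of the week. To run only on Thursdays instead, the last field would be 4 rather than *.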
An example manifest of a CronJob using the cron-expression we just built looks like this:
# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: simple-cronjob
spec:
  schedule: "5 14 * 1/2 *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command:
            - /bin/sh
            - -c
            - sleep 15; date; echo Hello from CronJob!
          restartPolicy: OnFailure
With this manifest stored in cronjob.yaml I can create the CronJob with kubectl apply:
$ kubectl apply -f cronjob.yaml
cronjob.batch/simple-cronjob created
I can list all the CronJobs I have with kubectl get cronjobs:
$ kubectl get cronjobs
NAME             SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
simple-cronjob   5 14 * 1/2 *   False     0        <none>          11s
I can shorten the previous command by using the short form of cronjobs, which is cj. So the previous command can be written as kubectl get cj.
Unfortunately this cron-expression is not the best for illustrative purposes. Let me instead re-create the CronJob with the following cron-expression: */2 * * * *. This cron-expression says to run a Job every two minutes. I update my CronJob and let it run for a few minutes. If I now list my Jobs I see the following:
$ kubectl get jobs
NAME                      COMPLETIONS   DURATION   AGE
simple-cronjob-27877731   1/1           20s        4m24s
simple-cronjob-27877733   1/1           20s        2m24s
simple-cronjob-27877735   1/1           20s        24s
I see that several Jobs have been created, two minutes apart. I can also list my Pods and see a similar result:
$ kubectl get pods
NAME                            READY   STATUS      RESTARTS   AGE
simple-cronjob-27877731-fn7hq   0/1     Completed   0          4m32s
simple-cronjob-27877733-mznj2   0/1     Completed   0          2m32s
simple-cronjob-27877735-9ll6f   0/1     Completed   0          32s
What happens after a few days, will my lists of Jobs and Pods be extremely long? Not really. In the manifest for a CronJob we can specify .spec.successfulJobsHistoryLimit with the number of successfully completed Jobs we wish to keep. The default value is three, so unless we change it no more than three successfully completed Jobs are kept in the list of Jobs.
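A minimal sketch of how the history limits could be set, using the two-minute schedule from earlier. The values are arbitrary examples chosen for illustration. The .spec.failedJobsHistoryLimit field is the companion setting for failed Jobs and defaults to 1.

```yaml
# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: simple-cronjob
spec:
  schedule: "*/2 * * * *"
  successfulJobsHistoryLimit: 5   # keep the five most recent successful Jobs (default 3)
  failedJobsHistoryLimit: 2       # keep the two most recent failed Jobs (default 1)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command:
            - /bin/sh
            - -c
            - sleep 15; date; echo Hello from CronJob!
          restartPolicy: OnFailure
```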
For regular Jobs we can also add .spec.ttlSecondsAfterFinished, which specifies for how many seconds the Job should be kept after it finishes. After that time has passed the Job is deleted automatically.
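As a sketch, a Job with a time-to-live could look like the manifest below. The file name, the Job name, and the value of 120 seconds are assumptions made for illustration. When the time-to-live expires the Job is removed together with the Pods it created.

```yaml
# job-ttl.yaml (hypothetical file name, for illustration only)
apiVersion: batch/v1
kind: Job
metadata:
  name: short-lived-job          # hypothetical name
spec:
  ttlSecondsAfterFinished: 120   # delete the Job 120 seconds after it finishes
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command:
        - /bin/sh
        - -c
        - sleep 15; date; echo Hello from Job!
      restartPolicy: Never
```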
Summary
In this article we saw Jobs and CronJobs. We learned what Jobs and CronJobs are for and how we can create them using Kubernetes manifests. We learned about cron-expressions that are used to define the schedule for when a CronJob is run. We also learned how we can run a Job a number of times and control how many runs are performed in parallel. Finally we saw how to control how many Jobs are retained in the history for a CronJob, as well as how to specify a time-to-live for regular Jobs after completion.
In the next article we will revisit the topic of Pods and look at some operational aspects related to them. Specifically, we will see what readiness-probes and liveness-probes are, how to fetch logs from containers, a bit more about how to use kubectl for interacting with Pods, and finally a few more details in the Pod manifest that are of interest.
[1] Whether it tries to restart the Pod or not depends on the specific configuration of the Pod. Also, if the Pod is created via a Deployment with a given replica count, then the Deployment will keep restarting Pods for us to keep the desired number of replicas running.
[2] I recommend crontab.guru as an aid for defining cron-expressions.
[3] In the illustration a star * looks like a filled circle. This is an unfortunate consequence of my drawing tool GoAT. However, I currently do not wish to use any other tool for drawing diagrams! We’ll have to live with this issue for now.