Zero-downtime deployment in Kubernetes is a process that allows updates to take place without interrupting the service. It is achieved by incrementally replacing the current Pods with new ones, and is usually called a rolling update.
Rolling Updates: By default, Kubernetes Deployments roll out pod version updates with a rolling update strategy. This strategy aims to prevent application downtime by keeping at least some instances up and running at any point in time while performing the updates. Old pods are only shut down after new pods of the new deployment version have started up and become ready to handle traffic.
Depending on the workload and available compute resources, we can configure how many instances to over- or under-provision at any time. For example, given three desired replicas, we could create three new pods immediately and wait for all of them to start up, terminate all old pods except one, or do the transition one by one.
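These trade-offs map directly onto the `maxSurge` and `maxUnavailable` fields of the rolling update strategy. Below is a minimal sketch for the three-replica example; the values here are chosen for illustration, not taken from the sample project:

```yaml
# For 3 desired replicas:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 3        # over-provision: start all 3 new pods at once,
    maxUnavailable: 0  # while keeping every old pod until a new one is ready
    # maxSurge: 0, maxUnavailable: 2 would instead tear down to one old pod first;
    # maxSurge: 1, maxUnavailable: 0 replaces pods one by one.
```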
To let Kubernetes know whether the pods are ready to handle traffic or need to be restarted, we have to configure the Readiness Probe and the Liveness Probe. Kubernetes will only route client traffic to pods with a healthy Readiness Probe, while a failing Liveness Probe causes the container to be restarted.
The pods have to expose an endpoint that Kubernetes can use to check their health status. In Spring Boot 3.x Actuator & Graceful Shutdown we used Spring Boot Actuator for monitoring and managing a Spring Boot application. It also exports Liveness and Readiness Probes, which help Spring Boot integrate better with a Kubernetes deployment environment.
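As a quick reminder of that setup, the probe endpoints and graceful shutdown can be enabled with a few standard Spring Boot 3.x properties. The snippet below is a minimal sketch; the actual configuration from that topic may differ:

```yaml
# application.yaml — minimal sketch using standard Spring Boot 3.x properties
management:
  endpoint:
    health:
      probes:
        enabled: true  # exposes /actuator/health/liveness and /actuator/health/readiness
server:
  shutdown: graceful   # let in-flight requests finish before the app stops
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # upper bound for the graceful shutdown phase
```

Note that when Spring Boot detects it is running on Kubernetes it enables the probe endpoints automatically; setting the property explicitly also makes them available when running locally.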
Then, to build the Docker image for that Spring Boot project, we will reuse the Dockerfile from the topic Docker With SpringBoot.
Then, to publish the Docker image from the step above to the Minikube Docker repository, we will review the steps in Spring Boot Deployment in Minikube Sample Project.
Finally, we will reuse the existing deployment configuration from the topic Helm.
Do steps 1 to 3 by yourself. For step 4, we just need to copy all the files in the sample folder of the Helm topic and paste them into another folder, e.g. zero-downtime.
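Assuming the Helm sample chart lives in a folder named `sample` (the path here is hypothetical; adjust it to your own layout), the copy could look like this:

```bash
# Paths are hypothetical — point them at your actual Helm sample chart
cp -r helm/sample zero-downtime
```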
Now, let's open spring-boot-deployment.yaml and update it with the configuration below.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.springboot.app.name }}
  labels:
    app: {{ .Values.springboot.app.name }}
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: {{ .Values.springboot.app.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.springboot.app.name }}
      annotations:
        # each invocation of the template function randAlphaNum will generate a unique
        # random string. Thus the random string always changes and causes the deployment to roll.
        rollme: {{ randAlphaNum 5 | quote }}
    spec:
      containers:
        - name: {{ .Values.springboot.containers.name }}
          image: {{ .Values.springboot.containers.image }}
          imagePullPolicy: Never
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: {{ .Values.springboot.containers.port }}
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: {{ .Values.springboot.containers.port }}
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
          lifecycle:
            preStop:
              exec:
                command: ["/bin/bash", "-c", "sleep 15"]
          ports:
            - containerPort: {{ .Values.springboot.containers.port }}
          env:
            - name: USERNAME
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.mongo.secret.name }}
                  key: mongo-user
            - name: PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.mongo.secret.name }}
                  key: mongo-password
            - name: DB_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.mongo.config.name }}
                  key: mongo-url
            - name: PORT
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.mongo.config.name }}
                  key: mongo-port
            - name: ENV
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.springboot.config.name }}
                  key: env
```
There are two new parts that we added to the existing configuration. The first part configures the strategy for the rolling update.
- `replicas: 1`: Specifies that the desired number of replicas (instances) of your application is 1. Replicas are multiple instances of your application running in the Kubernetes cluster.
- `strategy`: Defines the deployment strategy, determining how updates to your application are rolled out.
- `rollingUpdate`: Indicates that updates should be rolled out gradually, replacing old replicas with new ones.
- `maxSurge: 1`: During an update, allows one additional replica to be created before old replicas are terminated. This helps ensure that there is always at least one instance running during the update.
- `maxUnavailable: 0`: Specifies that no replicas may become unavailable during an update. This ensures there is no downtime during the update process; new replicas are created before old ones are terminated.
- `type: RollingUpdate`: Specifies that the update strategy is a rolling update. New replicas are gradually introduced while old ones are gradually terminated, minimizing the impact on the overall application. You can watch this transition with the commands shown after this list.
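To observe this behavior during an upgrade, you can follow the Deployment with kubectl. The deployment name below is a placeholder; in our chart it resolves to whatever `{{ .Values.springboot.app.name }}` renders to:

```bash
# <deployment-name> is a placeholder for the rendered .Values.springboot.app.name
kubectl rollout status deployment/<deployment-name>

# Watch old pods being replaced by new ones during the upgrade
kubectl get pods -w
```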
The second part below configures the readinessProbe, livenessProbe, and lifecycle of the pod.
- `readinessProbe`: Configures a probe to check the readiness of the application (you can verify both probe endpoints manually with the commands shown after this list).
  - `httpGet`: Performs an HTTP GET request on the specified path (`/actuator/health/readiness`) and port to determine if the application is ready.
  - `initialDelaySeconds`: The number of seconds after the container has started before the readiness probe is initiated.
  - `periodSeconds`: How often (in seconds) to perform the readiness probe.
  - `successThreshold`: The number of consecutive successes required for the probe to be considered successful.
- `livenessProbe`: Configures a probe to check the liveness of the application.
  - `httpGet`: Performs an HTTP GET request on the specified path (`/actuator/health/liveness`) and port to determine if the application is alive.
  - `initialDelaySeconds`: The number of seconds after the container has started before the liveness probe is initiated.
  - `periodSeconds`: How often (in seconds) to perform the liveness probe.
  - `successThreshold`: The number of consecutive successes required for the probe to be considered successful.
- `lifecycle`: Defines actions to be executed at specific points in the container's lifecycle.
  - `preStop`: Specifies a command to be executed before the container is terminated. In this case, it sleeps for 15 seconds (`sleep 15`) so in-flight requests can complete before shutdown, allowing graceful termination.
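Once a pod is running, you can check the two Actuator endpoints by hand. The pod name and port below are placeholders, and this assumes `curl` is available inside the container image:

```bash
# <pod-name> and <port> are placeholders; assumes curl exists in the image
kubectl exec <pod-name> -- curl -s http://localhost:<port>/actuator/health/readiness
kubectl exec <pod-name> -- curl -s http://localhost:<port>/actuator/health/liveness
# A healthy probe responds with {"status":"UP"}
```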
Now, to test zero downtime, let's use K6 to call the Pod's API continuously for a period of time, and redeploy the pod by upgrading the Helm chart during that window. After K6 finishes the execution, we should see no failed API calls.
We have the execution script for K6 below. You can read more in the K6 topic.
k6.js
```javascript
import http from "k6/http";
import { sleep } from "k6";

export const options = {
  // Key configurations for stress testing in this section
  stages: [
    { duration: "30s", target: 10 }, // traffic ramp-up from 1 to 10 users over 30s.
    { duration: "1m", target: 10 },  // stay at 10 users for 1 minute
    { duration: "1m", target: 0 },   // ramp-down to 0 users
  ],
};

export default function () {
  http.get("http://192.168.49.2:31000/v1/json/validator/schemas/CustomerJsonSchemaValidatorDev");
  sleep(1);
}
```
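Optionally, you can let K6 fail the run by itself when any request fails. The `thresholds` block below is our own addition (not part of the original script) and uses K6's built-in `http_req_failed` metric:

```javascript
// Optional variant of the options object above (our addition, not in the original script):
// the run exits with a non-zero code if any request fails during the test.
export const options = {
  stages: [
    { duration: "30s", target: 10 },
    { duration: "1m", target: 10 },
    { duration: "1m", target: 0 },
  ],
  thresholds: {
    http_req_failed: ["rate==0"], // zero failed requests means zero downtime
  },
};
```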
Then we use the command below to execute K6.
```bash
k6 run k6.js
```
While K6 is running, let's use the helm upgrade command to redeploy the pod.
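For example, assuming the chart was copied into the `zero-downtime` folder and the release is named `spring-boot-app` (both names are hypothetical; use your own release name and chart path):

```bash
# Release name and chart path are hypothetical — match them to your setup
helm upgrade spring-boot-app ./zero-downtime
```

When K6 finishes, its summary should report zero failed requests, confirming that the rolling update caused no downtime.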