Zero-downtime deployment in Kubernetes is a process that allows updates to take place without interrupting the service. It is achieved by incrementally replacing the current Pods with new ones, and is usually called a rolling update.
Rolling Updates: By default, Kubernetes Deployments roll out pod version updates with a rolling update strategy. This strategy aims to prevent application downtime by keeping at least some instances up and running at any point in time while performing the updates. Old pods are only shut down after new pods of the new deployment version have started up and become ready to handle traffic.
Depending on the workload and available compute resources, we can configure how many instances to over- or under-provision at any time. For example, given three desired replicas, we could create three new pods immediately and wait for all of them to start up, terminate all old pods except one, or do the transition one by one.
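These trade-offs map directly onto the `maxSurge` and `maxUnavailable` fields of the rolling update strategy. Below is a minimal sketch for the three-replica example; the values here are chosen for illustration, not taken from the sample project:

```yaml
# For 3 desired replicas:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 3        # over-provision: start all 3 new pods at once,
    maxUnavailable: 0  # while keeping every old pod until a new one is ready
    # maxSurge: 0, maxUnavailable: 2 would instead tear down to one old pod first;
    # maxSurge: 1, maxUnavailable: 0 replaces pods one by one.
```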
To let Kubernetes know whether the pods are ready to handle traffic or need to be restarted, we have to configure the Readiness Probe and the Liveness Probe. Kubernetes will only route client traffic to pods with a healthy Readiness Probe, while a failing Liveness Probe causes the container to be restarted.
The pods have to expose an endpoint that Kubernetes can use to check their health status. In Spring Boot 3.x Actuator & Graceful Shutdown we used Spring Boot Actuator for monitoring and managing a Spring Boot application. It also exports Liveness and Readiness Probes, which help Spring Boot integrate better with a Kubernetes deployment environment.
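As a quick reminder of that setup, the probe endpoints and graceful shutdown can be enabled with a few standard Spring Boot 3.x properties. The snippet below is a minimal sketch; the actual configuration from that topic may differ:

```yaml
# application.yaml — minimal sketch using standard Spring Boot 3.x properties
management:
  endpoint:
    health:
      probes:
        enabled: true  # exposes /actuator/health/liveness and /actuator/health/readiness
server:
  shutdown: graceful   # let in-flight requests finish before the app stops
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # upper bound for the graceful shutdown phase
```

Note that when Spring Boot detects it is running on Kubernetes it enables the probe endpoints automatically; setting the property explicitly also makes them available when running locally.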
Then, to build the Docker image for that Spring Boot project, we will reuse the Dockerfile from the topic Docker With SpringBoot.
Then, to publish the Docker image from the step above to the Minikube Docker repository, we will review the steps in Spring Boot Deployment in Minikube Sample Project.
Finally, we will reuse the existing deployment configuration from the topic Helm.
Do steps 1 to 3 by yourself. For step 4, we just need to copy all the files in the sample folder of the Helm topic and paste them into another folder, e.g. zero-downtime.
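Assuming the Helm sample chart lives in a folder named `sample` (the path here is hypothetical; adjust it to your own layout), the copy could look like this:

```bash
# Paths are hypothetical — point them at your actual Helm sample chart
cp -r helm/sample zero-downtime
```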
Now, let's open spring-boot-deployment.yaml and update it with the configuration below.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.springboot.app.name }}
  labels:
    app: {{ .Values.springboot.app.name }}
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: {{ .Values.springboot.app.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.springboot.app.name }}
      annotations:
        # each invocation of the template function randAlphaNum will generate a unique
        # random string. Thus the random string always changes and causes the deployment to roll.
        rollme: {{ randAlphaNum 5 | quote }}
    spec:
      containers:
        - name: {{ .Values.springboot.containers.name }}
          image: {{ .Values.springboot.containers.image }}
          imagePullPolicy: Never
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: {{ .Values.springboot.containers.port }}
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: {{ .Values.springboot.containers.port }}
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
          lifecycle:
            preStop:
              exec:
                command: ["/bin/bash", "-c", "sleep 15"]
          ports:
            - containerPort: {{ .Values.springboot.containers.port }}
          env:
            - name: USERNAME
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.mongo.secret.name }}
                  key: mongo-user
            - name: PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.mongo.secret.name }}
                  key: mongo-password
            - name: DB_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.mongo.config.name }}
                  key: mongo-url
            - name: PORT
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.mongo.config.name }}
                  key: mongo-port
            - name: ENV
              valueFrom:
                configMapKeyRef:
                  name: {{ .Values.springboot.config.name }}
                  key: env
```
There are two new parts that we added to the existing configuration. The first part configures the strategy for the rolling update.
- `replicas: 1`: Specifies that the desired number of replicas (instances) of your application is 1. Replicas are multiple instances of your application running in the Kubernetes cluster.
- `strategy`: Defines the deployment strategy, determining how updates to your application are rolled out.
- `rollingUpdate`: Indicates that updates should be rolled out gradually, replacing old replicas with new ones.
- `maxSurge: 1`: During an update, allows one additional replica to be created before old replicas are terminated. This helps ensure that there is always at least one instance running during the update.
- `maxUnavailable: 0`: Specifies that no replicas may become unavailable during an update. This ensures there is no downtime during the update process; new replicas are created before old ones are terminated.
- `type: RollingUpdate`: Specifies that the update strategy is a rolling update. New replicas are gradually introduced while old ones are gradually terminated, minimizing the impact on the overall application. You can watch this transition with the commands shown after this list.
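To observe this behavior during an upgrade, you can follow the Deployment with kubectl. The deployment name below is a placeholder; in our chart it resolves to whatever `{{ .Values.springboot.app.name }}` renders to:

```bash
# <deployment-name> is a placeholder for the rendered .Values.springboot.app.name
kubectl rollout status deployment/<deployment-name>

# Watch old pods being replaced by new ones during the upgrade
kubectl get pods -w
```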
The second part below configures the readinessProbe, livenessProbe, and lifecycle of the pod.
- `readinessProbe`: Configures a probe to check the readiness of the application (you can verify both probe endpoints manually with the commands shown after this list).
  - `httpGet`: Performs an HTTP GET request on the specified path (`/actuator/health/readiness`) and port to determine if the application is ready.
  - `initialDelaySeconds`: The number of seconds after the container has started before the readiness probe is initiated.
  - `periodSeconds`: How often (in seconds) to perform the readiness probe.
  - `successThreshold`: The number of consecutive successes required for the probe to be considered successful.
- `livenessProbe`: Configures a probe to check the liveness of the application.
  - `httpGet`: Performs an HTTP GET request on the specified path (`/actuator/health/liveness`) and port to determine if the application is alive.
  - `initialDelaySeconds`: The number of seconds after the container has started before the liveness probe is initiated.
  - `periodSeconds`: How often (in seconds) to perform the liveness probe.
  - `successThreshold`: The number of consecutive successes required for the probe to be considered successful.
- `lifecycle`: Defines actions to be executed at specific points in the container's lifecycle.
  - `preStop`: Specifies a command to be executed before the container is terminated. In this case, it sleeps for 15 seconds (`sleep 15`) so in-flight requests can complete before shutdown, allowing graceful termination.
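Once a pod is running, you can check the two Actuator endpoints by hand. The pod name and port below are placeholders, and this assumes `curl` is available inside the container image:

```bash
# <pod-name> and <port> are placeholders; assumes curl exists in the image
kubectl exec <pod-name> -- curl -s http://localhost:<port>/actuator/health/readiness
kubectl exec <pod-name> -- curl -s http://localhost:<port>/actuator/health/liveness
# A healthy probe responds with {"status":"UP"}
```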
Now, to test zero downtime, let's use K6 to call the Pod's API continuously for a period of time, and redeploy the pod by upgrading the Helm chart during that window. After K6 finishes the execution, we should see no failed API calls.
We have the execution script for K6 below. You can read more in the K6 topic.
k6.js
```javascript
import http from "k6/http";
import { sleep } from "k6";

export const options = {
  // Key configurations for stress testing in this section
  stages: [
    { duration: "30s", target: 10 }, // traffic ramp-up from 1 to 10 users over 30s.
    { duration: "1m", target: 10 },  // stay at 10 users for 1 minute
    { duration: "1m", target: 0 },   // ramp-down to 0 users
  ],
};

export default function () {
  http.get("http://192.168.49.2:31000/v1/json/validator/schemas/CustomerJsonSchemaValidatorDev");
  sleep(1);
}
```
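Optionally, you can let K6 fail the run by itself when any request fails. The `thresholds` block below is our own addition (not part of the original script) and uses K6's built-in `http_req_failed` metric:

```javascript
// Optional variant of the options object above (our addition, not in the original script):
// the run exits with a non-zero code if any request fails during the test.
export const options = {
  stages: [
    { duration: "30s", target: 10 },
    { duration: "1m", target: 10 },
    { duration: "1m", target: 0 },
  ],
  thresholds: {
    http_req_failed: ["rate==0"], // zero failed requests means zero downtime
  },
};
```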
Then we use the command below to execute K6.
```bash
k6 run k6.js
```
While K6 is running, let's use the helm upgrade command to redeploy the pod.
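For example, assuming the chart was copied into the `zero-downtime` folder and the release is named `spring-boot-app` (both names are hypothetical; use your own release name and chart path):

```bash
# Release name and chart path are hypothetical — match them to your setup
helm upgrade spring-boot-app ./zero-downtime
```

When K6 finishes, its summary should report zero failed requests, confirming that the rolling update caused no downtime.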