Kai Kiat Poh

GitOps in Kubernetes Cluster

System Diagram

Hello! This post is about using GitOps in a Kubernetes cluster. I will briefly discuss how we can use different tools to manage and automate deployments to a Kubernetes cluster, specifically:

  • Argo CD
  • Argo Rollouts
  • Argo Analysis

GitOps is an operational framework that takes the best practices of Git, such as version control and review, and applies them to infrastructure automation. GitOps relies on a Git repository as the source of truth.

Argo CD is a continuous delivery tool for Kubernetes. It uses a Git repository, where the Helm/Kustomize files are stored, as the source of truth. It repeatedly compares the live state of the application with the desired state specified in the Git repository. When automated sync is enabled, a sync operation is performed whenever the two differ.

The most important file for Argo CD is application.yaml, which specifies which Git repository to watch (spec.source) and which cluster and namespace to deploy to (spec.destination).

# application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: ntuasr=ghcr.io/kaikiat/fyp-ci
    argocd-image-updater.argoproj.io/write-back-method: git
    argocd-image-updater.argoproj.io/git-branch: main:image-updater{{range .Images}}-{{.Name}}-{{.NewTag}}{{end}}
  name: sgdecoding-online-scaled 
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: git@github.com:kaikiat/fyp-cd.git  
    targetRevision: main  
    path: canary/sgdecoding-online-scaled 
  destination:
    server: https://kubernetes.default.svc
    namespace: ntuasr-production-google
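  syncPolicy:
    automated:
      prune: true        # delete resources that are no longer defined in Git
      selfHeal: true     # revert manual changes made to the live cluster
      allowEmpty: false  # refuse a sync that would delete every resource
    syncOptions:
    - Validate=false                     # skip kubectl schema validation
    - CreateNamespace=true               # create the destination namespace if missing
    - PrunePropagationPolicy=foreground  # use foreground cascading deletion when pruning
    - PruneLast=true                     # prune only after the other resources are synced
    retry:
      limit: 3             # maximum number of failed sync retries
      backoff:
        duration: 5s       # initial wait between retries
        factor: 2          # multiply the wait by this after each failed retry
        maxDuration: 3m    # upper bound on the wait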

Argo Rollouts is a Kubernetes controller that provides progressive delivery strategies, giving finer-grained control than the default Kubernetes rolling update.

Currently, it supports two strategies: blue-green and canary.

  • Blue-Green Rollout: Two identical environments are created, and software testers are typically directed to the new environment first, before live traffic is switched over.
  • Canary Rollout: The new version is released gradually to a subset of users.
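
To make the canary strategy more concrete, here is a minimal sketch of a Rollout manifest. It is not taken from the project: the resource name, labels, image and step values are all hypothetical, and only the overall shape follows the Argo Rollouts Rollout resource.

```yaml
# rollout-canary.yaml (illustrative sketch, names and values are assumptions)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: example-worker              # hypothetical name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: example-worker           # hypothetical label
  template:
    metadata:
      labels:
        app: example-worker
    spec:
      containers:
      - name: worker
        image: example/worker:1.0.0 # hypothetical image
  strategy:
    canary:
      steps:
      - setWeight: 25               # shift about a quarter of the traffic to the new version
      - pause: {duration: 1m}       # wait one minute before continuing
      - setWeight: 50
      - pause: {}                   # wait until the rollout is promoted manually
```

Without a traffic router, the weight is approximated by the ratio of new pods to old pods, so with 4 replicas a weight of 25 corresponds to roughly one pod running the new version.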

To configure a blue-green rollout, ensure that you have two Service objects. One directs traffic to the actual service:

# Direct to actual service
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ntuspeechlab.worker.name" $ }}{{ printf "-%s" $model_name | lower | replace "_" "-"  }}
  labels:
    app.kubernetes.io/name: {{ include "ntuspeechlab.worker.name" $ }}
    helm.sh/chart: {{ include "ntuspeechlab.chart" $ }}
    app.kubernetes.io/instance: {{ $.Release.Name }}
    app.kubernetes.io/managed-by: {{ $.Release.Service }}

The other directs traffic to the preview service:

# Direct to preview service
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ntuspeechlab.worker.name" $ }}{{ printf "-%s" $model_name | lower | replace "_" "-"  }}-preview
  labels:
    app.kubernetes.io/name: {{ include "ntuspeechlab.worker.name" $ }}
    helm.sh/chart: {{ include "ntuspeechlab.chart" $ }}
    app.kubernetes.io/instance: {{ $.Release.Name }}
    app.kubernetes.io/managed-by: {{ $.Release.Service }}
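
The two Services are then referenced from the blue-green strategy of a Rollout. The fragment below is a sketch rather than the project's actual manifest: the Service names are hypothetical stand-ins for the templated names above, and the rest of the Rollout spec (replicas, selector, pod template) is left out.

```yaml
# rollout-bluegreen.yaml (fragment, illustrative sketch)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: example-worker                         # hypothetical name
spec:
  # replicas, selector and template omitted for brevity
  strategy:
    blueGreen:
      activeService: example-worker            # Service carrying live traffic (assumed name)
      previewService: example-worker-preview   # Service pointing at the new version (assumed name)
      autoPromotionEnabled: false              # wait for a manual promotion before switching traffic
```

With autoPromotionEnabled set to false, the new version stays reachable only through the preview Service until someone promotes the rollout, which gives testers time to check it.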

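While a rollout is in progress, Argo Rollouts can check the services automatically using its analysis feature (the AnalysisTemplate and AnalysisRun resources), referred to here as Argo Analysis. If the analysis fails, the rollout is halted and rolled back to the previous version. The YAML file below checks that fewer than 5 percent of requests are rejected. The Prometheus query is evaluated every 5 seconds, and failureLimit: 4 sets how many failed measurements are tolerated before the analysis as a whole fails.

# analysis.yaml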
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: analyse-request
spec:
  metrics:
  - name: analyse-request
    interval: 5s
    successCondition: result[0] < 0.05
    failureLimit: 4
    provider:
      prometheus:
        address: http://35.240.236.243:9090
        query: |
          sum(number_of_request_reject_total{service="sgdecoding-online-scaled-master"})/sum(number_of_request_receive_by_master_total{service="sgdecoding-online-scaled-master"})
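
An AnalysisTemplate does nothing until a Rollout refers to it. One way to wire it in, sketched here as an assumption about how the pieces fit together rather than as the project's actual configuration, is as a background analysis in a canary strategy. The step values are hypothetical. Only the template name analyse-request comes from the file above.

```yaml
# Fragment of a Rollout spec (illustrative sketch)
spec:
  # replicas, selector and template omitted for brevity
  strategy:
    canary:
      analysis:
        templates:
        - templateName: analyse-request   # the AnalysisTemplate defined above
        startingStep: 1                   # begin the analysis once the first step has completed
      steps:
      - setWeight: 25                     # hypothetical step values
      - pause: {duration: 1m}
      - setWeight: 50
      - pause: {duration: 1m}
```

A background analysis keeps running while the rollout moves through its steps. If it fails, the rollout is aborted and traffic returns to the stable version.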

In summary, GitOps is an effective way to manage a Kubernetes cluster. Using Argo CD, Argo Rollouts and Argo Analysis can make the life of a Kubernetes operator much easier.

That's all for now. See you in the next one!


This is an NTU FYP project titled "GitOps in Kubernetes Cluster".

Project Report: https://drive.google.com/file/d/1VhOjXdNNt4wviwpFJ8MMoaAUKvnKMhQw/view

YouTube video: https://www.youtube.com/watch?v=qnsKsQJCX5c&ab_channel=KaiKiatPoh