GitOps in Kubernetes Cluster
Hello! This post is about using GitOps in a Kubernetes cluster. I will briefly discuss how we can use different tools to manage and automate deployments to a Kubernetes cluster, specifically:
- Argo CD
- Argo Rollouts
- Argo Analysis
GitOps is an operational framework that takes the best practices of Git and applies them to infrastructure automation. GitOps relies on a Git repository as the source of truth.
Argo CD is a continuous delivery tool for Kubernetes. It uses a Git repository, where the Helm or Kustomize files are stored, as the source of truth. It repeatedly compares the live state of the application with the desired state specified in the Git repository and performs a sync operation whenever the two differ.
The most important file for Argo CD is application.yaml, which specifies the Git repository to watch (spec.source) and the cluster and namespace to deploy to (spec.destination).
# application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: ntuasr=ghcr.io/kaikiat/fyp-ci
    argocd-image-updater.argoproj.io/write-back-method: git
    argocd-image-updater.argoproj.io/git-branch: main:image-updater{{range .Images}}-{{.Name}}-{{.NewTag}}{{end}}
  name: sgdecoding-online-scaled
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  # Git repository, revision and path that Argo CD watches
  source:
    repoURL: git@github.com:kaikiat/fyp-cd.git
    targetRevision: main
    path: canary/sgdecoding-online-scaled
  # Cluster and namespace that the application is deployed to
  destination:
    server: https://kubernetes.default.svc
    namespace: ntuasr-production-google
  syncPolicy:
    automated:
      prune: true       # delete resources that are no longer defined in Git
      selfHeal: true    # revert manual changes made directly in the cluster
      allowEmpty: false
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
    # Retry a failed sync up to 3 times with exponential backoff
    retry:
      limit: 3
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
Argo Rollouts is a Kubernetes controller that gives more fine-grained control over how a new version is rolled out than the default Kubernetes rolling update strategy.
Currently, it supports two strategies: blue-green and canary.
- Blue-green rollout: two identical environments are created, and software testers are typically directed to the new environment first.
- Canary rollout: the new version is released gradually to a subset of users.
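As a rough sketch of how a canary strategy is expressed, the fragment below shows only the strategy section of an Argo Rollouts Rollout resource. The weights and pause durations are illustrative assumptions, not values taken from this project.

```yaml
# Hypothetical fragment of a Rollout spec; weights and durations are placeholders.
strategy:
  canary:
    steps:
      - setWeight: 20          # shift about 20% to the new version
      - pause: {duration: 1m}  # wait one minute before continuing
      - setWeight: 50
      - pause: {}              # wait until the rollout is promoted manually
```

Once the last step completes, the new version takes over fully. Without a traffic router, Argo Rollouts approximates each weight by adjusting the number of replicas of each version.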
To configure a blue-green rollout, ensure that you have two Service objects. One directs traffic to the actual service:
# Direct to actual service
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ntuspeechlab.worker.name" $ }}{{ printf "-%s" $model_name | lower | replace "_" "-" }}
  labels:
    app.kubernetes.io/name: {{ include "ntuspeechlab.worker.name" $ }}
    helm.sh/chart: {{ include "ntuspeechlab.chart" $ }}
    app.kubernetes.io/instance: {{ $.Release.Name }}
    app.kubernetes.io/managed-by: {{ $.Release.Service }}
The other directs traffic to the preview service:
# Direct to preview service
apiVersion: v1
kind: Service
metadata:
  name: {{ include "ntuspeechlab.worker.name" $ }}{{ printf "-%s" $model_name | lower | replace "_" "-" }}-preview
  labels:
    app.kubernetes.io/name: {{ include "ntuspeechlab.worker.name" $ }}
    helm.sh/chart: {{ include "ntuspeechlab.chart" $ }}
    app.kubernetes.io/instance: {{ $.Release.Name }}
    app.kubernetes.io/managed-by: {{ $.Release.Service }}
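The two Services are then referenced from the blue-green strategy of the Rollout. The fragment below is a minimal sketch of that section only. The Service names are hypothetical placeholders for the rendered names of the two Services above, and disabling auto-promotion is an assumed choice rather than this project's setting.

```yaml
# Hypothetical fragment of a Rollout spec; Service names are placeholders.
strategy:
  blueGreen:
    activeService: example-worker            # Service that receives live traffic
    previewService: example-worker-preview   # Service that points at the new version
    autoPromotionEnabled: false              # assumed: require a manual promotion
```

With this setup, testers can reach the new version through the preview Service while users stay on the active Service until the rollout is promoted.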
While a rollout is in progress, Argo Rollouts provides a way to test both services automatically using Argo Analysis. This allows us to halt a rollout and roll back the newer version. In the YAML file below, we check that the proportion of failed requests stays below 5 percent. The Prometheus query is run every 5 seconds, and up to four failed measurements are tolerated before the analysis is marked as failed.
# analysis.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: analyse-request
spec:
  metrics:
    - name: analyse-request
      interval: 5s                        # run the query every 5 seconds
      successCondition: result[0] < 0.05  # pass while fewer than 5% of requests fail
      failureLimit: 4                     # tolerate up to 4 failed measurements
      provider:
        prometheus:
          address: http://35.240.236.243:9090
          # Failure ratio: rejected requests divided by received requests
          query: |
            sum(number_of_request_reject_total{service="sgdecoding-online-scaled-master"})/sum(number_of_request_receive_by_master_total{service="sgdecoding-online-scaled-master"})
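An AnalysisTemplate only takes effect once a Rollout references it. One possible wiring, assuming the canary strategy is used, is a background analysis that runs alongside the rollout steps. The step values below are illustrative assumptions, and the project itself may attach the template differently.

```yaml
# Hypothetical fragment of a Rollout spec; step values are placeholders.
strategy:
  canary:
    analysis:
      templates:
        - templateName: analyse-request  # the AnalysisTemplate defined above
      startingStep: 1                    # start analysing after the first setWeight
    steps:
      - setWeight: 20
      - pause: {duration: 1m}
```

If the analysis fails while the steps are running, the rollout is aborted and traffic returns to the stable version.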
In summary, GitOps is an effective way to manage a Kubernetes cluster. Using Argo CD, Argo Rollouts and Argo Analysis can make the life of a Kubernetes operator much easier.
That's all for now. See you in the next one!
This is an NTU FYP project titled "GitOps in Kubernetes Cluster".
Project Report: https://drive.google.com/file/d/1VhOjXdNNt4wviwpFJ8MMoaAUKvnKMhQw/view
YouTube video: https://www.youtube.com/watch?v=qnsKsQJCX5c&ab_channel=KaiKiatPoh