vSphere 8 introduced zonal supervisor cluster deployments to improve Tanzu workload resiliency: TKG clusters can now be deployed across 3 vSphere clusters (each cluster mapped to a zone), providing wider fault domains than the single vSphere cluster that vSphere 7 was limited to. With vSphere 8 HA zones, the supervisor cluster is stretched across the 3 clusters, and TKG clusters are then deployed on top of it. The stretched supervisor cluster allows users to control the zone placement of their workloads to achieve optimal availability for their distributed applications, mimicking the multi-zone experience of the public cloud.
In a previous blog post (HERE) I enabled a multi-zone (zonal) supervisor cluster deployment based on NSX-T as the networking layer. In this post I will deploy a multi-zone, highly available TKG cluster and eventually run and test a distributed Kubernetes-hosted application on top of it.
For software versions I used the following:
For virtual hosts and appliances sizing I used the following specs:
Before we dive into the configuration steps, let’s check the current supervisor cluster zonal deployment. From the vCenter UI you can see that we have 3 availability zones defined over three vSphere clusters:
In addition, if you click on your vCenter server name (top left pane) and then choose Configure > vSphere Zones you can list the available zones:
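You can also list the zones from kubectl once logged in to the supervisor cluster, since the zones are exposed as a Kubernetes resource (this assumes your supervisor build exposes the availabilityzones API, which mine does):

kubectl get availabilityzones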
I also created my first namespace (pindakaas), in which I will deploy my guest Tanzu Kubernetes Clusters (TKCs).
Now, from our bootstrap/jumpbox Linux machine, I will log in to my supervisor cluster and inspect the running supervisor nodes.
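For reference, the supervisor login and node inspection look like the following (172.10.200.2 is my supervisor cluster address; in my environment kubectl-vsphere creates a context named after it):

kubectl-vsphere login --server=https://172.10.200.2 --insecure-skip-tls-verify -u administrator@vsphere.local
kubectl config use-context 172.10.200.2
kubectl get nodes -o wide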
The supervisor cluster is up, running, and in a stable state. The next step is to deploy a guest Tanzu Kubernetes cluster (TKC) across our 3 availability zones.
The concept of deploying guest Tanzu Kubernetes Clusters (TKCs) on top of a vSphere supervisor cluster is unchanged, in the sense that you still create a deployment YAML with the following parameters:
However, the new v1alpha3 API adds a lot of new specs (for zonal deployment, for example), a new TKR image naming format, and more. Below is the sample deployment YAML I used to deploy my multi-zone guest TKC:
apiVersion: run.tanzu.vmware.com/v1alpha3
kind: TanzuKubernetesCluster
metadata:
  name: multizone-tkc01
  namespace: pindakaas
  annotations:
    run.tanzu.vmware.com/resolve-os-image: os-name=ubuntu
spec:
  topology:
    controlPlane:
      replicas: 3
      vmClass: best-effort-medium
      storageClass: pindakaas-storagepolicy
      tkr:
        reference:
          name: v1.23.8---vmware.2-tkg.2-zshippable
    nodePools:
    - name: workers-pool-1
      replicas: 1
      failureDomain: tonychocoloney
      vmClass: best-effort-medium
      storageClass: pindakaas-storagepolicy
      tkr:
        reference:
          name: v1.23.8---vmware.2-tkg.2-zshippable
    - name: workers-pool-2
      replicas: 1
      failureDomain: stroopwaffels
      vmClass: best-effort-medium
      storageClass: pindakaas-storagepolicy
      tkr:
        reference:
          name: v1.23.8---vmware.2-tkg.2-zshippable
    - name: workers-pool-3
      replicas: 1
      failureDomain: pindas
      vmClass: best-effort-medium
      storageClass: pindakaas-storagepolicy
      tkr:
        reference:
          name: v1.23.8---vmware.2-tkg.2-zshippable
  settings:
    network:
      cni:
        name: antrea
      services:
        cidrBlocks: ["198.53.100.0/16"]
      pods:
        cidrBlocks: ["192.0.5.0/16"]
      serviceDomain: cluster.local

If you are familiar with vSphere with Tanzu TKC YAML, you will notice that in vSphere 8 VMware introduced a new Tanzu API called TanzuKubernetesCluster v1alpha3. This API adds new capabilities to TKC deployments, such as:
The rest of the YAML is pretty much the same as the v1alpha2 API; if you need more details about the Tanzu APIs, you can reference the VMware documentation. In my setup, the storage class pindakaas-storagepolicy is a zonal storage policy that I created in my previous blog post HERE.
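If you want to verify that the zonal storage policy is actually published to the namespace before deploying, you can check from the supervisor context, for example:

kubectl get storageclass
kubectl describe namespace pindakaas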
After I applied the above YAML to my supervisor cluster (kubectl apply -f yaml-filename.yaml), the zonal TKC deployment kicked in, and in about 10 to 15 minutes (depending on the size of your nodes and your config) you should see the TKC guest cluster created across our 3 zones:
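While waiting, you can follow the rollout from the supervisor context; tkc is the short name of the tanzukubernetescluster resource, and the underlying Cluster API machine objects are visible in the namespace as well:

kubectl get tkc -n pindakaas
kubectl get machines -n pindakaas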
From our bootstrap Linux host, let's log in to the newly created multi-zone guest Tanzu cluster and verify that the newly configured control-plane and worker nodes are in Ready state:
kubectl-vsphere login --server=https://172.10.200.2 --insecure-skip-tls-verify -u administrator@vsphere.local --tanzu-kubernetes-cluster-namespace pindakaas --tanzu-kubernetes-cluster-name multizone-tkc01
kubectl config use-context multizone-tkc01
kubectl get nodes -o wide
Now, let us inspect how the supervisor cluster has allocated control-plane and worker nodes to zones. This is done by labeling the control-plane and worker nodes with special Tanzu labels; each label maps to one of our 3 availability zones and is then used to place the nodes in the desired availability zone.
For a better output of the node labels, I will use jq to filter the JSON output of my cluster nodes; if you use Ubuntu, you can install it with “sudo apt install jq -y“.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.labels}{"\n"}{end}' | jq '.'

This will result in a “prettified” output of the nodes and their assigned labels:
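Alternatively, if you are only interested in the zone assignment, kubectl can print the zone label as an extra column without jq:

kubectl get nodes -L topology.kubernetes.io/zone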
Our multi-zone guest cluster is now deployed. As a last step, we will deploy a test distributed Kubernetes application on top of it and verify that its pods are created across zones.
For this step, I will use an app called yelb, which was developed by a former VMware colleague. It is a simple application with a cool UI for testing microservices.
First, we need to create a namespace and a cluster role binding on the default Pod Security Policy (PSP) so that we can schedule pods on TKC guest clusters.
kubectl create clusterrolebinding default-tkg-admin-privileged-binding --clusterrole=psp:vmware-system-privileged --group=system:authenticated
kubectl create ns yelb
Then, copy and paste the below YAML file and apply it to your cluster using “kubectl apply -f yelb-deployment-filename.yaml“:
apiVersion: v1
kind: Service
metadata:
  name: redis-server
  labels:
    app: redis-server
    tier: cache
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 6379
  selector:
    app: redis-server
    tier: cache
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-db
  labels:
    app: yelb-db
    tier: backenddb
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 5432
  selector:
    app: yelb-db
    tier: backenddb
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-appserver
  labels:
    app: yelb-appserver
    tier: middletier
  namespace: yelb
spec:
  type: ClusterIP
  ports:
  - port: 4567
  selector:
    app: yelb-appserver
    tier: middletier
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-ui
  namespace: yelb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: yelb-ui
      tier: frontend
      secgroup: web
  template:
    metadata:
      labels:
        app: yelb-ui
        tier: frontend
        secgroup: web
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - pindas
                - stroopwaffels
                - tonychocoloney
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - yelb-ui
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: yelb-ui
        image: harbor-repo.vmware.com/dockerhub-proxy-cache/mreferre/yelb-ui@sha256:9df5e2611d6cf7cbc304104c18bb93ab3b185ae68ad25f75b655be1106cdd1b2
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-server
  namespace: yelb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: redis-server
      tier: cache
      secgroup: cache
  template:
    metadata:
      labels:
        app: redis-server
        tier: cache
        secgroup: cache
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - pindas
                - stroopwaffels
                - tonychocoloney
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis-server
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: redis-server
        image: harbor-repo.vmware.com/challagandlp/mreferre/redis@sha256:3c07847e5aa6911cf5d9441642769d3b6cd0bf6b8576773ae3a0742056b9dd47
        ports:
        - containerPort: 6379
        # volumeMounts:
        # - name: redis-slave-data
        #   mountPath: /data
      # volumes:
      # - name: redis-slave-data
      #   persistentVolumeClaim:
      #     claimName: redis-slave-claim
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-db
  namespace: yelb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: yelb-db
      tier: backenddb
      secgroup: db
  template:
    metadata:
      labels:
        app: yelb-db
        tier: backenddb
        secgroup: db
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - pindas
                - stroopwaffels
                - tonychocoloney
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - yelb-db
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: yelb-db
        image: harbor-repo.vmware.com/challagandlp/mreferre/yelb-db@sha256:6412d2fe96ee71ca701932d47675c549fe0428dede6a7975d39d9a581dc46c0c
        ports:
        - containerPort: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yelb-appserver
  namespace: yelb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: yelb-appserver
      tier: middletier
      secgroup: app
  template:
    metadata:
      labels:
        app: yelb-appserver
        tier: middletier
        secgroup: app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - pindas
                - stroopwaffels
                - tonychocoloney
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - yelb-appserver
            topologyKey: topology.kubernetes.io/zone
      containers:
      - name: yelb-appserver
        image: harbor-repo.vmware.com/challagandlp/mreferre/yelb-appserver@sha256:db367946dc02cf38752ad925e0b0fbff0f5c6f9186ca481fb8541530879d9c8d
        ports:
        - containerPort: 4567
---
apiVersion: v1
kind: Service
metadata:
  name: yelb-ui
  labels:
    app: yelb-ui
    tier: frontend
  namespace: yelb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: yelb-ui
    tier: frontend

The important new section in the above YAML is the following:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - pindas
                - stroopwaffels
                - tonychocoloney
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - yelb-appserver
            topologyKey: topology.kubernetes.io/zone

The nodeAffinity section tells the kube-scheduler on which nodes the deployment's pods may be scheduled; in the above we match on our 3 zones. However, we also need to make sure that no two pods from the same deployment coexist on the same node or zone (to ensure maximum HA), and that's why we need the podAntiAffinity parameter, which instructs the kube-scheduler not to schedule pods carrying the same label together in the same zone.
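As a side note (this is not what I used above, just an alternative worth knowing): on recent Kubernetes versions the same one-pod-per-zone spreading can be expressed more compactly with topologySpreadConstraints in the pod template, for example for yelb-ui:

      topologySpreadConstraints:
      - maxSkew: 1                                 # allow at most 1 pod of difference between zones
        topologyKey: topology.kubernetes.io/zone   # spread across our 3 availability zones
        whenUnsatisfiable: DoNotSchedule           # hard requirement, like requiredDuringScheduling
        labelSelector:
          matchLabels:
            app: yelb-ui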
After the deployment is created, all the pods (I configured 3 pod replicas per deployment, i.e. 1 pod per zone) should be in Running state:
Let’s now inspect the 3 pods of the yelb-ui deployment and check on which node each pod is running. We should see a single pod per worker node, i.e. per zone:
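The wide output filtered on the yelb-ui label shows the NODE column per pod:

kubectl get pods -n yelb -l app=yelb-ui -o wide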
I also created a LoadBalancer service (see the deployment YAML above) which exposes the UI pods on port 80. This service will be assigned a VIP from the underlying NSX infrastructure, or, if you are using NSX ALB or HA-Proxy, from the usable IP pool for virtual services.
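The assigned VIP shows up in the EXTERNAL-IP column of the service:

kubectl get svc -n yelb yelb-ui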
From a web browser, navigate to http://172.10.200.5 and you should see the yelb app homepage.