Antrea CNI offers a very handy feature called NodePortLocal (NPL), which runs as part of the Antrea agents. This feature allows an exposed Pod to be accessed from an external network using specific ports opened only on the node on which that Pod is running. This enables better integration with external load balancers: instead of relying on NodePort Services implemented by kube-proxy, they can consume the NPL port mappings published by the Antrea agent.
How does a traditional LoadBalancer service work with NodePort?
A NodePort service opens a specific port on every node in the cluster and forwards any traffic sent to a node on that port to the corresponding app. When you configure a service of type LoadBalancer to make use of an external load balancer (NSX ALB) in your Tanzu/K8s cluster, it relies under the hood on the NodePort service implemented by kube-proxy to load balance incoming traffic to the exposed pods.
Shortcomings of NodePort service
NodePort is a very simple and effective approach. However, if a pod is exposed on port 30001 on one node, every other node in the cluster will also have port 30001 open and listening, even if the pod is not running on any of those nodes. This imposes a security risk and makes inefficient use of node ports: some nodes have the port reserved for a pod that is not running on them, while pods that are actually running on them cannot use it. So in the above picture, if the green pod is not present on node 1 anymore, port 5000 will still be open and exposed on that node.
In addition, session persistence on external load balancers cannot function with NodePort, since the load balancer needs a specific node to which sessions persist, and this is not feasible with NodePort. Finally, because NodePort exposes ports from a fixed range on all k8s nodes irrespective of Pod scheduling, a cluster may run into port range limitations as the number of NodePort services increases.
To better understand this, let us dive a bit into my home lab and see what this actually looks like in a Tanzu cluster.
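The difference in exposure can be illustrated with a small toy model. This is only a sketch: the node names, the pod placement and the helper functions are made up for illustration, and the port range shown is the Kubernetes default, which a cluster admin may have changed.

```python
# Toy model only: compares which (node, port) pairs end up exposed
# under NodePort versus NodePortLocal. All names are illustrative.

# Kubernetes default NodePort range (assumed unchanged): 30000-32767.
DEFAULT_NODEPORT_RANGE = range(30000, 32768)


def nodeport_backends(nodes, node_port):
    """NodePort: the same port is opened on every node in the cluster."""
    return [(node, node_port) for node in nodes]


def npl_backends(pod_placements):
    """NodePortLocal: one port per Pod, only on the node hosting that Pod.

    pod_placements maps a pod name to (hosting node, allocated local port).
    """
    return sorted(pod_placements.values())


nodes = ["node-1", "node-2", "node-3"]
placements = {"frontend": ("node-2", 61000)}  # hypothetical placement

nodeport = nodeport_backends(nodes, 31601)
npl = npl_backends(placements)

print("NodePort backends:     ", nodeport)
print("NodePortLocal backends:", npl)
print("Ports in default NodePort range:", len(DEFAULT_NODEPORT_RANGE))
```

With NodePort, the load balancer pool contains every node, whether or not it hosts the pod. With NodePortLocal, the pool contains only the node and port that actually lead to the pod.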
For software versions I used the following:
- VMware ESXi 7.0U3f
- vCenter server version 7.0U3f
- TrueNAS 12.0-U7 used to provision NFS data stores to ESXi hosts.
- VyOS 1.4 used as lab backbone router and DHCP server.
- Ubuntu 18.04 LTS as bootstrap machine.
- Ubuntu 20.04.2 LTS as DNS and internet gateway.
- Windows Server 2012 R2 Datacenter as management host for UI access.
- Tanzu Kubernetes Grid 1.5.4 with Antrea CNI v1.7.1
- NSX ALB controller 22.1.1
For virtual hosts and appliances sizing I used the following specs:
- 3 x ESXi hosts each with 8 vCPUs, 2 x NICs and 96 GB RAM.
- vCenter server appliance with 2 vCPU and 24 GB RAM.
My TKG Cluster Configuration
I am running the following TKG workload cluster which is deployed to use NSX ALB as load balancer (for more details see my previous blog post HERE)
I also have an application running called sock-shop with the following load balancer service:
As you can see, the load balancer service is making use of NodePort, which exposes my frontend pod on TCP port 31601. Now, let's check on which node my frontend pod is actually running:
From the above output, my frontend pod is running on node vexpert-tkg-wld01-md-0-fcb46549-qt2lk, which has an IP of 126.96.36.199. Now let's navigate to the NSX ALB UI and check the virtual service (VIP 188.8.131.52 from the load balancer service screenshot above) and its backend pool:
As you can see, the load balancer VIP is mapped to all 3 of my TKG nodes, which have TCP port 31601 open and reserved for the frontend pod (although it is only running on node 184.108.40.206).
Using Antrea NodePortLocal Feature in TKGm
In native k8s running Antrea CNI, you need to enable the NodePortLocal feature gate in the Antrea configmap and then add a special annotation to your load balancer service to instruct it to make use of NodePortLocal (Antrea docs). However, TKG uses its own constructs and workflow to deploy and manage workload clusters, and hence requires a special workflow to enable Antrea NodePortLocal. In addition, if your TKG workload clusters use NSX ALB AKO to provision load balancer and ingress services, you need to deploy a custom resource of kind AKODeploymentConfig from the management cluster, in which you configure NSX ALB AKO to use NodePortLocal as the service type instead of the default NodePort mode.
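For the native k8s case, the annotated Service could look roughly like the manifest built below. This is a sketch under assumptions: the annotation key nodeportlocal.antrea.io/enabled is the one I know from the Antrea documentation, so verify it against the docs for your Antrea version, and the selector and port values are placeholders rather than values from my lab.

```python
import json

# Hypothetical Service for illustration. The selector and ports are
# placeholders; only the annotation is the point of this example.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "frontend-lb",
        "namespace": "sock-shop",
        "annotations": {
            # Assumed annotation key for opting a Service into NodePortLocal;
            # check the Antrea docs for the version you run.
            "nodeportlocal.antrea.io/enabled": "true",
        },
    },
    "spec": {
        "type": "LoadBalancer",
        "selector": {"name": "frontend"},  # placeholder label
        "ports": [{"port": 80, "targetPort": 8079, "protocol": "TCP"}],  # placeholders
    },
}

# kubectl accepts JSON as well as YAML, so this output can be saved to a file
# and applied with kubectl apply -f.
manifest = json.dumps(service, indent=2)
print(manifest)
```

Annotation values must be strings, which is why the value is "true" in quotes rather than a boolean.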
Step 1: Deploy TKG workload cluster with NodePortLocal enabled
If you have not deployed a TKG cluster with NSX ALB yet, you can reference my previous blog post HERE. Assuming that you already have a TKG management cluster up and running, we will deploy a new workload cluster using the following deployment YAML:
AVI_CA_DATA_B64: <paste base64 encoded certificate of your NSX ALB here>
AVI_CLOUD_NAME: chocomel-cloud
AVI_CONTROL_PLANE_HA_PROVIDER: "true"
AVI_CONTROLLER: alb-controller.corp.local
AVI_DATA_NETWORK: TKG
AVI_DATA_NETWORK_CIDR: 220.127.116.11/24
AVI_ENABLE: "true"
AVI_LABELS: ""
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 18.104.22.168/24
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME: TKG
AVI_PASSWORD: <encoded:Vk13YXJlMSFWTXdhcmUxIQ==>
AVI_SERVICE_ENGINE_GROUP: Default-Group
AVI_USERNAME: admin
CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: vexpert-tkg-wld02
CLUSTER_PLAN: dev
ENABLE_AUDIT_LOGGING: "false"
ENABLE_CEIP_PARTICIPATION: "false"
ENABLE_MHC: "true"
IDENTITY_MANAGEMENT_TYPE: none
INFRASTRUCTURE_PROVIDER: vsphere
LDAP_BIND_DN: ""
LDAP_BIND_PASSWORD: ""
LDAP_GROUP_SEARCH_BASE_DN: ""
LDAP_GROUP_SEARCH_FILTER: ""
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: ""
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
LDAP_HOST: ""
LDAP_ROOT_CA_DATA_B64: ""
LDAP_USER_SEARCH_BASE_DN: ""
LDAP_USER_SEARCH_FILTER: ""
LDAP_USER_SEARCH_NAME_ATTRIBUTE: ""
LDAP_USER_SEARCH_USERNAME: userPrincipalName
OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
OIDC_IDENTITY_PROVIDER_NAME: ""
OIDC_IDENTITY_PROVIDER_SCOPES: ""
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""
OS_ARCH: amd64
OS_NAME: photon
OS_VERSION: "3"
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: "false"
TKG_IP_FAMILY: ipv4
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_ENDPOINT: ""
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_DATACENTER: /vExpert-Homelab
VSPHERE_DATASTORE: /vExpert-Homelab/datastore/DS01
VSPHERE_FOLDER: /vExpert-Homelab/vm/TKG
VSPHERE_INSECURE: "true"
VSPHERE_NETWORK: /vExpert-Homelab/network/TKG
VSPHERE_PASSWORD: <encoded:Vk13YXJlMSE=>
VSPHERE_RESOURCE_POOL: /vExpert-Homelab/host/Chocomel/Resources/TKG
VSPHERE_SERVER: vc-l-01a.corp.local
VSPHERE_SSH_AUTHORIZED_KEY: none
VSPHERE_TLS_THUMBPRINT: ""
VSPHERE_USERNAME: firstname.lastname@example.org
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "4096"
VSPHERE_WORKER_NUM_CPUS: "2"
WORKER_MACHINE_COUNT: "2"
ANTREA_NODEPORTLOCAL: true
In the above YAML, the setting ANTREA_NODEPORTLOCAL: "true" is what enables NodePortLocal on the Antrea agents once the cluster is deployed. Now, save and exit the file and start cluster creation:
tanzu cluster create vexpert-tkg-wld02 -f tkg-wld02.yaml
where tkg-wld02.yaml is the name of the above YAML file. Tanzu will do its magic and your cluster should then be created, up and running:
Step 2 (optional): Update Antrea in the TKG workload cluster and set a custom port range
This is an optional step and can be skipped. However, I prefer using a more recent Antrea version (v1.7.1) because of the availability of the antrea-config configmap, in which I am able to define the range of ports that NodePortLocal should use (I was not able to do this with the default Antrea configmaps rolled out by Tanzu).
From VMware Customer Connect, under Container Networking, download the latest VMware Antrea version and deploy it in your workload cluster (steps can be found in my previous post HERE). Once Antrea is upgraded, you should see the following configmaps under the kube-system namespace (make sure that your context is set to your TKG workload cluster):
Now, edit the above configmap, enable the NodePortLocal feature gate and specify the port range as follows:
kubectl edit configmap antrea-config -n kube-system
The featureGates section of your configmap should look like this:
Save and exit the file. Lastly, ensure that the nodeportlocal.enable flag is set to true in the antrea-config-822fk25299 configmap (in your case it will have a different name, of course):
The above flag was set to true during cluster creation by the ANTREA_NODEPORTLOCAL: "true" configuration parameter we set earlier in the workload cluster deployment YAML, and it cannot be changed manually.
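Before committing a custom port range, it can help to sanity-check it. The sketch below assumes the range is written as a "start-end" string and uses 61000-62000 purely as an example value. The helper names are hypothetical, and the NodePort range used for the overlap check is the Kubernetes default, which may differ in your cluster.

```python
def parse_port_range(value):
    """Parse a 'start-end' string into (start, end) and validate the bounds."""
    start_text, sep, end_text = value.partition("-")
    if not sep:
        raise ValueError(f"expected 'start-end', got {value!r}")
    start, end = int(start_text), int(end_text)
    if not (1 <= start <= end <= 65535):
        raise ValueError(f"invalid port range: {value!r}")
    return start, end


def overlaps(a, b):
    """Return True if two inclusive (start, end) ranges share any port."""
    return a[0] <= b[1] and b[0] <= a[1]


npl_range = parse_port_range("61000-62000")  # example value, not a recommendation
nodeport_range = (30000, 32767)              # Kubernetes default, assumed unchanged

# Each Pod port exposed through NodePortLocal consumes one port per node,
# so the size of the range caps how many can be exposed on a single node.
capacity = npl_range[1] - npl_range[0] + 1

print("NPL ports available per node:", capacity)
print("Overlaps NodePort range:", overlaps(npl_range, nodeport_range))
```

Keeping the NodePortLocal range clear of the NodePort range avoids the two mechanisms competing for the same ports on a node.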
Step 3: Set NSX ALB AKO service type to NodePortLocal
By default, if you make use of AKO (Avi Kubernetes Operator) in your TKG cluster, it runs in NodePort mode. This means that all load balancer and ingress services provisioned by AKO rely on the NodePort forwarding mode explained at the beginning of this post. To change this configuration in native k8s, you edit the AKO configmap in the avi-system namespace. In TKG, however, the values of the AKO configmap are controlled by the TKG management cluster, and any manual changes you make to the deployed system pods will be reverted.
So, in order to change the AKO service type, we need to switch to the management cluster and create a YAML file with the following content:
apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  name: npl-enabled
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: chocomel-cloud
  clusterSelector:
    matchLabels:
      npl-enabled: "true"
  controller: 22.214.171.124
  dataNetwork:
    cidr: 126.96.36.199/24
    name: TKG
  extraConfigs:
    cniPlugin: antrea # required
    disableStaticRouteSync: false
    ingress:
      disableIngressClass: false
      nodeNetworkList:
        - cidrs:
            - 188.8.131.52/24
          networkName: TKG
      serviceType: NodePortLocal # required
      shardVSSize: MEDIUM
  serviceEngineGroup: Default-Group
You will of course need to change some of the above parameters to reflect your environment. Save and exit the file, switch to your management cluster, label your workload cluster with the npl-enabled="true" label and then apply the YAML file. See the screenshot below for details:
Just ignore the errors in the above output; they appear because I had already labeled my workload cluster and modified AKO as described above.
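The label matters because the AKODeploymentConfig picks its target clusters through clusterSelector.matchLabels. The sketch below shows the matching rule as I understand it for standard Kubernetes label selectors: every key/value pair in matchLabels must be present on the cluster. The function name and the label sets are made up for illustration.

```python
def selector_matches(match_labels, cluster_labels):
    """Return True if every key/value in match_labels is set on the cluster."""
    return all(cluster_labels.get(key) == value
               for key, value in match_labels.items())


# matchLabels from the AKODeploymentConfig above.
match_labels = {"npl-enabled": "true"}

# Hypothetical label sets for the two workload clusters.
clusters = {
    "vexpert-tkg-wld01": {},                       # not labeled
    "vexpert-tkg-wld02": {"npl-enabled": "true"},  # labeled for NodePortLocal
}

selected = [name for name, labels in clusters.items()
            if selector_matches(match_labels, labels)]
print("Clusters selected by npl-enabled config:", selected)
```

Only the labeled cluster is selected, so other workload clusters keep the default NodePort behaviour.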
To verify the above, I deployed an application called WeaveSocks in a namespace called sock-shop. The UI of this application runs as a deployment called frontend, and I deployed a load balancer service called frontend-lb which redirects HTTP traffic to the frontend pod so users can access my sock shop UI.
With Antrea NodePortLocal in action, every Pod that is accessed by means of a load balancer service is automatically annotated by Antrea CNI with the node on which the pod runs and the local node port on which the pod is exposed:
Again, this value is added and managed automatically by the Antrea NodePortLocal feature, and you should not change it manually.
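If you want to read that annotation programmatically, a sketch like the following could work. It assumes the annotation key is nodeportlocal.antrea.io and that its value is a JSON list with podPort, nodeIP and nodePort fields, which is the format I know from the Antrea documentation; compare it with what your own cluster shows. The IP address and pod port in the sample are placeholders.

```python
import json

# Assumed annotation key; verify against your Antrea version.
NPL_ANNOTATION = "nodeportlocal.antrea.io"

# Sample annotations as they might appear under metadata.annotations of a Pod.
# 192.0.2.12 and 8079 are placeholder values.
pod_annotations = {
    NPL_ANNOTATION: '[{"podPort": 8079, "nodeIP": "192.0.2.12", '
                    '"nodePort": 61000, "protocol": "tcp"}]'
}


def npl_mappings(annotations):
    """Return (nodeIP, nodePort, podPort) tuples from the NPL annotation."""
    raw = annotations.get(NPL_ANNOTATION)
    if raw is None:
        return []  # Pod is not exposed through NodePortLocal
    return [(entry["nodeIP"], entry["nodePort"], entry["podPort"])
            for entry in json.loads(raw)]


mappings = npl_mappings(pod_annotations)
for node_ip, node_port, pod_port in mappings:
    print(f"{node_ip}:{node_port} -> pod port {pod_port}")
```

These node IP and node port pairs are what an external load balancer would use as pool members instead of a NodePort on every node.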
Now, if you recall, at the beginning of this post I had the exact same pod and load balancer service configured, and the result was a load balancer service mapped to the same TCP port on all 3 cluster nodes, although the pod was running on only 1 node. With NodePortLocal, if you switch to NSX ALB you should see that the load balancer service maps to only 1 TCP port, local to the node that actually runs our frontend pod. From the above screenshot, the load balancer service should map to node 184.108.40.206 and local TCP port 61000.
Switch to NSX ALB UI > Applications > Dashboards, where you should see all available load balancer services displayed as Virtual Services. Ours is called default-expert-tkg-wld02--sockshop-frontend-lb
Click on the first and second VS: the first one shows the default load balancer behaviour with NodePort discussed at the beginning, while the second one is the one we just created with NodePortLocal.
If we switch back to our TKG workload cluster vexpert-tkg-wld02 and check its nodes, we should see three nodes:
This confirms that our NodePortLocal service has done its magic and works as expected. Now, let's open a web browser to our sock shop UI:
I hope that you have enjoyed reading this post as much as I enjoyed writing it and learned something new as well.