Overview

Antrea CNI offers a very handy feature called NodePortLocal (NPL), which runs as part of the Antrea agent. It allows an exposed Pod to be accessed from the external network using specific ports opened only on the node on which that Pod is running. This enables better integration with external load balancers: instead of relying on NodePort Services implemented by kube-proxy, external load balancers can consume the NPL port mappings published by the Antrea agent.

How does a traditional LoadBalancer service work with NodePort?

A NodePort service opens a specific port on every node in the cluster and forwards any traffic sent to that port on a node to the corresponding app. When you configure a service of type LoadBalancer so you can make use of an external load balancer (NSX ALB) in your Tanzu/Kubernetes cluster, it relies in the backend on the NodePort service implemented by kube-proxy to load balance incoming traffic to the Pods exposed by NodePort.

Source: Vincent Han Blog
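To make this concrete, below is a minimal sketch of a Service of type LoadBalancer (the name, selector and ports are placeholders, not taken from my lab). When such a service is created, Kubernetes also allocates a NodePort from the default 30000-32767 range on every node, and the external load balancer forwards traffic to that port:

apiVersion: v1
kind: Service
metadata:
  name: demo-lb              # placeholder name
spec:
  type: LoadBalancer         # external LB in front, NodePort allocated behind the scenes
  selector:
    app: demo                # placeholder Pod selector
  ports:
    - port: 80               # port exposed on the load balancer VIP
      targetPort: 8080       # container port of the Pod
      # nodePort is auto-assigned from 30000-32767 unless set explicitly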

Shortcomings of NodePort service

NodePort is a simple and effective approach; however, if a Pod is exposed on port 30001 on one node, every other node in the cluster will also have port 30001 in an open and listening state even though the Pod is not running on any of them. This poses a security risk in addition to being an inefficient use of node/port mappings, since some nodes will have that port reserved for a Pod that is not running on them, while Pods that actually are running on them cannot make use of that port. So in the above picture, if the green Pod is no longer present on node 1, port 5000 will still be open and exposed on that node.

In addition, session persistence on external load balancers cannot function with NodePort, since the load balancer needs a specific node to which sessions can be pinned, and this is not feasible with NodePort. NodePort also exposes a range of ports on all Kubernetes nodes irrespective of Pod scheduling, so you may hit the node port range limitation as the number of services of type NodePort increases.

To better understand this, let us dive into my home lab and see how this actually looks in a Tanzu cluster.

Lab Inventory

For software versions I used the following:

    • VMware ESXi 7.0U3f
    • vCenter server version 7.0U3f
    • TrueNAS 12.0-U7 used to provision NFS data stores to ESXi hosts.
    • VyOS 1.4 used as lab backbone router and DHCP server.
    • Ubuntu 18.04 LTS as bootstrap machine.
    • Ubuntu 20.04.2 LTS as DNS and internet gateway.
    • Windows Server 2012 R2 Datacenter as management host for UI access.
    • Tanzu Kubernetes Grid 1.5.4 with Antrea CNI v1.7.1
    • NSX ALB controller 22.1.1

For virtual hosts and appliances sizing I used the following specs:

    • 3 x ESXi hosts each with 8 vCPUs, 2 x NICs and 96 GB RAM.
    • vCenter server appliance with 2 vCPUs and 24 GB RAM.

My TKG Cluster Configuration

I am running the following TKG workload cluster, which is deployed to use NSX ALB as a load balancer (for more details see my previous blog post HERE):

I also have an application running called sock-shop with the following load balancer service:

As you can see, the load balancer service is making use of NodePort, which exposes my frontend pod on TCP port 31601. Now, let's check on which node my frontend pod is actually hosted and running:
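One way to do that (assuming, as in my setup, that the service and deployment live in the sock-shop namespace) is:

kubectl get svc frontend-lb -n sock-shop                  # shows the external VIP and the allocated NodePort (31601 here)
kubectl get pods -n sock-shop -o wide | grep frontend     # shows on which node the frontend pod is scheduled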

From the above output, my frontend pod is running on node vexpert-tkg-wld01-md-0-fcb46549-qt2lk, which has an IP of 172.10.82.33. Navigating now to the NSX ALB UI, let's check the virtual service (VIP 172.10.82.154 from the load balancer service screenshot above) and its backend pool:

As you can see, the load balancer VIP is mapped to my 3 TKG nodes, which all have TCP port 31601 open and reserved for the frontend pod (although it is only running on node 172.10.82.33).

Using Antrea NodePortLocal Feature in TKGm

In native Kubernetes running the Antrea CNI, you need to enable the NodePortLocal feature gate in the antrea ConfigMap and then add a special annotation to your load balancer service in order to instruct it to make use of NodePortLocal (Antrea docs). However, TKG uses its own constructs and workflow to deploy and manage workload clusters and hence requires a special workflow to enable Antrea NodePortLocal. In addition, if your TKG workload clusters use NSX ALB AKO to provision load balancer and ingress services, you need to deploy a special CRD called AKODeploymentConfig from the management cluster, in which you configure NSX ALB AKO to use NodePortLocal as the service type instead of the default NodePort mode.
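For reference, this is roughly what the native Kubernetes approach looks like, based on the Antrea documentation (a sketch only, not the TKG workflow we follow next): the NodePortLocal feature gate is enabled in antrea-agent.conf inside the antrea-config ConfigMap, and the Service is annotated so that NPL port mappings get allocated for its Pods.

# antrea-agent.conf (inside the antrea-config ConfigMap in the kube-system namespace)
featureGates:
  NodePortLocal: true

# Service annotation used to opt a Service in to NPL
metadata:
  annotations:
    nodeportlocal.antrea.io/enabled: "true"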

Step 1: Deploy TKG workload cluster with NodePortLocal enabled

If you have not deployed a TKG cluster with NSX ALB yet, you can reference my previous blog post HERE. Assuming that you already have a TKG management cluster up and running, we will deploy a new workload cluster using the following deployment YAML:

AVI_CA_DATA_B64: <paste base64 encoded certificate of your NSX ALB here>
AVI_CLOUD_NAME: chocomel-cloud
AVI_CONTROL_PLANE_HA_PROVIDER: "true"
AVI_CONTROLLER: alb-controller.corp.local
AVI_DATA_NETWORK: TKG
AVI_DATA_NETWORK_CIDR: 172.10.82.0/24
AVI_ENABLE: "true"
AVI_LABELS: ""
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_CIDR: 172.10.82.0/24
AVI_MANAGEMENT_CLUSTER_VIP_NETWORK_NAME: TKG
AVI_PASSWORD: <encoded:Vk13YXJlMSFWTXdhcmUxIQ==>
AVI_SERVICE_ENGINE_GROUP: Default-Group
AVI_USERNAME: admin
CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: vexpert-tkg-wld02
CLUSTER_PLAN: dev
ENABLE_AUDIT_LOGGING: "false"
ENABLE_CEIP_PARTICIPATION: "false"
ENABLE_MHC: "true"
IDENTITY_MANAGEMENT_TYPE: none
INFRASTRUCTURE_PROVIDER: vsphere
LDAP_BIND_DN: ""
LDAP_BIND_PASSWORD: ""
LDAP_GROUP_SEARCH_BASE_DN: ""
LDAP_GROUP_SEARCH_FILTER: ""
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: ""
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
LDAP_HOST: ""
LDAP_ROOT_CA_DATA_B64: ""
LDAP_USER_SEARCH_BASE_DN: ""
LDAP_USER_SEARCH_FILTER: ""
LDAP_USER_SEARCH_NAME_ATTRIBUTE: ""
LDAP_USER_SEARCH_USERNAME: userPrincipalName
OIDC_IDENTITY_PROVIDER_CLIENT_ID: ""
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: ""
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: ""
OIDC_IDENTITY_PROVIDER_ISSUER_URL: ""
OIDC_IDENTITY_PROVIDER_NAME: ""
OIDC_IDENTITY_PROVIDER_SCOPES: ""
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: ""
OS_ARCH: amd64
OS_NAME: photon
OS_VERSION: "3"
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: "false"
TKG_IP_FAMILY: ipv4
VSPHERE_CONTROL_PLANE_DISK_GIB: "20"
VSPHERE_CONTROL_PLANE_ENDPOINT: ""
VSPHERE_CONTROL_PLANE_MEM_MIB: "4096"
VSPHERE_CONTROL_PLANE_NUM_CPUS: "2"
VSPHERE_DATACENTER: /vExpert-Homelab
VSPHERE_DATASTORE: /vExpert-Homelab/datastore/DS01
VSPHERE_FOLDER: /vExpert-Homelab/vm/TKG
VSPHERE_INSECURE: "true"
VSPHERE_NETWORK: /vExpert-Homelab/network/TKG
VSPHERE_PASSWORD: <encoded:Vk13YXJlMSE=>
VSPHERE_RESOURCE_POOL: /vExpert-Homelab/host/Chocomel/Resources/TKG
VSPHERE_SERVER: vc-l-01a.corp.local
VSPHERE_SSH_AUTHORIZED_KEY: none
VSPHERE_TLS_THUMBPRINT: ""
VSPHERE_USERNAME: administrator@vsphere.local
VSPHERE_WORKER_DISK_GIB: "20"
VSPHERE_WORKER_MEM_MIB: "4096"
VSPHERE_WORKER_NUM_CPUS: "2"
WORKER_MACHINE_COUNT: "2"
ANTREA_NODEPORTLOCAL: true

In the above YAML, the setting ANTREA_NODEPORTLOCAL: true is what enables NodePortLocal on the Antrea agents once the cluster is deployed. Now, save and exit the file and start cluster creation:

tanzu cluster create vexpert-tkg-wld02 -f tkg-wld02.yaml

where tkg-wld02.yaml is the name of the above YAML file. Tanzu will do its magic, and your cluster should then be created and up and running:
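If you want to double-check from the CLI, the usual Tanzu commands can be used to confirm the cluster state and fetch its kubeconfig, for example:

tanzu cluster list                                          # the new cluster should show as running
tanzu cluster kubeconfig get vexpert-tkg-wld02 --admin      # merge the admin kubeconfig locally
kubectl config use-context vexpert-tkg-wld02-admin@vexpert-tkg-wld02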

 

Step 2 (optional): Update TKG workload cluster Antrea and set a custom port range

This is an optional step and can be skipped; however, I prefer using a more recent version (v1.7.1) due to the availability of the antrea-config ConfigMap, in which I am able to define the range of ports that NodePortLocal should use (I was not able to do this with the default Antrea ConfigMaps rolled out by Tanzu).

From VMware Customer Connect, under Container Networking, download the latest VMware Antrea version and deploy it in your workload cluster (steps can be found in my previous post HERE). Once Antrea is upgraded, you should be able to see the following ConfigMaps available under the kube-system namespace (make sure that your context is your TKG workload cluster):
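A quick way to list them (the hash suffixes on the ConfigMap names will differ in your cluster):

kubectl get configmaps -n kube-system | grep antrea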

Now, edit the antrea-config ConfigMap to enable the NodePortLocal feature gate and specify the port range as follows:

kubectl edit configmap antrea-config -n kube-system

The featureGates section of your ConfigMap should look like this:
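For reference, the relevant part of antrea-agent.conf in my ConfigMap ends up similar to the snippet below (the port range is simply the one I picked, and you should double-check the exact key name against your Antrea version's documentation):

featureGates:
  NodePortLocal: true
nodePortLocalPortRange: 61000-62000   # range from which NPL allocates node ports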

Save and exit the file. Lastly, ensure that the nodeportlocal.enable flag is set to true in the antrea-config-822fk25299 ConfigMap (in your case it will have a different name, of course).
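A quick grep is enough to confirm it without opening the whole ConfigMap:

kubectl get configmap antrea-config-822fk25299 -n kube-system -o yaml | grep -i nodeportlocal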

The above flag was set to true during cluster creation by the ANTREA_NODEPORTLOCAL: true configuration parameter we set earlier in the workload cluster deployment YAML, and it cannot be manually changed.

Step 3: Set NSX ALB AKO service type to NodePortLocal 

By default, if you make use of AKO (Avi Kubernetes Operator) in your TKG cluster, it runs in NodePort mode. This means that all load balancer and ingress services provisioned by AKO will rely on the NodePort forwarding mode explained at the beginning of this post. In native Kubernetes, you would change this configuration by editing the AKO ConfigMap running under the avi-system namespace; however, in TKG the values of the AKO ConfigMap are controlled by the TKG management cluster, and any manual changes you make to the deployed system pods will be reverted.

So, in order to change the AKO service type, we need to switch to the management cluster context and create a YAML file with the following content:

apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  name: npl-enabled
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: chocomel-cloud
  clusterSelector:
    matchLabels:
      npl-enabled: "true"
  controller: 172.10.80.1
  dataNetwork:
    cidr: 172.10.82.0/24
    name: TKG
  extraConfigs:
    cniPlugin: antrea               # required
    disableStaticRouteSync: false
    ingress:
      disableIngressClass: false
      nodeNetworkList:
        - cidrs:
          - 172.10.82.0/24
          networkName: TKG
      serviceType: NodePortLocal    # required
      shardVSSize: MEDIUM
  serviceEngineGroup: Default-Group

You of course need to change some of the above parameters to reflect your environment values. Save and exit the file, switch to your management cluster, label your workload cluster with the npl-enabled="true" label and then apply the YAML file. See the below screenshot for details:

Just ignore some of the errors above; they appear because I already have my workload cluster labeled and AKO already modified as described.
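For completeness, the commands behind that screenshot look roughly like this (the management cluster context name and the file name npl-ako.yaml are placeholders, and I am assuming the workload Cluster object lives in the default namespace of the management cluster):

kubectl config use-context vexpert-tkg-mgmt-admin@vexpert-tkg-mgmt   # placeholder management cluster context
kubectl label cluster vexpert-tkg-wld02 npl-enabled=true             # label must match the clusterSelector above
kubectl apply -f npl-ako.yaml                                        # npl-ako.yaml contains the AKODeploymentConfig above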

NodePortLocal verification 

To verify the above, I deployed an application called WeaveSocks under a namespace called sock-shop. The UI of this application runs as a deployment called frontend, and I deployed a load balancer service called frontend-lb which redirects HTTP traffic to the frontend Pod so users can access my sock shop UI.

With Antrea NodePortLocal in action, every Pod that backs such a load balancer service is automatically annotated by the Antrea CNI with a generated annotation showing on which node the Pod runs and on which local node port it is exposed:
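The annotation key is nodeportlocal.antrea.io and its value is a small JSON list. On my frontend Pod it looks roughly like the sketch below; the podPort value is illustrative (it depends on the container port of the frontend image and the exact fields vary per Antrea version), while the node IP and node port correspond to my environment (172.10.82.36, 61000):

metadata:
  annotations:
    nodeportlocal.antrea.io: '[{"podPort":8079,"nodeIP":"172.10.82.36","nodePort":61000,"protocol":"tcp"}]'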

Again, this value is added and managed automatically by the Antrea NodePortLocal feature, and you should not change it manually.

Now, if you recall, at the beginning of this post I had the exact same Pod and load balancer service configured, and the result was a load balancer service mapped to the same TCP port on all 3 cluster nodes although the Pod was running on only one node. With NodePortLocal, if you switch to NSX ALB you should see that the load balancer service only maps to one TCP port, local to the node that actually runs our frontend Pod. From the above screenshot, the load balancer service should map to node 172.10.82.36 and local TCP port 61000.

Switch to the NSX ALB UI > Applications > Dashboards and you should be able to see all available load balancer services displayed as Virtual Services; ours is called default-expert-tkg-wld02–sockshop-frontend-lb.

Click on the first and second VS: the first one is the default load balancer behaviour with NodePort discussed at the beginning, while the second one is the one we just created with NodePortLocal.

If we switch back to our TKG workload cluster vexpert-tkg-wld02 and check its nodes, you should see three nodes, even though the virtual service above only maps to the one node running the frontend Pod:
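Listing them is a one-liner:

kubectl get nodes -o wide     # shows the three workload cluster nodes and their IPs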

This confirms that our NodePortLocal setup has done its magic and works as expected. Now, let's open a web browser to our sock shop UI:

 

Final word

I hope that you have enjoyed reading this post as much as I enjoyed writing it, and that you learned something new as well.