In this blog post we will explore how vRealize Network Insight (vRNI) can be used to monitor Tanzu and Kubernetes clusters running NSX Container Plugin (NCP) as their CNI. NCP is used in Supervisor Clusters deployed by Workload Management (WCP) in vSphere and in TKGi, and can also be used in OpenShift and upstream Kubernetes.
Running NCP as your Tanzu or Kubernetes CNI lets you leverage the networking features of NSX in your containerised workloads. NSX Manager then becomes your single pane of glass (UI) to manage, deploy and monitor the networking and security policies of your Kubernetes clusters.
With NCP, customers can also use the NSX standard and advanced load balancers to provide load balancer services for pods using NCP as their CNI. Clusters running NCP are fully visible and monitored within vRNI without the need to deploy any additional NetFlow collectors. For Tanzu/K8s clusters running the Antrea CNI, by contrast, vRNI can only provide flow analysis with no K8s cluster/pod level knowledge (more about monitoring Antrea clusters using vRNI can be found in my previous blog post HERE).
For software versions I used the following:
- VMware ESXi 184.108.40.20667351
- vCenter server version 7.0U3 with workload management with NSX enabled.
- NSX-T 220.127.116.11
- vRealize Network Insight 6.6 platform and collector appliances.
- TrueNAS 12.0-U7 used to provision NFS datastores to ESXi hosts.
- VyOS 1.4 used as lab backbone router.
- Ubuntu 20.04 LTS as Linux jumpbox.
- Ubuntu 20.04.2 LTS as DNS and internet gateway.
- Windows Server 2012 R2 Datacenter as management host for UI access.
For virtual host and appliance sizing I used the following specs:
- 3 x virtualised ESXi hosts each with 8 vCPUs, 4 x NICs and 32 GB RAM.
- vCenter server appliance with 2 vCPU and 24 GB RAM.
- NSX-T Manager medium appliance
- vRealize Network Insight medium Platform and Collector appliances.
- 2 x medium edges with no resource reservations.
Before we start
Before we jump right into the configuration steps, you need to ensure that your environment meets the following requirements in order to monitor kubernetes clusters in vRNI:
- Install the latest version of the vRNI platform and collector appliances (one collector appliance is enough).
- A Tanzu or K8s cluster running NCP as its container networking interface.
- In my lab environment I used the supervisor cluster deployed after enabling workload management with NSX networking (you can check that in a previous post HERE).
- You need a vRNI Enterprise license in order to add Kubernetes/TKGi clusters as data sources.
- You also have to add the NSX Manager instance managing your workload networking as a data source in vRNI.
Preparing your Kubernetes cluster to be added to vRNI as data source
In this step we just need to ensure that our NCP system pod is running properly and that our kubernetes/TKGi cluster is visible in NSX Manager.
Step 1: Verify workload management deployment
From your bootstrap machine (in my case an Ubuntu VM running the kubectl CLI plugin for vSphere), ensure that you are logged in to your supervisor cluster and list the available namespaces.
The commands used are:
kubectl vsphere login --server=https://172.20.10.2 --insecure-skip-tls-verify -u firstname.lastname@example.org
kubectl config use-context 172.20.10.2
kubectl get ns
The last command lists the available namespaces under the supervisor cluster (172.20.10.2). The namespace vmware-system-nsx is where the NCP pod runs; this is the pod responsible for connecting the cluster to NSX. You can verify this by listing the pods in that namespace.
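The verification step was shown as a screenshot in the original post. As a minimal sketch, assuming the default kubectl get pods column layout (NAME, READY, STATUS, ...), the hypothetical helper below reads that output and prints every pod that is not Running. An empty result for vmware-system-nsx suggests the NCP pod is healthy.

```shell
# Hypothetical helper (not from the original post): reads the output of
# `kubectl get pods --no-headers` on stdin and prints the name and status
# of every pod whose STATUS column (third field) is not "Running".
not_running_pods() {
  awk '$3 != "Running" { print $1, $3 }'
}

# Usage on the bootstrap machine (requires kubectl and an active login):
#   kubectl get pods -n vmware-system-nsx --no-headers | not_running_pods
```

The filter treats any status other than Running (including Completed) as worth a look, which is deliberately conservative for a system namespace.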
In my lab environment I have also deployed 6 workload pods inside a namespace called vrni, which all run under the supervisor cluster.
Step 2: Verify cluster and pods visibility in NSX Manager
Log in to your NSX Manager and navigate to Inventory > Containers. Under Clusters you should see a list of the clusters running NCP.
If you switch to Namespaces and search for ‘vrni’, you should also see the pods configured earlier:
Add your Kubernetes/TKGi cluster as data source in vRNI
Log in to your vRNI platform UI, navigate to Settings > Accounts and Data Sources, click on ADD Source, scroll down and under Containers choose TKGi if you have a TKGi cluster (previously known as Enterprise PKS), or Kubernetes if you run any other form of Kubernetes deployment (vSphere, OpenShift, upstream K8s, etc.). In my environment I am running a vSphere Workload Management cluster, so I am choosing Kubernetes as the data source.
As you see above, you will have to specify the address/FQDN of the NSX Manager serving this Kubernetes cluster. In addition, you will need to upload a copy of your Kubernetes cluster configuration file. By default the cluster configuration is located on your bootstrap machine (the machine from which you manage your Kubernetes cluster) under ~/.kube/config, so you need to upload this file to vRNI as well.
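Before uploading, it can help to confirm that the file is a complete kubeconfig. The sketch below is a hypothetical pre-upload check, assuming the file is in the usual YAML form with top-level clusters, contexts, users and current-context keys. It is my own sanity check, not something vRNI documents as a requirement.

```shell
# Hypothetical pre-upload sanity check (an assumption, not a vRNI requirement):
# verifies that a YAML kubeconfig has the top-level keys a client needs.
check_kubeconfig() {
  file="$1"
  [ -r "$file" ] || { echo "cannot read $file"; return 1; }
  for key in clusters contexts users current-context; do
    grep -q "^${key}:" "$file" || { echo "missing key: $key"; return 1; }
  done
  echo "ok"
}

# Usage:
#   check_kubeconfig ~/.kube/config
```

If the kubeconfig holds several contexts, kubectl config view --minify --flatten prints a self-contained copy limited to the current context, which may be a cleaner file to upload.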
Once you have entered the above information and successfully completed the source validation, you should be able to start using vRNI filters and pinboards to gain insights into your TKGi/Kubernetes clusters.
Common vRNI search queries
There are some common and frequently used vRNI search filters that you can run against your Kubernetes clusters to gain insight into what is happening inside the containerised world. In the coming examples I will cover some of those filters.
Once you log in to vRNI you should see the search box.
Let's start by listing all the available namespaces by typing ‘kubernetes namespaces’ and pressing Enter.
Now, let's be more specific and search for a namespace called ‘tanzuvexpert’:
kubernetes namespaces where Name = 'tanzuvexpert'
Click on the name of the namespace and let's explore the components of that namespace in more detail.
Click on the number of Pods
Now click on the first pod, whose name starts with curl-deployment. This opens a new window with more details about that pod, including NSX port name, container image, labels, Kubernetes worker node and cluster:
Now, if you click on Metrics and then choose Network Traffic by Packets, you should see interesting metrics and details regarding Rx, Tx and dropped packets within that pod.
Common searches for kubernetes nodes
In the search bar type the following search syntax:
Kubernetes Node where Kubernetes Cluster = '172.20.10.2'
This search filter returns all the available nodes in cluster 172.20.10.2. It is also useful to list any node that might not be in a “Ready” state, which can be done using the filter below:
kubernetes nodes where Ready != 'True'
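For a cross-check outside vRNI, the same question can be asked of the cluster directly. This is a minimal sketch, assuming the default kubectl get nodes column layout (NAME, STATUS, ...); the helper name is hypothetical and is not part of vRNI or kubectl.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the names of nodes whose STATUS (second field) does not start
# with "Ready". "Ready,SchedulingDisabled" still counts as Ready here.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage (requires kubectl and an active login):
#   kubectl get nodes --no-headers | not_ready_nodes
```

If this list and the vRNI result disagree, the difference is most likely down to how recently vRNI last polled the data source.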
Summary of useful vRNI search filters for kubernetes
(These filters are taken from the VMware vRNI K8s search poster.)
flows where Kubernetes Cluster = 'Production'
kubernetes pods group by Kubernetes Services
kubernetes Pods group by Kubernetes Node
flows where kubernetes service is set
flows where firewall action = 'DROP' group by Kubernetes Service
Kubernetes events where event code = 'ImagePullBackOff'
Kubernetes events where problem entity.Kubernetes Cluster = '172.20.10.2'
Using vRNI to monitor NCP based Kubernetes clusters is very powerful and provides network and security admins with deep insight into containers and their traffic patterns, health status, security planning and much more.
Used in combination with NCP/NSX as the networking stack, vRNI can be positioned as a powerful Kubernetes observability tool.