In this blog post we will be exploring how vRealize Network Insight (vRNI) can be used to monitor Tanzu and Kubernetes clusters running NSX Container Plugin (NCP) as the CNI. NCP is used in Supervisor Clusters deployed by the Workload Management service (WCP) in vSphere and in TKGi, and can also be used in OpenShift and upstream Kubernetes.
Running NCP as your Tanzu or Kubernetes CNI allows customers to leverage the powerful networking features and functions of NSX for their containerised workloads. NSX Manager then becomes your single pane of glass (UI) to manage, deploy and monitor networking and security policies for your Kubernetes clusters.
With NCP, customers can also leverage NSX standard and advanced load balancers as load balancer services for pods utilising NCP as their CNI. Compared to Tanzu/K8s clusters running the Antrea CNI (more about monitoring Antrea clusters using vRNI can be found in my previous blog post HERE), clusters running NCP are fully visible and monitored within vRNI without the need to deploy any additional NetFlow collectors, while for Antrea clusters vRNI can only provide flow analysis with no Kubernetes cluster/pod level knowledge (more about that in the blog post mentioned above).
For software versions I used the following:
For virtual hosts and appliances sizing I used the following specs:
Before we jump right into the configuration steps, you need to ensure that your environment meets the following requirements in order to monitor Kubernetes clusters in vRNI:
In this step we just need to ensure that our NCP system pod is running properly and that our Kubernetes/TKGi cluster is visible in NSX Manager.
From your bootstrap machine (in my case it is an Ubuntu VM running the vSphere plugin for kubectl) ensure that you are logged in to your supervisor cluster and list the available namespaces:
The commands used are:
kubectl vsphere login --server=https://172.20.10.2 --insecure-skip-tls-verify -u administrator@vsphere.local
kubectl config use-context 172.20.10.2
kubectl get ns
The last command lists the available namespaces under the supervisor cluster (172.20.10.2). The highlighted namespace vmware-system-nsx is where the NCP pod is running; this is the pod responsible for connecting the cluster to NSX. You can verify this as shown below.
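A standard kubectl query against that namespace is enough here (the exact pod names in the output will vary per environment):
kubectl get pods -n vmware-system-nsx
The NCP pod should be in a Running state before you continue.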
In my lab environment I have also deployed 6 workload pods inside a namespace called vrni, which all run under the supervisor cluster:
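They can be listed the same way, simply scoped to the vrni namespace (the pods you get back will of course depend on your own deployments):
kubectl get pods -n vrni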
If you need more details about setting up workload management on NSX and creating namespaces and pods, you can reference my previous blog posts HERE and HERE.
Log in to your NSX Manager and navigate to Inventory > Containers; under Clusters you should be able to see a list of the clusters running NCP:
If you switch to Namespaces and search for ‘vrni’, you should also see the pods configured earlier:
Log in to your vRNI platform UI, navigate to Settings > Accounts and Data Sources, click on ADD Source, scroll down and under Containers choose TKGi if you have a TKGi cluster (previously known as Enterprise PKS), or Kubernetes if you run any other form of Kubernetes deployment (vSphere, OpenShift, upstream K8s, etc.). In my environment I am running a vSphere Workload Management cluster, so I am choosing Kubernetes as the data source.
As you see above, you will have to specify the address/FQDN of the NSX Manager serving this Kubernetes cluster. In addition, you will need to upload a copy of your Kubernetes cluster configuration file. The cluster configuration is by default located on your bootstrap machine (the machine from which you manage your Kubernetes cluster) under ~/.kube/config, so you need to upload this file to vRNI as well.
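If your ~/.kube/config holds contexts for several clusters, it can be cleaner to export only the current context rather than uploading the whole file; this uses standard kubectl options (the output file name below is just an example):
kubectl config view --minify --raw > vrni-cluster-config.yaml
Either the full config file or this exported copy can then be uploaded in the vRNI data source form.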
Once you input the above information and successfully complete the source validation, you should be able to start using vRNI filters and pinboards to gain insights into your TKGi/Kubernetes clusters.
There are some common and frequently used vRNI search filters you can run against your Kubernetes clusters to gain insight into what is happening inside the containerised world. In the coming examples I will be covering some of those filters.
Once you log in to vRNI you should see the search box.
Let's start by listing all the available namespaces we have running by typing ‘kubernetes namespaces’ and pressing Enter:
Now, let's be more specific and search for a namespace called ‘tanzuvexpert’:
kubernetes namespaces where Name = 'tanzuvexpert'
Click on the name of the namespace and let's explore the components of that namespace in more detail.
Click on the number of Pods
Now click on the first pod, which starts with curl-deployment. This will open a new window with more details regarding that pod, including NSX port name, container image, labels, Kubernetes worker node and cluster:
Now, if you click on Metrics and then choose Network Traffic by Packets, you should see interesting metrics and details regarding Rx, Tx and dropped packets within that pod.
In the search bar type the following search syntax:
Kubernetes Node where Kubernetes Cluster = '172.20.10.2'
This search filter will return all the available nodes in cluster 172.20.10.2. It is also interesting to list any node that might not be in a “Ready” state; this can be done by using the below filter:
kubernetes nodes where Ready != 'True'
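As a quick sanity check outside of vRNI, the same readiness information is available straight from kubectl (a standard command, shown here only for comparison):
kubectl get nodes
Any node not reporting Ready in the STATUS column should also be returned by the vRNI filter above.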
(The Ready-state filter above is part of the VMware vRNI K8s search poster.) Some more useful search filters include:
flows where Kubernetes Cluster = 'Production'
kubernetes pods group by Kubernetes Services
kubernetes Pods group by Kubernetes Node
flows where kubernetes service is set
flows where firewall action = 'DROP' group by Kubernetes Service
Kubernetes events where event code = 'ImagePullBackOff'
Kubernetes events where problem entity.Kubernetes Cluster = '172.20.10.2'
Using vRNI to monitor NCP-based Kubernetes clusters is very powerful and provides network and security admins with deep insight into containers and their traffic patterns, health status, security planning and much more.
Leveraging such a powerful insight tool allows vRNI, in combination with NCP/NSX as the networking stack, to be positioned as a compelling Kubernetes observability solution.