Monitoring NCP based TKGi & K8s using vRNI

In this blog post we will explore how vRealize Network Insight (vRNI) can be used to monitor Tanzu and Kubernetes clusters running the NSX Container Plugin (NCP) as their CNI. NCP is used in Supervisor Clusters deployed by Workload Management (WCP) in vSphere and in TKGi, and it can also be used in OpenShift and upstream Kubernetes.

Running NCP as your Tanzu or Kubernetes CNI lets you leverage the networking features of NSX in your containerised workloads. NSX Manager then becomes your single pane of glass (UI) to deploy, manage and monitor the networking and security policies of your Kubernetes clusters.

With NCP, customers can also use the NSX standard and advanced load balancers to provide load balancer services for pods. Clusters running NCP are fully visible and monitored within vRNI without the need to deploy any additional NetFlow collectors. For Tanzu/K8s clusters running the Antrea CNI, by contrast, vRNI can only provide flow analysis with no K8s cluster or pod level knowledge (monitoring Antrea clusters with vRNI is covered in more detail in my previous blog post).

Lab Inventory

For software versions I used the following:

  • VMware ESXi 7.0.2.17867351
  • vCenter Server 7.0 U3 with Workload Management enabled using NSX networking.
  • NSX-T 3.2.0.1
  • vRealize Network Insight 6.6 platform and collector appliances.
  • TrueNAS 12.0-U7 used to provision NFS datastores to ESXi hosts.
  • VyOS 1.4 used as lab backbone router.
  • Ubuntu 20.04 LTS as Linux jumpbox.
  • Ubuntu 20.04.2 LTS as DNS and internet gateway.
  • Windows Server 2012 R2 Datacenter as management host for UI access.

For virtual host and appliance sizing I used the following specs:

  • 3 x virtualised ESXi hosts each with 8 vCPUs, 4 x NICs and 32 GB RAM.
  • vCenter Server appliance with 2 vCPUs and 24 GB RAM.
  • NSX-T Manager medium appliance
  • vRealize Network Insight medium Platform and Collector appliances.
  • 2 x medium edges with no resource reservations.

Before we start

Before we jump into the configuration steps, make sure that your environment meets the following requirements for monitoring Kubernetes clusters in vRNI:

  • Install the latest version of the vRNI platform and collector appliances (one collector appliance is enough).
  • Have a Tanzu or K8s cluster running NCP as its Container Network Interface (CNI).
    • In my lab environment I used the supervisor cluster deployed after enabling Workload Management with NSX networking (covered in a previous post).
  • You need a vRNI Enterprise license in order to add Kubernetes/TKGi clusters as data sources.
  • You also have to add the NSX Manager instance managing your workload networking as a data source in vRNI.

Preparing your Kubernetes cluster to be added to vRNI as data source

In this step we just need to ensure that the NCP system pod is running properly and that our Kubernetes/TKGi cluster is visible in NSX Manager.

Step 1: Verify workload management deployment

From your bootstrap machine (in my case an Ubuntu VM running the CLI plugin for vSphere), ensure that you are logged in to your supervisor cluster and list the available namespaces.

The commands used are:

kubectl vsphere login --server=https://172.20.10.2 --insecure-skip-tls-verify -u administrator@vsphere.local

kubectl config use-context 172.20.10.2

kubectl get ns

The last command lists the available namespaces under the supervisor cluster (172.20.10.2). The namespace vmware-system-nsx is where the NCP pod is running; this is the pod responsible for connecting the cluster to NSX. You can verify this by listing the pods in that namespace.
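The exact command was shown as a screenshot in the original post. A minimal sketch of an equivalent check, assuming the NCP pod lives in vmware-system-nsx as described above; the helper name ncp_pods_running is hypothetical and only the standard kubectl get pods call is used:

```shell
# Hypothetical helper: succeeds if at least one pod in the given namespace
# reports STATUS "Running". Defaults to the NCP namespace from this post.
ncp_pods_running() {
  local ns="${1:-vmware-system-nsx}"
  # --no-headers output columns: NAME READY STATUS RESTARTS AGE
  kubectl get pods -n "$ns" --no-headers \
    | awk '$3 == "Running" { found = 1 } END { exit !found }'
}

# Usage (needs an active supervisor cluster login):
#   ncp_pods_running && echo "NCP pod is running"
#   ncp_pods_running vrni   # same check for a workload namespace
```

The function only looks at the STATUS column, so a pod that is Running but not fully ready would still pass; use kubectl describe pod for a closer look.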

In my lab environment I have also deployed 6 workload pods inside a namespace called vrni, all running under the supervisor cluster.

If you need more details about setting up Workload Management on NSX and creating namespaces and pods, you can refer to my previous blog posts on those topics.

Step 2: Verify cluster and pods visibility in NSX Manager

Log in to your NSX Manager and navigate to Inventory > Containers. Under Clusters you should see a list of the clusters running NCP.

If you switch to Namespaces and search for ‘vrni’, you should also see the pods configured earlier.

Add your Kubernetes/TKGi cluster as data source in vRNI

Log in to your vRNI platform UI, navigate to Settings > Accounts and Data Sources and click on ADD Source. Scroll down and, under Containers, choose TKGi if you have a TKGi cluster (previously known as Enterprise PKS), or Kubernetes if you run any other form of Kubernetes deployment (vSphere, OpenShift, upstream K8s etc.). In my environment I am running a vSphere cluster with Workload Management, so I am choosing Kubernetes as the data source.

In the data source form you have to specify the address/FQDN of the NSX Manager serving this Kubernetes cluster. In addition, you need to upload a copy of your Kubernetes cluster configuration file. By default this file is located on your bootstrap machine (the machine from which you manage your Kubernetes cluster) under ~/.kube/config.
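If ~/.kube/config holds several contexts or points at certificate files on disk, one option is to export a single self-contained file for the upload. This is a minimal sketch rather than a required step: the helper name export_kubeconfig and the default output file name are hypothetical, and the context name 172.20.10.2 is the one used earlier in this post. It relies only on kubectl config view with its --minify and --flatten options.

```shell
# Hypothetical helper: write a single-context, self-contained kubeconfig.
#   --minify   keeps only the entries used by the selected context
#   --flatten  inlines referenced certificate files so the file is portable
export_kubeconfig() {
  local context="$1"
  local out="${2:-vrni-kubeconfig.yaml}"   # output file name is an assumption
  kubectl config view --minify --flatten --context="$context" > "$out"
}

# Usage:
#   export_kubeconfig 172.20.10.2
```

The resulting file contains credentials, so treat it like the original ~/.kube/config and remove it once the data source has been validated.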

Once you have entered the above information and the source validation completes successfully, you should be able to start using vRNI filters and pinboards to gain insights into your TKGi/Kubernetes clusters.

Common vRNI search queries

Below are some commonly used vRNI search filters that you can run against your Kubernetes clusters to gain insight into what is happening inside the containerised world. The following examples cover some of them.

Once you log in to vRNI you should see the search box.

Let's start by listing all the available namespaces by typing ‘kubernetes namespaces’ and pressing Enter.

Now, let's be more specific and search for a namespace called ‘tanzuvexpert’:

kubernetes namespaces where Name = 'tanzuvexpert'

Click on the name of the namespace to explore its components.

Click on the number of Pods.

Now click on the first pod, whose name starts with curl-deployment. This opens a new window with more details about that pod, including the NSX port name, container image, labels, Kubernetes worker node and cluster.

Now, if you click on Metrics and then choose Network Traffic by Packets, you should see metrics and details for Rx, Tx and dropped packets for that pod.

Common searches for kubernetes nodes

In the search bar type the following search syntax:

Kubernetes Node where Kubernetes Cluster = '172.20.10.2'

This search filter returns all the available nodes in cluster 172.20.10.2. It is also useful to list any node that is not in a “Ready” state, which can be done with the filter below:

kubernetes nodes where Ready != 'True'
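To cross-check the vRNI result from the bootstrap machine, the same question can be asked with kubectl. This is a minimal sketch, assuming the default kubectl get nodes column layout where the second column is STATUS; the helper name nodes_not_ready is hypothetical.

```shell
# Hypothetical helper: print the names of nodes whose STATUS does not start
# with "Ready" (for example NotReady or Unknown).
nodes_not_ready() {
  # --no-headers output columns: NAME STATUS ROLES AGE VERSION
  kubectl get nodes --no-headers | awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage (needs an active cluster login):
#   nodes_not_ready
```

A cordoned node shows up as Ready,SchedulingDisabled and is treated as ready here, since its kubelet is still reporting a Ready condition.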

Summary of useful vRNI search filters for kubernetes

(these are taken from the VMware vRNI K8s search poster)

flows where Kubernetes Cluster = 'Production'
kubernetes pods group by Kubernetes Services
kubernetes Pods group by Kubernetes Node
flows where kubernetes service is set
flows where firewall action = 'DROP' group by Kubernetes Service
Kubernetes events where event code = 'ImagePullBackOff'
Kubernetes events where problem entity.Kubernetes Cluster = '172.20.10.2'

Final word

Using vRNI to monitor NCP-based Kubernetes clusters gives network and security admins deep insight into containers, their traffic patterns and health status, and helps with security planning and much more.

Combined with NCP/NSX as the networking stack, vRNI can be positioned as a powerful Kubernetes observability tool.

Bassem Rezkalla