In part two of my blog series covering Kubernetes/Tanzu as a service using Cloud Director and CSE 4.0, I will continue the deployment workflow started in part one. The workflows covered in part two include NSX ALB integration with Cloud Director and, finally, deploying a Tanzu cluster inside tenant Pindakaas, which we created in part one.
For software versions I used the following:
For virtual hosts and appliances sizing I used the following specs:
Review part one of this blog series for the list of prerequisites needed before you can continue with the steps covered in parts one and two.
A very high-level view of the Tanzu as a service offering with Cloud Director and CSE 4.0 is summarised in the diagram below. The idea is to deploy the CSE 4.0 vApp, which interacts with the Cloud Director APIs to provide the capability to provision Tanzu guest clusters (TKGm) per tenant. The Cloud Director provider supplies the underlying networking and load balancing requirements, either dedicated per tenant or shared across tenants, while every tenant is responsible for its own TKGm cluster deployments. Those TKGm clusters are bootstrapped by means of an ephemeral (temporary) VM which is created by the CSE extension and deleted automatically once the Tanzu cluster is up and running; there is no need to deploy a Tanzu management cluster per tenant. The Container Service Extension server is deployed in its own provider-managed Org and only needs connectivity to Cloud Director (no internet connectivity required). A detailed reference architecture, which I used in my home lab and will reference throughout this blog post, can be seen below:
A couple of remarks regarding the above architectural design:
The deployment workflow is broken down into four major steps:
Cloud Director version 10.2 introduced native integration with NSX ALB (Avi) to offer load balancing capability across tenant workloads. The integration requires Cloud Director networking to be backed by NSX-T and NSX-T to be added as a cloud provider in NSX ALB. Service providers that want to offer Tanzu or native Kubernetes as a service using Cloud Director and CSE need to utilise NSX ALB (Avi) as the load balancer for their Tanzu/Kubernetes clusters.
At a high level, as per the above architecture diagram, the following configuration needs to be completed in NSX-T before NSX can be added as a cloud provider in NSX ALB:
From my NSX-T manager, the network topology for service engine management network connectivity is shown below:
If you do not already have an NSX-T cloud instance integrated with NSX ALB, we need to add one (the same NSX-T instance we use with our Cloud Director instance). Before we can add an NSX-T cloud, we need to create two user-defined credentials in NSX ALB: one to authenticate with NSX and another to authenticate to the vCenter instance that NSX ALB will use to deploy service engines. To create custom user credentials, navigate in the NSX ALB UI to Administration and, under User Credentials, click CREATE to create the two user credentials described above.
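If you prefer to script this step, the same credential objects can also be created through the Avi REST API. The sketch below is assumption-heavy: the controller address, API version header, usernames and passwords are placeholders, and the cloudconnectoruser object and field names should be verified against your controller's API documentation before use.

# Hypothetical sketch: create NSX-T and vCenter credential objects in NSX ALB via its REST API
AVI="https://avi-controller.lab.local"    # placeholder controller address

# NSX-T user credentials (field names assumed from the Avi cloudconnectoruser schema)
curl -k -X POST "$AVI/api/cloudconnectoruser" \
  -H "Content-Type: application/json" -H "X-Avi-Version: 22.1.1" \
  --user admin:'<avi-admin-password>' \
  -d '{"name": "nsxt-user", "nsxt_credentials": {"username": "admin", "password": "<nsxt-password>"}}'

# vCenter user credentials
curl -k -X POST "$AVI/api/cloudconnectoruser" \
  -H "Content-Type: application/json" -H "X-Avi-Version: 22.1.1" \
  --user admin:'<avi-admin-password>' \
  -d '{"name": "vcenter-user", "vcenter_credentials": {"username": "administrator@vsphere.local", "password": "<vc-password>"}}'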
Repeat the same for the vCenter user; when done you should see a screen similar to the one below:
Now we need to add our NSX-T instance as an NSX-T cloud in NSX ALB. To do that, navigate to Infrastructure > Clouds, click CREATE and choose NSX-T Cloud.
Fill in your NSX-T and vCenter information and use the credentials we created for both instances. Important note: make sure to add NSX-T by the same IP address or FQDN you previously used when adding it to Cloud Director. If your NSX-T instance was added to Cloud Director by its FQDN, then you have to add it by FQDN in NSX ALB as well; otherwise Cloud Director will not be able to pull NSX-T related information from NSX ALB at a later step.
In my setup I also checked the DHCP checkbox, since I configured the NSX-T segment to which the Avi service engines will be attached to use DHCP.
We then need to specify which NSX T1 gateway and L2 segment to use for service engine management interface connectivity; both the T1 and the segment have been pre-configured in NSX-T. The data network T1 and overlay segment I used for the service engines are "dummy" objects, i.e. they will not actually be used. This is needed because we cannot save the NSX-T Cloud addition workflow without specifying a T1 and segment for the SE data network. In the integration with Cloud Director, however, the layer 2 data segment (192.168.255.0/25, as shown in the reference architecture) will be created automatically by NSX ALB when we enable the load balancer service on our Pindakaas tenant. So this is a "workaround" to be able to finalise the addition of NSX-T as a cloud in Avi, and as mentioned it will be corrected later by Cloud Director.
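If you would rather pre-create the dummy T1 and overlay segment with the NSX-T Policy API instead of the UI, a rough sketch is shown below. The object names, transport zone path, subnet and credentials are placeholders I made up for illustration, so adapt them to your environment.

# Hypothetical sketch: create a placeholder T1 gateway and overlay segment to satisfy
# the SE data network fields in the NSX-T Cloud wizard
NSX="https://nsx-manager.lab.local"    # placeholder NSX manager address

# Dummy T1 gateway (no T0 uplink is needed for this purpose)
curl -k -u admin:'<nsx-password>' -X PATCH "$NSX/policy/api/v1/infra/tier-1s/dummy-t1" \
  -H "Content-Type: application/json" \
  -d '{"display_name": "dummy-t1"}'

# Dummy overlay segment attached to the dummy T1 (transport zone path is a placeholder)
curl -k -u admin:'<nsx-password>' -X PATCH "$NSX/policy/api/v1/infra/segments/dummy-se-data" \
  -H "Content-Type: application/json" \
  -d '{"display_name": "dummy-se-data", "connectivity_path": "/infra/tier-1s/dummy-t1", "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/<overlay-tz-id>", "subnets": [{"gateway_address": "10.99.99.1/24"}]}'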
In the next section add your vCenter instance and then click SAVE.
NSX ALB will take a couple of seconds to add and sync information with the NSX-T cloud; when done, you should see the status circle turn green.
As a last step, I created a service engine group named Pindakaas-SE-Group under my newly created NSX-T cloud (but you can use the default SE group).
Log in to the Cloud Director provider portal, click Resources > Infrastructure Resources, and under NSX-ALB click Controllers and then ADD.
Fill in your ALB controller information (make sure that you have an Enterprise license) and then click SAVE.
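If you want to double-check the registration from the API side, the controller should show up under the load balancer endpoints of the Cloud Director CloudAPI. The sketch below assumes provider administrator credentials and an API version supported by your build; the URL and version are placeholders.

# Hypothetical sketch: obtain a provider session token and list registered NSX ALB controllers
VCD="https://vcd.lab.local"    # placeholder Cloud Director address

# Authenticate as a provider administrator (the access token is returned in a response header)
TOKEN=$(curl -sk -X POST "$VCD/cloudapi/1.0.0/sessions/provider" \
  -H "Accept: application/json;version=37.0" \
  --user 'administrator@system:<password>' \
  -D - -o /dev/null | awk -F': ' 'tolower($1)=="x-vmware-vcloud-access-token" {print $2}' | tr -d '\r')

# List the NSX ALB controllers known to Cloud Director
curl -sk "$VCD/cloudapi/1.0.0/loadBalancer/controllers" \
  -H "Accept: application/json;version=37.0" \
  -H "Authorization: Bearer $TOKEN"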
Note: you might need to generate a new self-signed SSL certificate for your Avi controller (if you are not using CA-signed SSL certs) if Cloud Director fails to add your controller. For how to generate a new self-signed Avi controller certificate, you can review one of my previous blog posts HERE.
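For reference, a self-signed certificate with proper Subject Alternative Name entries can also be generated with plain openssl and then imported into the Avi controller. This is a generic sketch (the hostname and IP are placeholders, and OpenSSL 1.1.1+ is assumed for the -addext flag), not the exact procedure from the linked post.

# Generic sketch: generate a self-signed certificate with SAN entries for the Avi controller
openssl req -x509 -newkey rsa:4096 -sha256 -days 365 -nodes \
  -keyout avi-controller.key -out avi-controller.crt \
  -subj "/CN=avi-controller.lab.local" \
  -addext "subjectAltName=DNS:avi-controller.lab.local,IP:192.168.10.10"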
Once our NSX ALB controller is added, we also need to add our NSX-T instance to Cloud Director. From the same window, click NSX-T Clouds and then ADD.
You should see your NSX-T cloud added
Next we need to add the Service Engine group which we defined under the NSX ALB NSX-T cloud. Click Service Engine Groups and then click ADD.
Then navigate to Cloud Resources > Edge Gateways and click Pindakaas-EDGE. From the left pane, under Load Balancer, click General Settings, move the slider to activate the load balancer state, and click SAVE.
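To verify the load balancer state from the API as well, the edge gateway's load balancer configuration can be read through the CloudAPI. This is a rough sketch that reuses the $VCD and $TOKEN variables from the earlier controller-listing sketch and assumes a loadBalancer sub-resource on the edge gateway endpoint; the gateway URN is a placeholder.

# Hypothetical sketch: look up the Pindakaas edge gateway, then read its load balancer settings
curl -sk "$VCD/cloudapi/1.0.0/edgeGateways?filter=name==Pindakaas-EDGE" \
  -H "Accept: application/json;version=37.0" -H "Authorization: Bearer $TOKEN"

# Using the returned gateway id (urn:vcloud:gateway:<uuid>), check that the load balancer is enabled
curl -sk "$VCD/cloudapi/1.0.0/edgeGateways/urn:vcloud:gateway:<uuid>/loadBalancer" \
  -H "Accept: application/json;version=37.0" -H "Authorization: Bearer $TOKEN"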
Next I need to add a service engine group from the newly added NSX ALB to Cloud Director and assign it to the edge gateway serving my tenant Pindakaas. Click Service Engine Groups and then click ADD; this opens a window where we need to select the name of our newly added NSX ALB controller and our Pindakaas-SE-Group created in the previous step:
You should see a screen similar to the one below:
If you navigate to NSX-T, you should be able to see a system-created segment with subnet 192.168.255.0/25 attached to the Pindakaas Edge gateway T1 router.
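You can also confirm this from the NSX-T Policy API by listing the segments and filtering for the auto-created one; a minimal sketch (manager address and credentials are placeholders, and the jq utility is assumed to be installed):

# Hypothetical sketch: list NSX segments and print the one carrying the 192.168.255.0/25 SE data network
curl -sk -u admin:'<nsx-password>' "https://nsx-manager.lab.local/policy/api/v1/infra/segments" \
  | jq '.results[] | select(.subnets[]?.gateway_address | startswith("192.168.255.")) | .display_name'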
If you now navigate back to your NSX ALB UI and check the NSX-T Cloud settings, you should see that NSX ALB has added the correct Edge Gateway and segment for service engine placement. Make sure to remove the dummy T1 and segment we used earlier.
With this we have prepared our Cloud Director instance with NSX ALB and NSX-T cloud integration, and enabled load balancer capability on our tenant Pindakaas Edge gateway. We are now all set to deploy our Tanzu TKGm cluster inside tenant Pindakaas and conclude the last step in the deployment workflow.
To allow organisation (tenant) users to create Tanzu clusters within a tenant VDC, we need to create an Org user and assign the below roles to it:
Also make sure to publish the following Rights Bundles to the organisations under which Tanzu/Kubernetes clusters can be created:
Make sure the right Preserve All ExtraConfig Elements During OVF Import and Export is added to both the Org Admin role and the rights bundle published to the tenants.
In my setup, I created a user called nsxbaas which is an organisation administrator for tenant Pindakaas with Kubernetes rights, and I will use nsxbaas to create my TKGm cluster. For more details about how to create an Org user with the appropriate Tanzu rights and roles, you can refer to one of my previous blog posts HERE.
Log in to your tenant portal using the above user and navigate to More > Kubernetes Container Clusters.
Click NEW and you should see VMware Tanzu Kubernetes Grid appear as an available Kubernetes provider.
Set the cluster name and pick a TKG template to use for your cluster (these are the OVA images we uploaded earlier to our shared catalog).
Choose which vApp network to connect the Tanzu nodes to; remember, this network needs to have internet connectivity.
Set the number of control plane nodes and then click NEXT.
Set the number of Worker nodes and create any additional worker pools if needed.
Create a default storage class to be used for your Tanzu clusters inside that tenant and then click NEXT
Set your pods CIDR (this will be used by the Antrea CNI to assign IPs to pods) and define your services CIDR (used for Kubernetes cluster service IPs, i.e. ClusterIP addresses).
If Auto Repair is disabled, the cluster creation process will not try to resolve any errors it encounters, giving you a chance to troubleshoot the issue manually; if Auto Repair is enabled, the Tanzu creation workflow will restart cluster creation whenever errors are encountered and try to resolve them.
Review the workflow parameters and, if all looks good, click FINISH.
The Tanzu cluster provisioning workflow is a lengthy process, as it includes the following steps:
In my setup this took somewhere between 30 and 45 minutes, so be patient. After a successful deployment you should see your cluster listed with status Available.
From vCenter you can verify that the control plane and worker node VMs are successfully deployed and running, along with the Avi service engines:
To log in to our TKG cluster, navigate back to the tenant portal and click More > Kubernetes Container Clusters, then click our cluster pindakaas-tkgcluster01 and then DOWNLOAD KUBE CONFIG.
This will download the kubeconfig file for our cluster. Copy it to a machine with kubectl installed and run the following commands to verify the cluster status and start deploying workloads.
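A minimal sketch of that verification step is shown below; the kubeconfig file name is a placeholder for whatever the tenant portal downloaded for your cluster.

# Point kubectl at the downloaded kubeconfig (placeholder file name)
export KUBECONFIG=./kubeconfig-pindakaas-tkgcluster01.txt

# Nodes should report Ready once the cluster is fully up
kubectl get nodes -o wide

# System pods (Antrea, kube-system, etc.) should be in Running state
kubectl get pods -A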
With this we have come to the end of this long (but hopefully useful) blog post series. Follow my blog or my social media channels to stay updated on my work.
Cheers,