Upgrading SSC (SaltStack Config) to 8.7

In this post I will go over upgrading my 8.x SSC appliance to 8.7. As a prerequisite, vRSLCM (vRealize Lifecycle Manager) needs to be upgraded to 8.7; instructions can be found here. The upgrade already includes the latest Product Support Pack, so a separate Product Support Pack update is not required.

To get started we can go to vRealize Lifecycle Manager -> Lifecycle Operations -> Settings -> Binary Mapping. (If you haven’t added your My VMware credentials you will need to do that first by going to vRealize Lifecycle Manager -> Lifecycle Operations -> Settings -> My VMware)

Click on Add Binaries under Product Binaries

Select My VMware and click on Discover

We can see a list of binaries that have been discovered. We can select what we need and click on Add

This will create a request and start downloading the package. To view the progress we can click on the Click Here hyperlink

Click on the In Progress button to view the details

We now have to wait for the download to complete

After the download is complete we can go to Environments -> View Details on the environment that includes SSC

Click on Upgrade

An inventory sync is recommended if the environment has changed since LCM performed the last sync. We can trigger the sync from the UI, or click on Proceed to continue

Select product Version 8.7.0 and click Next. We can also review the compatibility matrix to make sure the environment is compatible.

We can automatically create and delete a snapshot as part of the upgrade process

Run the Precheck to make sure there are no errors

Once the check is complete, click on Next. Review the upgrade details and click on Next. We are taken to the progress screen where we can follow the progress.

The system will get rebooted, and once it's back up we will be on 8.7

Upgrading vRSLCM (vRealize Lifecycle Manager) to 8.7

In this guide I will go over the steps of upgrading an existing 8.x vRSLCM appliance to the latest 8.7 release. The release notes can be found here

The first step is to log in to vRealize Suite Lifecycle Manager under the Lifecycle Operations section

Go to Settings -> System Upgrade

Click on Check for Upgrade

We can see that the check found a new version available for 8.7

Click on Upgrade

Verify that a snapshot or backup exists in case the process fails. Check the box for "I took a snapshot of the vRealize Suite Lifecycle Manager before I performed this operation" and click Next

Click on Run Precheck

Verify that all checks have passed and click on Upgrade

This will kick off the upgrade process and start upgrading packages. The system will automatically reboot into 8.7 once completed. We can check the version by going to Settings -> System Details

If you get the below error clear the browser cache and try again

Change Delete old snapshot restriction from 7 days in Automation Central

I recently ran into an issue where I wanted to automatically delete snapshots after 5 days instead of the default 7 days that Automation Central uses out of the box. In other words, I wanted to change the delete old snapshot restriction from 7 days to 5 days.

It seems like the restriction comes from the Reclaim Settings. In order to change it we can go to Optimize -> Reclaim -> Settings

Change the Snapshots "older than" setting to 5 days

Going to Automation Central, I was able to confirm that the minimum is now 5 days

vRA 8 API getting started

I wanted to keep track of what needs to be done before actually being able to query the API on vRA 8, since I've had a hard time finding the documentation I needed in the past. If you are looking for the Cloud version, it can be found here

The first step is to get an API token for the specific username. We can do this using curl or Postman. The call would look similar to this:

curl --location -k --request POST 'https://vra_url/csp/gateway/am/api/login?access_token' --header 'Content-Type: application/json' --data-raw '{
"username": "username",
"password": "password",
"domain": "domain name or system domain"
}'

The command will output a refresh_token. Example:

{"refresh_token":"DJzTxR..."}

We will use this token to generate a bearer token

curl --location -k --request POST 'https://vra_url/iaas/api/login' --header 'Content-Type: application/json' --data-raw '{
        "refreshToken": "DJzTxR..."
}'

This will output a bearer token that can be used to run additional API calls. Example:

{"tokenType":"Bearer","token":"eyJ0eXAiO..."}

Full API guide can be found at https://vra_url/automation-ui/api-docs/ or https://developer.vmware.com/api/

vROPS Cloud Proxy docker routing

I recently ran into a problem where I needed to route part of the subnet that is used for Docker networking on the vROPS Cloud Proxy appliance. The network Docker uses is 172.17.0.0/16, so we will route a portion of that subnet through my ethernet interface instead

To test the configuration without adding a persistent route, I used the following command:

ip route add 172.17.11.0/24 via 172.16.11.1 dev eth0 
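
A quick way to confirm the kernel will actually use the new route is ip route get against a sample destination in that range (172.17.11.10 below is just an example address):

ip route get 172.17.11.10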

Verify that everything is running properly; since the route is not persistent, it will be reverted on the next reboot if anything went wrong. If the change is successful, we can make the route persistent by editing the network configuration using vi:

vi /etc/systemd/network/10-eth0.network

Next, add a Route section similar to the following:

[Route]
Destination=172.17.11.0/24
Gateway=172.16.11.1
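
Since the 10-eth0.network file is managed by systemd-networkd, restarting that service should apply the persistent route without a full reboot. This is a best-effort sketch; note that restarting networkd briefly reapplies the network configuration, so expect a short interruption:

systemctl restart systemd-networkd
ip route show | grep 172.17.11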

Please note that the above is not official guidance from VMware. If you need support please reach out to technical support.

vRSLCM (vRealize Lifecycle Manager) Product Support Pack

In this guide I will go over the steps of getting an existing 8.x vRSLCM appliance to support the latest product releases available. Here is a great blog that goes into the details of what the Product Support Pack is: https://blogs.vmware.com/management/2019/01/vrslcm-pspak.html. Typically the newer Product Support Pack is included as part of the LCM upgrade; however, sometimes there are product releases in between LCM releases, and that is where Product Support Packs come in handy.

The first step is to log in to vRealize Suite Lifecycle Manager under the Lifecycle Operations section

Go to Settings -> Product Support Pack

We can see that I recently upgraded to 8.6, but a new update, 8.6.0.1, is available. Based on what we can see in the details, the new support pack adds support for vRA 8.6.1. If an update is not showing, click on the Check Support Packs Online button and refresh the screen after a few minutes

Click on Apply Version

Verify that a snapshot or a backup exists and click Submit

We can view the progress by clicking on the Click Here link after submitting the request

Once the process is complete the system will most likely reboot. To check the status we can go back to Settings -> Product Support Pack. As we can see, we are now at the updated patch level

If you get the below error clear the browser cache and try again

Deploying the vROPS cloud proxy

In this guide we will go over deploying the vROPS cloud proxy for both the cloud and on-premises versions. The official VMware documentation can be found here

To get started, log in to your vROPS instance. If it's the cloud version, the URL will be something similar to https://www.mgmt.cloud.vmware.com/vrops-cloud/ui/index.action. For on-premises it would be https://vrops_url/ui/login.action

Once you are logged in, we can go to Data Sources -> Cloud Proxies and click on New. Example:

Here we can see the option to download the cloud proxy OVA as well as a copy button. We can also see the OTK (one-time key); keep a note of this as we will need it during the deployment. The first step is to get the proxy deployed, either by downloading the OVA or by copying the URL. In my case I will copy the URL.

For reference you should be at this screen:

Click on the clipboard icon to copy the path to the ova

Next we will go to our vCenter to get it deployed. Go to one of the hosts or clusters, then go to Actions -> Deploy OVF Template… Example:

If you are deploying the cloud proxy for vROPS cloud the URL will look similar to this:

If its for on premise it would look similar to this:

Click on Next and accept the certificate thumbprint

Select a name and location where the deployment should go and click on Next

Select a compute resource and click on Next

Review the details of the deployment and click on Next

Accept the licensing agreement and click on Next

Select a size and click Next. If the environment is larger than 8k VMs you will want to deploy the Standard size. The sizing guide can be found here

Select a storage device and click on Next

Select a network and click on Next

Here is where we add the OTK from earlier. Paste in the OTK, then give the VM a friendly name (this name is what will be displayed in the vROPS Cloud Proxies page). The network proxy settings are only applicable if you need to use a proxy to get out to the internet. The rest of the fields should be pretty self-explanatory

Verify everything is correct and click on Finish

Once the deployment is complete, power on the machine. We then need to wait a few minutes before it appears under Cloud Proxies; it took about 20 minutes in my environment before I was able to see it in the vROPS cloud console

We can check the console while we wait for everything to be provisioned

Once the deployment is complete the console would look similar to this:

We can also see the proxy coming online in the vROPS cloud proxies menu

Once complete the proxy will show as online

The full vROPS cloud documentation can be found here

The full vROPS on premise documentation can be found here

You can request a trial from here

Workaround instructions to address CVE-2021-44228 and CVE-2021-45046 in vRealize Operations 7.x

In this article I will go over one of the workaround instructions to address CVE-2021-44228 and CVE-2021-45046 in vRealize Operations 7.x. I have tested the workaround on vROPS 7.5, as it's still shipped with VCF 3.x and I haven't yet seen documentation on a workaround for this version. If you are looking for instructions for version 8.x, consult KB article 87076. This was tested on December 21, 2021. Please check the official documentation or open a ticket for production usage.

Create a snapshot of the vROPS components to make sure we have something to revert to in case anything were to go wrong.

Log in to the vROPS instance admin UI, typically https://ip_address/admin, and take the cluster offline. This applies to all nodes, including but not limited to Analytics, Primary, Replica, Data, Remote Collector and Witness nodes.

Give a reason and press OK

Verify the cluster is offline before continuing

Log in via SSH and change to a temporary path, e.g. /tmp. Because vROPS 7.5 doesn't come with the newer OpenSSL modules, we need to find other means of getting the files to the server without using a direct download method like wget.

In my case I put the code below in a file called vrops-log4j-fix.sh in my /tmp directory

#!/bin/bash
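# Finds every .jar under /usr/lib that contains the vulnerable JndiLookup class,
# removes that class from the archive and restores the original file ownership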

file=/tmp/impacted_jars.txt

echo "Searching for impacted .jar files. Please wait..."

find /usr/lib -type f -name "*.jar" -exec sh -c "zipinfo -1 '{}' | grep "org/apache/logging/log4j/core/lookup/JndiLookup.class" && echo {}" \; | grep "/usr/lib" > $file

line_qty=$(wc -l < $file)

if [ $line_qty -ne 0 ]; then
    echo "Found $line_qty impacted .jar files"
else
    echo "No impacted .jar files found"
    exit 0
fi

echo "Starting to patch impacted .jar files"

while IFS= read -r line;
do     
    echo "patching -> $line"

    own_user=$(stat -c '%U' "$line")
    own_group=$(stat -c '%G' "$line")

    zip -q -d "$line" org/apache/logging/log4j/core/lookup/JndiLookup.class
    if [ $? -ne 0 ]; then echo "ERROR: Failed to patch $line"; fi

    chown $own_user:$own_group "$line"

done < $file

rm -f $file

Make the file executable by running chmod +x vrops-log4j-fix.sh

Then run the script with ./vrops-log4j-fix.sh
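
Put together, assuming the script was saved under /tmp:

cd /tmp
chmod +x vrops-log4j-fix.sh
./vrops-log4j-fix.sh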

The system will go through and find impacted .jar files and try to patch them. If successful we should end up with something like this

Next we will do the same with the cp-log4j-fix.sh file

#!/bin/bash

#set -x
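# Appends -Dlog4j2.formatMsgNoLookups=true to the collector wrapper.conf and to the
# CaSA setenv.sh so those JVMs start with log4j message lookups disabled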


FAILURE="0"
WRAPPER_FILES="/usr/lib/vmware-vcops/user/conf/collector/wrapper.conf"
for f in $WRAPPER_FILES
    do
        last_idx=""
        if [[ -f $f ]]; then
            echo "********************************"
            echo "Updating file: $f"
            let last_idx=$(egrep "^wrapper.java.additional.[[:digit:]]+=" $f | cut -d= -f1 | awk -F '.' '{print $4}' | sort -n | tail -1)
            if [[ -z $last_idx ]]; then
                echo -e "ERROR: Failed to get JVM additional index"
                let FAILURE="1"
                continue
            fi
            ((last_idx++))
            echo -e "\n#Fixing Apache Log4j2 Remote Code Execution Vulnerability\nwrapper.java.additional.$last_idx=-Dlog4j2.formatMsgNoLookups=true" >> $f
            if [[ $? != 0 ]]; then
                echo -e "ERROR: Failed to update file: $f\n"
                let FAILURE="1"
            else 
                echo -e "Sucessfully updated file: $f\n"
            fi
        else
            echo -e "ERROR: file is not found: $f\n"
            let FAILURE="1"
        fi
    done


CASA_JVM="/usr/lib/vmware-casa/casa-webapp/bin/setenv.sh"
echo "********************************"
echo "Updating file: $CASA_JVM"
echo 'JAVA_OPTS="$JAVA_OPTS -Dlog4j2.formatMsgNoLookups=true"' >> $CASA_JVM
if [[ $? != 0 ]]; then
    echo -e "ERROR: Failed to update file: $CASA_JVM\n"
    let FAILURE="1"
else
    echo -e "Sucessfully updated file: $CASA_JVM\n"
fi

if [[ "X$FAILURE" == "X1" ]]; then
    exit 1
fi

exit 0

And lastly, the data-rc-witness-log4j-fix.sh file

#!/bin/bash

#set -x
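# Appends -Dlog4j2.formatMsgNoLookups=true to the analytics, collector, gemfire and
# tomcat-enterprise wrapper.conf files, as well as the CaSA and tomcat-web-app setenv.sh files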


FAILURE="0"
WRAPPER_FILES="/usr/lib/vmware-vcops/user/conf/analytics/wrapper.conf
/usr/lib/vmware-vcops/user/conf/collector/wrapper.conf
/usr/lib/vmware-vcops/user/conf/gemfire/wrapper.conf
/usr/lib/vmware-vcops/user/conf/tomcat-enterprise/wrapper.conf"
for f in $WRAPPER_FILES
    do
        last_idx=""
        if [[ -f $f ]]; then
            echo "********************************"
            echo "Updating file: $f"
            let last_idx=$(egrep "^wrapper.java.additional.[[:digit:]]+=" $f | cut -d= -f1 | awk -F '.' '{print $4}' | sort -n | tail -1)
            if [[ -z $last_idx ]]; then
                echo -e "ERROR: Failed to get JVM additional index"
                let FAILURE="1"
                continue
            fi
            ((last_idx++))
            echo -e "\n#Fixing Apache Log4j2 Remote Code Execution Vulnerability\nwrapper.java.additional.$last_idx=-Dlog4j2.formatMsgNoLookups=true" >> $f
            if [[ $? != 0 ]]; then
                echo -e "ERROR: Failed to update file: $f\n"
                let FAILURE="1"
            else 
                echo -e "Sucessfully updated file: $f\n"
            fi
        else
            echo -e "ERROR: file is not found: $f\n"
            let FAILURE="1"
        fi
    done


CATALINA_FILES="/usr/lib/vmware-casa/casa-webapp/bin/setenv.sh
/usr/lib/vmware-vcops/tomcat-web-app/bin/setenv.sh"

for f in $CATALINA_FILES
    do
        if [[ -f $f ]]; then
            echo "********************************"
            echo "Updating file: $f"
            echo 'JAVA_OPTS="$JAVA_OPTS -Dlog4j2.formatMsgNoLookups=true"' >> $f
            if [[ $? != 0 ]]; then
                echo -e "ERROR: Failed to update file: $f\n"
                let FAILURE="1"
            else 
                echo -e "Sucessfully updated file: $f\n"
            fi
        else
            echo -e "ERROR: file is not found: $f\n"
            let FAILURE="1"
        fi
    done

if [[ "X$FAILURE" == "X1" ]]; then
    exit 1
fi

exit 0

It would look similar to this in the end

To verify that the workaround for CVE-2021-44228 was applied, run the following

ps axf | grep --color log4j2.formatMsgNoLookups | grep -v grep

Running ./vrops-log4j-fix.sh again will also verify that there are no remaining .jar files that need to be patched
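
If you prefer a manual spot check instead of re-running the script, the same zipinfo test the script relies on can be run as a one-liner; no output means no impacted .jar files remain:

find /usr/lib -type f -name "*.jar" -exec sh -c "zipinfo -1 '{}' | grep -q JndiLookup.class && echo {}" \;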

Next bring the instance back online in the admin console

Changing vRO Kubernetes IP range

I recently ran into a routing issue where the Kubernetes IP range used by vRO 8.6 was already in use somewhere else on the network. I didn't want to redeploy the appliance, so I went through the steps below to get it updated

First I identified the IP ranges in use by running

vracli network k8s-subnets

Let's check the status of the pods to make sure everything is running as it should. If any of the pods are experiencing issues, the change won't go through and it will cause additional issues

kubectl get pods -n prelude

The expectation is that under the READY column we see something similar to this

NAME                               READY   STATUS    RESTARTS   AGE
docker-registry-695f9b8b45-d8gqr   1/1     Running   0          53m
postgres-0                         1/1     Running   0          53m
proxy-service-5d8f64b54-lmxg5      1/1     Running   0          54m
vco-app-78499d8cbd-4mcnk           3/3     Running   0          54m

To set a new internal Kubernetes IP range I ran

vracli network k8s-subnets --cluster-cidr 192.168.0.0/22 --service-cidr 192.168.4.0/22

Then, in order to apply the changes, I ran

vracli upgrade exec

I was prompted with a question

The services will be shut down while upgrade is in progress. Confirm you want to continue with the upgrade operation.[Y/n]

After pressing Y, the system went ahead and reconfigured/redeployed the pods on the proper network

And lastly, I wanted to check the status of the pods to make sure they all came back

kubectl get pods -n prelude
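
Re-running the subnet query from earlier should also confirm that the new CIDR ranges are in effect:

vracli network k8s-subnets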

vIDM 3.3.5 HA

In this guide we will go over the vIDM 3.3.5 HA configuration. The official documentation can be found here

I'm going to assume that the load balancer configuration is already completed and that the vIDM appliance has the required certificate in the LCM inventory. Please read the official documentation for the full requirements.

We will be using the scale out feature in Lifecycle Manager. To do so we can navigate to Lifecycle Operations -> Environments -> globalenvironment -> View Details -> Click on Add Components

It is recommended that an inventory sync is performed prior to starting the process. It can be triggered by clicking the Trigger Inventory Sync button. In my case I don't need one, as I did it earlier, so I'll just click Proceed

The network configuration should be populated. Verify the config and click Next

Towards the bottom of the Components page there will be a Components section. Click on the plus sign next to it and select VMware Identity Manager Secondary Node. Perform this task 2 times so we end up with 3 vIDM nodes.

Complete the required fields like network configuration and Cluster Virtual IP

On the next page run the precheck in order to execute the data validation

Verify the Manual Validation as described in the pop-up window and click on Run Precheck

Once all the checks are complete, click on Next. Verify the summary and click on Submit

This will take us to the Request Details Page where we can follow the steps taken

Once the additional nodes are installed validate that everything is working as expected.