Cloud Assembly – Deleting Orphaned Resources

The VMware SaaS offering for cloud services (Cloud Automation Services, or CAS for short) has been bubbling away for a while now and moved to GA status in mid-January.  I won’t cover the finer details here, but suffice it to say CAS is a suite of SaaS products.

A lot of people within VMware (myself included) have been on a journey to get up to speed on the new SaaS offerings, particularly as the next iterations of our on-prem products will be based (in some part) on the new developments within our SaaS offerings.

As part of my exploration of CAS (and in this case Cloud Assembly), I have hooked in a number of endpoints, built machines, performed some customisation and a few other things.  Basically a lot of tyre kicking.  It’s been several weeks since I had time to devote to CAS, and now that I have, I find myself in a catch-22 situation.  Let me explain.

Within VMware a number of training platforms were spun up to help internal users explore the product set.  These platforms generally included a vCenter endpoint with NSX-T and some host resources, allowing users to configure an endpoint and explore blueprints.  In the time between my last session in Cloud Assembly and the date of this article (a few weeks), my training platform was automatically decommissioned, including the test deployments I had created from CAS.  This is entirely my own fault, as I knew my training platform was time limited.

This sudden loss of an endpoint (and the objects within it) presents a problem for CAS and Cloud Assembly, which assume that the connection will be re-established at some point.  Cloud Assembly does NOT automatically remove orphaned objects from its inventory within your org.  What is more, the GUI offers no method of removing orphaned deployments and machines (note the greyed-out DELETE button you see in the screenshot below is NOT publicly available to customers).  Every machine shown below no longer physically exists, but Cloud Assembly won’t give them up.

Screenshot 2019-03-19 at 16.21.37

What’s more, I cannot delete the “Cloud Zone” (i.e. the collection of resources that represents my missing endpoint) or the project (a container for my cloud zones, in this case containing just my one missing endpoint), as Cloud Assembly states I still have resources being used in them.

Here I am trying to remove the project “AK Test”.  It’s suggesting that I remove the cloud zones used by the project, as well as its blueprints and machines.

Screenshot 2019-03-19 at 16.32.31

Let’s try to remove my cloud zone as suggested.  Nope, that doesn’t work.  This seems to be a dependency loop where each requires the other to be deleted first.

Screenshot 2019-03-19 at 16.34.44.png

The Fix

This is not so much a fix as a workaround.  The API appears to have more potential to resolve these types of issues, with options such as force delete.  All of the API is documented via Swagger, and the account management Swagger URL is listed in the online documentation for CAS.

https://console.cloud.vmware.com/csp/gateway/am/api/swagger-ui.html#/

Before I do anything with the API I need to create an API token against my org account (I’ve obscured most of the refresh token field for my own protection).

Note it’s important that you create the API token on an account that has access to see and manage your objects.

Screenshot 2019-03-19 at 16.38.35

Now that I have the refresh token I can use this to get a bearer token which will represent my identity in any API calls I choose to make.  The swagger interface gives me some options.

Screenshot 2019-03-19 at 17.00.41

Looking at the methods available, I can work out that a bearer token is obtained with a POST operation to:

https://console.cloud.vmware.com/csp/gateway/am/api/auth/api-tokens/authorize

The API call needs the following headers:

  • Content-Type = application/x-www-form-urlencoded
  • Cache-Control = no-cache

Finally, the refresh token needs to be placed in the body of the request, also in x-www-form-urlencoded format.  My call looks as follows in Postman (note parts of my access token and refresh token have been blanked out).

Screenshot 2019-03-19 at 16.50.32
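For anyone who prefers a script to Postman, here is a minimal sketch of the same token exchange using only the Python standard library.  The URL and the two headers come from the steps above.  The form field name "refresh_token" and the "access_token" key in the response are my assumptions about the payload, so check them against the Swagger page before relying on them.

```python
import json
import urllib.parse
import urllib.request

# URL taken from the article; everything else about the payload is an assumption.
AUTH_URL = "https://console.cloud.vmware.com/csp/gateway/am/api/auth/api-tokens/authorize"


def build_token_request(refresh_token: str) -> urllib.request.Request:
    """Build the POST that exchanges a refresh token for an access token."""
    # Assumption: the form field is named "refresh_token".
    body = urllib.parse.urlencode({"refresh_token": refresh_token}).encode("ascii")
    return urllib.request.Request(
        AUTH_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Cache-Control": "no-cache",
        },
    )


def fetch_access_token(refresh_token: str) -> str:
    """Send the request and return the bearer token from the JSON response."""
    with urllib.request.urlopen(build_token_request(refresh_token)) as resp:
        payload = json.load(resp)
    # Assumption: the token is returned under the "access_token" key.
    return payload["access_token"]
```

Splitting the request building from the network call makes it easy to inspect exactly what will be sent before any real token leaves your machine.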

Now that I have an access token I can start issuing API commands against my objects.  I obviously want to delete my orphaned configuration, so let’s start at the deployment level (remember deployments and machines are two separate things, just like in vRA).

The first stage is to verify I can see all my deployments, so I need an API call that returns them.  There’s nothing suitable in the first Swagger endpoint, which is not surprising as that endpoint is all about identity management.  I need the API endpoint for managing deployments, which is:

https://api.mgmt.cloud.vmware.com/deployment/api/swagger/swagger-ui.html

Now I can see some methods I can use including getting all my deployments and deleting them via ID.

Screenshot 2019-03-19 at 17.12.50.png

The REST API call for getting all the deployments looks like this in Postman:

Screenshot 2019-03-19 at 17.14.14
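The same call can be sketched in Python.  The host and the /deployment/api prefix come from the Swagger URL above, but the "/deployments" resource path is my assumption based on what the Swagger page lists, so verify it there.  The bearer token goes in the Authorization header.

```python
import json
import urllib.request

# Assumption: the list endpoint sits at /deployments under the same
# /deployment/api prefix as the Swagger page.
DEPLOYMENTS_URL = "https://api.mgmt.cloud.vmware.com/deployment/api/deployments"


def build_list_request(access_token: str) -> urllib.request.Request:
    """Build the GET that asks for all deployments visible to the token."""
    return urllib.request.Request(
        DEPLOYMENTS_URL,
        method="GET",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


def list_deployments(access_token: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_list_request(access_token)) as resp:
        return json.load(resp)
```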

Each deployment has an “id” field, and it is these values I need to copy down so I can submit my deletion requests, one request for each ID.  Remember that this list might include deployments you don’t want to delete, so make sure you are looking at the right deployments before you issue delete requests!

Screenshot 2019-03-19 at 17.17.12
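Rather than copying IDs by hand, the response can be filtered in code.  This is only a sketch: I am assuming the response wraps the deployments in a "content" list and that each entry carries a "name" alongside its "id".  The name-prefix filter is purely illustrative, so pick whatever test safely identifies your orphaned deployments.

```python
from typing import Dict, List


def select_deployment_ids(response: Dict, name_prefix: str) -> List[str]:
    """Return the IDs of deployments whose name starts with name_prefix."""
    # Assumption: deployments are listed under the "content" key.
    deployments = response.get("content", [])
    return [
        d["id"]
        for d in deployments
        if d.get("name", "").startswith(name_prefix)
    ]


# Made-up sample data in the assumed shape, for illustration only.
sample = {
    "content": [
        {"id": "d-1", "name": "AK Test web"},
        {"id": "d-2", "name": "Production db"},
        {"id": "d-3", "name": "AK Test db"},
    ]
}

print(select_deployment_ids(sample, "AK Test"))  # ['d-1', 'd-3']
```

Printing the selected IDs and eyeballing them before deleting anything is a cheap safety check.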

Now I can execute my delete requests, using the IDs from the previous step within the URL of each call.  However, I need a bit more magic, specifically to force the deletions.  This is accomplished using the “forceDelete=true” parameter.  Note that this call also requires the same authorization header with my bearer token.

Screenshot 2019-03-19 at 17.21.07
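A scripted version of the forced delete might look like the sketch below.  The "forceDelete=true" query parameter is the one described above.  The "/deployments/{id}" path is my assumption, and recording the HTTP status per ID is just one way of seeing which requests were accepted.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Dict, Iterable

# Assumption: deployments are addressed as /deployments/{id}.
DEPLOYMENTS_URL = "https://api.mgmt.cloud.vmware.com/deployment/api/deployments"


def build_force_delete_request(access_token: str, deployment_id: str) -> urllib.request.Request:
    """Build a DELETE for one deployment with forceDelete=true."""
    query = urllib.parse.urlencode({"forceDelete": "true"})
    safe_id = urllib.parse.quote(deployment_id, safe="")
    return urllib.request.Request(
        f"{DEPLOYMENTS_URL}/{safe_id}?{query}",
        method="DELETE",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def force_delete(access_token: str, deployment_ids: Iterable[str]) -> Dict[str, int]:
    """Issue one DELETE per ID and return the HTTP status for each."""
    results = {}
    for dep_id in deployment_ids:
        req = build_force_delete_request(access_token, dep_id)
        try:
            with urllib.request.urlopen(req) as resp:
                results[dep_id] = resp.status
        except urllib.error.HTTPError as err:
            # Keep going so one failure does not hide the others.
            results[dep_id] = err.code
    return results
```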

The Fix Part 2

Now that my deployments have been deleted I need to do the same for my machines.  Machines are handled within the “IaaS” endpoint so the first place to look is within:

https://api.mgmt.cloud.vmware.com/iaas/api/swagger/swagger-ui.html

Sure enough there is a method to get all the machines and one to delete machines by ID.

Screenshot 2019-03-19 at 17.31.40

Just as with the deployments, each individual machine has its own ID and it is these I can use to delete the machines themselves.

Screenshot 2019-03-19 at 17.34.56.png
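In script form the plan is much the same as for deployments.  A minimal sketch, assuming machines are exposed at "/machines" under the same /iaas/api prefix as the Swagger page (check the page for the exact path):

```python
import urllib.request
from typing import Optional

# Assumption: machine resources live at /machines under this prefix.
IAAS_BASE = "https://api.mgmt.cloud.vmware.com/iaas/api"


def build_machine_request(
    access_token: str, method: str, machine_id: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET (list) or DELETE (by ID) request for machines."""
    url = f"{IAAS_BASE}/machines"
    if machine_id is not None:
        url = f"{url}/{machine_id}"
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# List every machine, then delete one by its ID (hypothetical ID shown).
list_req = build_machine_request("my-token", "GET")
delete_req = build_machine_request("my-token", "DELETE", "m-1")
```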

That was the plan anyway, but unfortunately there appears to be no forceDelete option available for machines.  Executing a delete API call for one of my machines results in a request that ultimately fails.  This has been raised as an internal bug, so I will come back and update this article when the functionality has been added.