Author Archive

Problem: VMs disconnecting vNICs after vMotion.

Well today I had an interesting conundrum. I was doing some routine patching of an ESX cluster when suddenly alerts started going off about VMs being disconnected.

It turns out we hit the default port limit of the vSwitch on the destination ESX host, which is 64 (or 56 usable).
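To make the arithmetic concrete, here is a minimal Python sketch of a headroom check for the destination vSwitch. It uses the 64-configured / 56-usable figures above and assumes one port per connected vNIC; the function name and the sample counts are made up for illustration.

```python
# Illustrative only: 64 and 56 are the figures quoted in the post;
# the function and the sample numbers are hypothetical.
CONFIGURED_PORTS = 64
USABLE_PORTS = 56


def ports_free_after_migration(ports_in_use, incoming_vnics, usable_ports=USABLE_PORTS):
    """Return free ports left on the destination vSwitch (negative means over the limit)."""
    return usable_ports - (ports_in_use + incoming_vnics)


# Destination already has 50 vNICs connected and 9 more arrive via vMotion.
remaining = ports_free_after_migration(ports_in_use=50, incoming_vnics=9)
print(remaining)  # -3, i.e. three vNICs would have no port to attach to
```

A negative result is the cue to move fewer VMs onto that host or to raise the vSwitch port count first.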

A quick check of the logs and vswitch config on the service console confirmed the suspicion.

After the incident I did a quick Google, and it would appear one of my fellow countrymen, Cristoph Fromage, encountered the same limit last year. Link

To get services back online quickly, I simply migrated a few machines off the over-allocated host, then re-enabled the interfaces on the affected VMs. A better monitoring system would’ve been helpful here, or if I were faster with PowerCLI, perhaps finding the disabled interfaces through that… I ended up going through all the VMs in that cluster to be sure I’d got them all.
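The manual sweep described above boils down to a simple filter. This is a minimal Python sketch over a hypothetical inventory export (a list of dicts with invented field names `vm`, `nic` and `connected`). It is not real PowerCLI or vSphere API output, just the logic of picking out adapters that are no longer connected.

```python
# Hypothetical inventory export: field names and VM names are invented for illustration.
inventory = [
    {"vm": "web01", "nic": "Network adapter 1", "connected": True},
    {"vm": "db01", "nic": "Network adapter 1", "connected": False},
    {"vm": "app01", "nic": "Network adapter 1", "connected": True},
    {"vm": "app01", "nic": "Network adapter 2", "connected": False},
]


def disconnected_nics(records):
    """Return (vm, nic) pairs whose adapter is currently disconnected."""
    return [(r["vm"], r["nic"]) for r in records if not r["connected"]]


for vm, nic in disconnected_nics(inventory):
    print(f"{vm}: {nic} is disconnected")
```

The same idea applies whatever tool produces the list: gather every adapter in the cluster, then filter on its connection state rather than clicking through VMs one by one.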

The ultimate fix is to carefully juggle the VMs around so you don’t hit the limit again, then increase the port limit on each vSwitch in the affected cluster.

Between this, a massive spanning tree issue taking down half the campus and an abandoned snapshot…. I think I’ve had enough disaster for one day.

vCenter as a vApp?

So I’ve been doing a fair bit of thinking lately on what I want my new virtual infrastructure to look like….

I’ve got multiple datacenters, with multiple clusters in each (differing hardware requires that), plus a dedicated VM testlab, and I was thinking…. well, probably best to have a vCenter in each.

My line of thinking was basically:

  • vCenter in each DC (in linked mode?)
  • Separate DBs
  • Maybe template it?
  • Well the DB should be a VM too
  • Need to sort out the startup order…
  • Hmm, what about a vApp?

Now it seems like a reasonable leap to me, but (correct me if I’m wrong) all the vApp detail is stored in the VCDB. If vCenter is unavailable, will the startup order of the vCenter vApp work as expected in an HA event?

Time to test it in the testlab I think….

Custom shares on a Resource Pool, scripted (Modified)

Well I’ve been taking a bit of a break from XML-based PowerShell code, ‘coz working with it was doing my head in. I was going through some older blog posts on YellowBricks and stumbled across a few about Resource Pools, specifically how shares work.

Now I was always under the impression that shares were already weighted to account for the number and size of VMs in them… however I was clearly mistaken. I did actually raise this question during my vSphere training and was assured that was the case…. so it’s definitely something to be aware of. The Resource Pool Priority Pie Paradox
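A quick worked example shows why this bites. The Python sketch below assumes two sibling pools under full contention, splits capacity between the pools by share value, then splits each pool's slice evenly among its VMs. The pool names, share values, VM counts and the 12,000 MHz figure are illustrative assumptions, not measurements, and the even split inside a pool is a simplification.

```python
def per_vm_mhz(pools, capacity_mhz):
    """Split capacity between sibling pools by shares, then evenly among each pool's VMs."""
    total_shares = sum(p["shares"] for p in pools.values())
    result = {}
    for name, p in pools.items():
        pool_mhz = capacity_mhz * p["shares"] / total_shares
        result[name] = pool_mhz / p["vms"]
    return result


# Hypothetical pools: the "important" pool has more shares but far more VMs.
pools = {
    "prod": {"shares": 8000, "vms": 20},
    "test": {"shares": 4000, "vms": 2},
}

for name, mhz in per_vm_mhz(pools, capacity_mhz=12000).items():
    print(f"{name}: {mhz:.0f} MHz per VM")
```

With these numbers each VM in the higher-priority pool ends up with a fifth of what a VM in the lower-priority pool gets, because the pool's shares are not scaled by how many VMs sit inside it.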

Clone-to-test: Part 2

In this post I present the first script. It took me quite a while, as I had to figure out how to do everything. I’m no rocket scientist, so if I can figure it out, you all have hope too. I really do suggest getting your hands dirty in some code; PowerShell is pretty easy to pick up. :)

If you haven’t done so already, I would suggest reading Part 1. You may also want to check out the architecture of my isolated test environment, as the design is pretty closely linked with my particular requirements.

Clone-to-test: Part 1

Well it’s taken me a while, but I’ve just completed the first part of my test environment migration process. So I thought I’d post up the design and architecture diagrams.

The idea is to provide a repeatable process for cloning batches of VMs from the production network into my organization’s isolated VM test environment. The cited use cases for this environment include virtualization infrastructure change testing, VI development, VI product evaluation and virtual machine change testing/verification. This series of scripts addresses the last requirement in an automated manner.

VMware Basic VM Performance as XML (Powershell)

Well continuing my recent foray into scripting, I wrote a small script to dump some basic stats about each VM into formatted XML.

The purpose is to get a basic idea of average use across all our VMs. We’re using this information as part of a virtual infrastructure review and redeployment. We have budget to replace aging ESX hardware and we’re taking the opportunity to tidy things up a bit.

The script is using the same framework as my dump-virtual-machine-info-as-xml script posted previously. I’ve grown quite fond of XML, although I’m sure I can definitely improve it…. for a start creating a proper schema.

My VMware Test Environment

So I thought I’d share with you some background on the test environment I’m building, as I’ll be posting the scripts I’m developing to make it all happen… so a bit of context will hopefully help you.

My organization is currently going through a phase of massive expansion and as a result we’ve been given a little freedom to explore new technology and increase our knowledge and experience with VMware. To experiment with the new products and also to test changes to VMs, we decided to build a small 2-node VMware cluster mimicking prod as closely as possible.

vHW Upgrade Risk Calculator v0.5

One of the tasks on my plate is to upgrade the virtual hardware of roughly 300 VMs. We need this done to support backup using the VADP mechanism through Commvault.

My environment is politically pretty complex, and we have no agreed maintenance window. This creates an interesting situation for upgrades of this scale. So I thought I’d knock up a bit of a risk calculator to try and identify which VMs are likely to experience trouble during the vHW upgrade.

This is the first part of the project and will help us do the first stage of grouping and outage scheduling.

The next part, which I’m developing now, is a script to clone a batch of VMs to an isolated VM test environment (a pair of ESX servers on a duplicate but isolated LAN segment).

Once we have a running clone of each batch, I can use Update Manager to do the tools and vHW upgrade, then run a testing script over them. This process will allow us to catch out the problematic VMs ahead of time and schedule their upgrade in a more closely monitored way.

So without further ado, here it is: VI_vHW_Upg_cheatsheet.

Hopefully this is of use to someone :)
-Doug

Cloud Camp

Today there was a Cloud Camp in Perth. It was basically a frank discussion among people involved with cloud technologies.

There were people from a lot of different fields and the discussions were all at a very high level. I gained a bit of insight into various viewpoints.

One of the discussions I found interesting was around proper investigation of business requirements BEFORE making any significant change, like moving to ‘the cloud’. This seems obvious to me, however at least in my (albeit limited) experience, it doesn’t happen nearly as often as it should.

Theme issues

After making my first post, I’ve realized my formatting isn’t ideal. If anyone has any suggestions, I’d be grateful.
-Doug

The opinions expressed on this site are my own and not necessarily those of my employer.

All code, documentation etc. is my own work and is licensed under Creative Commons; you are free to use it, at your own risk.

I assume no liability for code posted here. Use it at your own risk and always sanity-check it in your environment.