Problem: VMs disconnecting vNIC after vMotion.

Well, today I had an interesting conundrum. I was doing some routine patching of an ESX cluster when suddenly alerts started going off about VMs being disconnected.

It turns out we hit the default port limit of the vSwitch on the destination ESX host: 64 ports, of which 8 are reserved by the VMkernel, leaving 56 usable. When a vMotion lands a VM on a vSwitch with no free ports, its vNIC comes up disconnected.

A quick check of the logs and the vSwitch config on the service console confirmed the suspicion.
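For anyone who'd rather check from a workstation than a service console session, a rough PowerCLI sketch of the same check (the host name is a placeholder, and it assumes an existing Connect-VIServer session):

```powershell
# 'esx01' is a placeholder host name.
# NumPortsAvailable at or near zero means the vSwitch is full.
Get-VMHost -Name 'esx01' |
    Get-VirtualSwitch |
    Select-Object Name, NumPorts, NumPortsAvailable |
    Format-Table -AutoSize
```

On the service console itself, `esxcfg-vswitch -l` shows the equivalent used/configured port counts.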

After the incident, a quick Google showed that one of my fellow countrymen, Cristoph Fromage, encountered the same limit last year. Link

To get services back online quickly, I simply migrated a few machines off the over-allocated host, then re-enabled the interfaces on the affected VMs. Better monitoring would have helped here, as would being faster with PowerCLI to find the disabled interfaces. As it was, I ended up going through every VM in that cluster to be sure I'd caught them all.
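For the record, here's the sort of PowerCLI snippet that would have saved me the clicking. A sketch only, assuming a connected vCenter session; the cluster name 'Prod' is my own placeholder:

```powershell
# Find every vNIC that came up disconnected across the cluster...
$nics = Get-Cluster -Name 'Prod' | Get-VM |
    Get-NetworkAdapter |
    Where-Object { -not $_.ConnectionState.Connected }

# ...and reconnect them (only once the port shortage is resolved, obviously).
$nics | Set-NetworkAdapter -Connected:$true -Confirm:$false
```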

The ultimate fix is to carefully juggle the VMs around so you don't hit the limit again, then increase the port limit on each vSwitch in the affected cluster.
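In PowerCLI that last step looks something like the sketch below. One caveat worth planning for: on ESX of this vintage, a changed vSwitch port count only takes effect after a host reboot, so schedule a maintenance window. The cluster name and port count are placeholders:

```powershell
# Bump every vSwitch on every host in the cluster to 120 ports.
# The new count takes effect after each host is rebooted.
Get-Cluster -Name 'Prod' | Get-VMHost |
    Get-VirtualSwitch |
    Set-VirtualSwitch -NumPorts 120 -Confirm:$false
```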

Between this, a massive spanning tree issue taking down half the campus, and an abandoned snapshot… I think I've had enough disaster for one day.


The opinions expressed on this site are my own and not necessarily those of my employer.

All code, documentation, etc. is my own work and is licensed under Creative Commons. You are free to use it at your own risk; I assume no liability for code posted here, so always sanity-check it in your environment.