I have an odd issue in my lab that I have not been able to figure out. I have documented it carefully and have someone else coming in tomorrow to help me check it out. But I have been gone for a while, and have not paid much attention to my lab. So I want to do a quick audit of it. Not a health check, just a quick look. I thought I could use CloudPhysics for it. So here we go.
BTW, I believe in indicators. Meaning I don’t look everywhere or at everything, but I look for things that can impact other things. Things that indicate what is going on and give me clues. So a quick look can be useful and give you an idea of the condition of your lab.
First I look at NTP.
It looks fine. I am looking here to make sure that NTP is configured and running, and set the same on every host.
Now I look at syslog.
Again I am looking to see if everything is configured the same. And it is. You can play with the Time Machine feature to see if it was always like this!
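If you ever want to do the same "is it set the same everywhere?" check on your own, here is a minimal sketch in Python. It assumes you have already collected one value per host by some means of your own. The host names and NTP servers are made up for illustration, and the sketch does not talk to CloudPhysics or vCenter at all.

```python
from collections import Counter


def find_outliers(settings_by_host):
    """Return the hosts whose setting differs from the most common value."""
    if not settings_by_host:
        return {}
    counts = Counter(settings_by_host.values())
    most_common_value, _ = counts.most_common(1)[0]
    return {
        host: value
        for host, value in settings_by_host.items()
        if value != most_common_value
    }


# Hypothetical data: one tuple of NTP servers per host.
ntp_servers = {
    "esxi-01": ("0.pool.ntp.org", "1.pool.ntp.org"),
    "esxi-02": ("0.pool.ntp.org", "1.pool.ntp.org"),
    "esxi-03": ("0.pool.ntp.org",),
}

print(find_outliers(ntp_servers))
```

The same function works for syslog targets or build numbers, since all it cares about is which hosts disagree with the majority.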
Now I like to look at host BIOS and up-time.
If you happen to know the BIOS versions, you will know whether they are current, but if they are all the same where they are supposed to be, that is a pretty good sign. Normally I would expect to see a fairly long time since the hosts were last started. In my case the BIOS is the right one, so that is good, and the boot time is not unexpected, so all good.
Another good card to look at is called Cluster Overall Health Status.
It looks good, as hoped. See the Time Machine option? You can look back in time to see whether the Green result you see is new or old.
Now I like to look at Host Inventory. I want to make sure I see what I should, or whether there is something new. In my case a host will be missing, but that is expected.
Now I want to see a little more info on my hosts. So I use the ESXi Host info.
I like this card in that it shows a lot of info in a small space. I can see state and version / build, and again I notice everything is the same, which is good.
The next one I look at more out of curiosity than as an indicator I actually use. It is the PCI IO Devices card.
You can see that on my newest server in the cluster there are a couple of unsupported devices. Good to know!
Another thing I like to look at is the last time templates were updated. Generally they should be updated every 30 or 45 days to catch the recent patches and updates.
In this case, the templates were updated recently so that is pretty good.
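The 30 or 45 day rule of thumb is easy to check by hand, but here is a small Python sketch of it anyway. The template names and dates are invented for the example, and the 45 day limit is just the upper end of the range mentioned above, so adjust it to taste.

```python
from datetime import date

# Assumption: 45 days, the upper end of the "30 or 45 days" rule of thumb.
MAX_TEMPLATE_AGE_DAYS = 45


def stale_templates(last_updated, today, max_age_days=MAX_TEMPLATE_AGE_DAYS):
    """Return the names of templates not updated within max_age_days."""
    return sorted(
        name
        for name, updated in last_updated.items()
        if (today - updated).days > max_age_days
    )


# Hypothetical inventory: template name -> date it was last updated.
templates = {
    "windows-template": date(2014, 5, 1),
    "linux-template": date(2014, 2, 10),
}

print(stale_templates(templates, today=date(2014, 5, 20)))
```

Passing `today` in explicitly keeps the check repeatable, which is handy if you want to ask "what would this have looked like last month?"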
Recently it has become obvious that the E1000 driver is not always the best one to use with VMs. There has been some good work lately by Michael Webster to confirm that VMXNET3 is a pretty good choice. So we use the VMs with E1000 card to check the type of network adapter on each virtual machine.
So this is not good. There is work to be done here!
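To turn a result like that into a to-do list, something along these lines would work. This is only a sketch: the VM names and adapter lists are hypothetical, and it assumes you have exported the adapter types per VM yourself.

```python
def vms_with_e1000(adapters_by_vm):
    """Return each VM that has at least one E1000 adapter, with a count."""
    flagged = {}
    for vm, adapters in adapters_by_vm.items():
        count = sum(1 for adapter in adapters if adapter.lower() == "e1000")
        if count:
            flagged[vm] = count
    return flagged


# Hypothetical export: VM name -> list of its network adapter types.
inventory = {
    "web-01": ["E1000"],
    "db-01": ["vmxnet3"],
    "app-01": ["vmxnet3", "e1000"],
}

for vm, count in sorted(vms_with_e1000(inventory).items()):
    print(f"{vm}: {count} E1000 adapter(s) to replace")
```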
Lastly, I like to look at the vMotions from Yesterday. Not that vMotions happening is bad. In fact it is good, as it means DRS is doing its job. But it is still good to see.
Now here is the bad news. This card is not quite behaving right. I did report that and they are working on it. I thought it would be fixed by now but it isn’t.
But there you have it. A very quick, and easy, look at my lab. So long as you have CloudPhysics looking at your lab or environment, you can easily take a very quick look like this.
Hope you found this interesting and useful.
BTW, another way to do this, outside of CloudPhysics, is to use vCheck so that you get an email each morning with a good status overview of things.
=== END ===