Careful with firmware patching in a vSAN Cluster!

So I had some time recently to update my lab.  I am a long-time user of the Dell OpenManage Integration for vCenter.  It has gotten a little flaky lately, so I have to use it to patch one host at a time, and I have to do it 2 to 5 times for each host.  Maybe I should say very flaky?  I keep hoping it will get better when I update the OpenManage appliance but it doesn’t seem to change.

So I start my firmware updating, and I don’t notice something important.  And I get two hosts done.

In the image above I point out that my array controller - PERC H730 has a current BIOS level of 25.5.0.0018 and it will be updated to 25.5.2.0001.  OK, no big deal and like I said, I do two hosts worth of updates. Then I notice that vSAN Health is not happy.

So you can see here that my two hosts that I have updated to current levels are now showing as WARNING in the vSAN health.

You can see above that there are two hosts that need to be downgraded to 25.5.0.0018.
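To make the comparison concrete, here is a minimal sketch of how you might check an installed firmware string against the one vSAN Health expects. The function names are my own for illustration and are not part of any Dell or VMware tool; it only assumes the versions are dotted numbers like the two shown in this post.

```python
def parse_version(text):
    """Split a dotted firmware string such as '25.5.2.0001' into integers."""
    return tuple(int(part) for part in text.split("."))


def firmware_action(installed, certified):
    """Say what has to happen to get from the installed to the certified level."""
    have, want = parse_version(installed), parse_version(certified)
    if have > want:
        return "downgrade"
    if have < want:
        return "upgrade"
    return "ok"


# The two PERC H730 levels from this post: installed vs. certified.
print(firmware_action("25.5.2.0001", "25.5.0.0018"))
```

Comparing the parsed numbers rather than the raw strings avoids surprises with leading zeros such as 0018 and 0001.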

So, now I know that I need to be more careful, and for the next two hosts I do not select the Recommended patch for the H730 array controller.

You can see in the screenshot above that the only Recommended update not selected is the PERC H730, which means my last two hosts are not updated to the uncertified firmware. But everything else is updated - with multiple tries, mind you.

And all is good, except for the darn yellow warning in the vSAN health, where I prefer to see only green.  What do I do?  I wonder if the built-in vSAN firmware update tool will update to an older version?

The built-in update tool does seem to be able to do a downgrade to the certified version.  But be careful and go slow with it, as I accidentally updated two hosts at the same time.  I thought the first one had finished but it hadn’t. Wait.  I see no yellow in vSAN health.  It is green, and no updates are outstanding.  But when I check using the Dell OM tool to see what is outstanding, I see both hosts are still using the later driver and not the certified one.

One way to think of this is that I went through this fuss - and frustration - for you.  So you don’t have to.  Be careful with your patching.

Moral of the story - use Dell OM, if you have it, to do all the firmware patching except for the controller in your vSAN hosts. Use the built-in vSAN firmware update tool for that.  To help with this, maybe do the updates via vSAN first. But always watch out in the OpenManage UI.
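If you like to think in code, the rule above can be sketched roughly as follows. This is only an illustration of the decision, not how OpenManage or vSAN work internally: the Update class, the split_updates function and the "Other component" entry with its version numbers are all made up for the example. Only the PERC H730 versions come from this post.

```python
from dataclasses import dataclass


@dataclass
class Update:
    component: str
    current: str
    target: str


def split_updates(recommended, certified):
    """Separate updates that are safe to apply from ones to hold back.

    `certified` maps a component name to the firmware level vSAN Health
    expects. Components not listed there are not checked by vSAN.
    """
    apply_now, hold_back = [], []
    for update in recommended:
        expected = certified.get(update.component)
        if expected is not None and update.target != expected:
            hold_back.append(update)  # leave this one to the vSAN tool
        else:
            apply_now.append(update)
    return apply_now, hold_back


recommended = [
    Update("PERC H730", "25.5.0.0018", "25.5.2.0001"),
    Update("Other component", "1.0", "1.1"),  # placeholder, not real data
]
certified = {"PERC H730": "25.5.0.0018"}

apply_now, hold_back = split_updates(recommended, certified)
print([u.component for u in apply_now])
print([u.component for u in hold_back])
```

In practice the certified level would come from the vSAN health check or the HCL, and you would still untick the controller by hand in the OpenManage UI.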

Michael

8 thoughts on “Careful with firmware patching in a vSAN Cluster!”

  1. Hi Michael,

    I have a similar hardware configuration with vSAN 6.6 on top of a PERC H730P controller and have noticed the warning message about the incompatibility of the latest firmware version, 25.5.2.0001, with vSAN 6.6.

    However, according to VMware HCL (https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vsanio&productid=34858&deviceCategory=vsanio&details=1&vsan_type=vsanio&io_partner=23&io_releases=278&keyword=H73&page=1&display_interval=10&sortColumn=Partner&sortOrder=Asc), that version of the firmware is completely supported by the vendor. I spoke with PSO about this, and they confirmed that VMware HCL has precedence over vSAN Health Check.

    In fact, vSAN Health Check should be consistent with the HCL, and that is what VMware is trying to achieve at the moment. We should see this issue resolved in upcoming vSAN updates.

    Regards,

  2. It really drives me crazy when the vendor has firmware that is supposed to fix some issues from their perspective but VMware is always a month or two behind in testing. vSAN Health should always be green, and although I am not overly OCD, I find it does drive me crazy when it’s not all green.

    1. Completely agree. I really like the health service and do not want it yellow for anything except real issues.

      Not sure if it is all VMware, as I expect it is the combination and partly Dell too.

      Michael

  3. Thanks for the interesting article. It would be great if Dell could implement a verification check against the HCL.

  4. My customer uses the same Dell PERC controller for vSAN all-flash and we see this warning message, too. Like rdronov already said, don’t rely only on the embedded health check.
    VMware told us to take a look at the HCL. So this firmware is completely supported.

    We couldn’t see any performance improvements between the two firmware levels. Our write throughput is not as fast as we expected. We have a 4-node vSAN 6.6 all-flash cluster with two disk groups per host.
