Configure and Test the ESXi Dump Collector

I have neglected this for a while.  But when you boot your ESXi hosts from SD like I do, they need a place to dump core files.  When ESXi hits a PSOD (purple screen of death), it is important that the host can copy its memory to someplace else.  And, BTW, take a picture of any PSOD you see, as there is often info in it that is not anywhere else!

So I need to get this done now.  In talking with others I realized that this is not commonly set up, so I am writing it down as I do it.

This is for vSphere 6, and of course, I am using the vSphere Web Client.  I am also doing this with a VCSA; a Windows vCenter is the same except for the dump collector log location.

Setup of the Dump Collector

We need to start in the Home page.

  • We change to Administration, followed by VMware vSphere ESXi Dump Collector.

  • We use the Actions menu near the top of the screen, and Start the Dump Collector service.

  • Once started, you may see an error message.  Use the Refresh button to clear it and it should look like the following screenshot.

  • If you change to the Manage page you will see what little config is possible.

  • The defaults are good for us.

We now have the Dump Collector working and able to receive cores.  We now need to get the hosts configured to use it.

Setup of Host

While this can be done in Host Profiles, we are going to do it at the command line.  Even for Host Profiles you should do it at the command line on your reference host.

We need to access the console of our ESXi hosts via SSH.

  • Once on our host, the first thing we should do is check the status.

esxcli system coredump network get

  • Next we need to configure the host for remote coredump.

esxcli system coredump network set -v vmk0 -i 192.168.9.16 -o 6500

  • My use of vmk0 here (-v) represents the management network, which is normally vmk0.
  • The 192.168.9.16 (-i) is my VCSA, which is hosting the dump collector, and 6500 (-o) is the port it listens on by default.
  • Now that the host is configured, we need to enable this functionality.

esxcli system coredump network set -e true

  • We can check our configuration using the first command we ran.

esxcli system coredump network get

  • Note how it looks a little different now?
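
If you would rather not type the steps one at a time, they can be strung together in a small shell function.  This is only a sketch: the name configure_netdump is mine, and the check at the end assumes the get output contains a line of the form "Enabled: true".

```shell
# Sketch only: configure_netdump is a made-up helper name, not a VMware tool.
# Usage on the ESXi host: configure_netdump vmk0 192.168.9.16 6500
configure_netdump() {
    vmk="$1"; server="$2"; port="${3:-6500}"

    # Point the host at the dump collector, then switch network coredump on.
    esxcli system coredump network set -v "$vmk" -i "$server" -o "$port" || return 1
    esxcli system coredump network set -e true || return 1

    # Assumption: "get" prints a line like "   Enabled: true".
    esxcli system coredump network get | grep -q 'Enabled: *true'
}
```

The function returns non-zero if either set command fails or if the Enabled line is not found, so it can be followed with "&& echo OK".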

How can we test this?

A very simple test would be to use the following command:

esxcli system coredump network check

Thanks to William (for sharing this article with me) I am able to confirm this is working.  If you look on your VCSA at:

/var/log/vmware/netdumper/netdumper.log

You will see something like below:

The three messages are from my three hosts where I tested things!
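
With more than a few hosts, counting log lines per host is easier than reading the log.  A minimal sketch, assuming the log line for a host contains that host's IP address as plain text; the helper name and the host IPs in the usage line are placeholders, not my real ones.

```shell
# Sketch only: count netdumper.log lines that mention each host IP.
# Assumption: the log line for a host contains that host's IP as plain text.
NETDUMP_LOG="${NETDUMP_LOG:-/var/log/vmware/netdumper/netdumper.log}"

count_netdump_hits() {
    for ip in "$@"; do
        # -F: fixed string, so dots are literal; -w: 9.2 does not match 9.21.
        hits=$(grep -cFw -- "$ip" "$NETDUMP_LOG")
        echo "$ip $hits"
    done
}

# Usage on the VCSA (host IPs are placeholders):
#   count_netdump_hits 192.168.9.21 192.168.9.22 192.168.9.23
```

A host that shows 0 here never reached the collector, so that is the one to go back and check.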

And, BTW, the info in this part of the post is from the article that William shared with me.  What I am impressed by is that it is a 2012 article based on vSphere 5.1.  It is cool that the path is still the same in 2016 and vSphere 6.0 U1.

Yes, you can force a PSOD and see if it moves the memory.  You can find out how to force a PSOD in this article.  I am not sure if it will move the memory or not, but it is fun to try.  I suggest being careful with this!

Where do the coredumps end up?

Unless this is just a play (lab) environment, support will help you with this, but just in case, the location is:

/var/core/netdumps

But remember you will need to enable the shell on the VCSA, and then use the shell to get to it.  I will add the location for the Windows cores when I find it.
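
Once you are in the shell, a quick way to see whether anything has landed is to list the files newest first.  This is a rough sketch; the helper name is mine, and it just walks whatever is under the directory without assuming anything about how the dumps are named.

```shell
# Sketch only: list files under the netdump directory, newest first.
NETDUMP_DIR="${NETDUMP_DIR:-/var/core/netdumps}"

list_netdumps() {
    # find walks any subdirectories; ls -lt sorts by modification time.
    find "$NETDUMP_DIR" -type f -exec ls -lt {} +
}

# Usage on the VCSA:
#   list_netdumps
```

No output means no dump files have been written yet.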

Links

You can find additional info in this KB article.

Cheatsheet

If you are using this as a reference while you are doing the work this should help.

esxcli system coredump network get

esxcli system coredump network set -v vmk0 -i 192.168.9.16 -o 6500

esxcli system coredump network set -e true

esxcli system coredump network get

esxcli system coredump network check
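
If you have several hosts, the same cheatsheet can be pushed out over SSH from another machine.  A rough sketch, assuming SSH is enabled on every host, you log in as root, and vmk0 is the management interface everywhere; the function name and the host names in the usage line are placeholders.

```shell
# Sketch only: run the cheatsheet commands on several hosts over SSH.
# Assumptions: SSH is enabled on every host, root login is allowed,
# and vmk0 is the management interface on all of them.
push_netdump_config() {
    server="$1"; shift
    for host in "$@"; do
        echo "== $host =="
        ssh "root@$host" "esxcli system coredump network set -v vmk0 -i $server -o 6500 &&
            esxcli system coredump network set -e true &&
            esxcli system coredump network check" || echo "FAILED on $host" >&2
    done
}

# Usage (host names are placeholders):
#   push_netdump_config 192.168.9.16 esx01 esx02 esx03
```

Each host is handled on its own, so one host that is down does not stop the rest.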

Conclusion

I hope that this helps.  Remember, if you boot from SD like I do, or from USB like many of you, and you get a PSOD (thank goodness they are so rare), you will need to have done the process in this article so that the core can be retained.

Update

  • 1/20/16 - added the info from William about how to confirm the check worked.
  • 1/20/16 - Added the cheatsheet to help, and the clarification that the IP referenced is my VCSA.

Michael

=== END ===
