ESX Host dump file diagnostics / assessment

A PSOD (Purple Screen of Death) is a fatal crash of a VMware ESX/ESXi host that kills all active virtual machines. The host displays a diagnostic screen with white type on a purple background.


A PSOD also generates a DUMP file, so that administrators can drill down into the issue and carry out a proper root cause analysis (RCA).

Before jumping into the DUMP file analysis, it is always recommended to analyze the ESXi log files:

  • VMkernel summary – /var/log/vmksummary.log
  • ESXi host agent log – /var/log/hostd.log

By reading the above log files, we can identify whether a DUMP file has been generated or not.
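The log check can be sketched as a small script run on a workstation against copies of the two log files. This is only a rough sketch: the function name `find_dump_mentions` is hypothetical, and the search terms in `DUMP_KEYWORDS` are assumptions rather than documented ESXi log strings, so adjust them to whatever wording your build actually writes.

```python
from pathlib import Path

# Assumed search terms (not taken from VMware documentation); adjust as needed.
DUMP_KEYWORDS = ("coredump", "core dump", "zdump")


def find_dump_mentions(log_path, keywords=DUMP_KEYWORDS):
    """Return (line_number, line) pairs whose text contains any keyword."""
    hits = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            lowered = line.lower()  # case-insensitive match
            if any(keyword in lowered for keyword in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits
```

For example, `find_dump_mentions("vmksummary.log")` returns the matching lines from a local copy of the VMkernel summary log. An empty list only means none of the assumed keywords appeared, not that no dump exists.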

If a DUMP file has been generated, we can start doing some additional analysis.

Step 01:

Get the host up and running, then log in to it over SSH (for example with PuTTY).

Then go to the core directory, which is where the PSOD dumps are stored (cd /var/core).

You can list the PSODs (dumps) with the ls command.

Step 02:

Once you have confirmed that a dump exists, you can download the file to your workstation.

I tend to use WinSCP on Windows clients and the built-in scp command on Linux.

Browse to the /var/core path and copy the latest DUMP file (vmkernel-zdump.#).
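If several dumps are present, picking the latest one can be sketched as below. This assumes you run it wherever Python is available and the directory is reachable, for example against a local copy of /var/core. The helper name `latest_dump` is hypothetical, and "latest" is taken to mean the most recent modification time.

```python
from pathlib import Path


def latest_dump(directory, pattern="vmkernel-zdump.*"):
    """Return the most recently modified dump file in `directory`, or None."""
    # Only regular files whose names match the vmkernel-zdump.# naming.
    candidates = [p for p in Path(directory).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Compare by modification time; the newest file wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Note that copying files between machines can reset modification times unless the transfer tool preserves them, so check the result against the listing on the host.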

Step 03:

Once you have downloaded the DUMP file, open it using vi or Notepad++.

Then search for the keyword @BlueScreen 
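As an alternative to searching by hand in an editor, the lookup can be sketched in a few lines of Python on the workstation. This is a minimal sketch that assumes, as the step above does, that the @BlueScreen marker is present as plain text in the downloaded file. The function name `find_marker_context` and the default context size are hypothetical choices.

```python
from pathlib import Path

MARKER = b"@BlueScreen"


def find_marker_context(dump_path, marker=MARKER, context=200):
    """Return the text that follows each occurrence of `marker` in a binary file."""
    data = Path(dump_path).read_bytes()  # loads the whole file into memory
    snippets = []
    start = data.find(marker)
    while start != -1:
        end = min(len(data), start + len(marker) + context)
        chunk = data[start:end]
        # Keep printable ASCII and newlines; show every other byte as '.'.
        text = "".join(
            chr(b) if 32 <= b < 127 or b == 10 else "." for b in chunk
        )
        snippets.append(text)
        start = data.find(marker, start + len(marker))
    return snippets
```

Calling `find_marker_context("vmkernel-zdump.1")` returns one snippet per match, each starting at the marker. Dump files can be large, so reading the file in one go is a simplification that trades memory for brevity.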

Step 04:

Take note of the error and search for it in the VMware Knowledge Base.
