• hi@yahyazahedi.com
  • Germany

vSAN Monitoring and Troubleshooting Tools – Part 1

This is part of the VMware vSAN guide post series. You can access and explore more objectives from the VMware vSAN study guide using the following link.

Up to this point in the vSAN study guide, I’ve primarily discussed configuration, settings, features, and the process of running a vSAN cluster and keeping it updated. So far, everything has been working well. But what happens when issues arise? What tools does vSAN offer for monitoring and troubleshooting? In this post and the upcoming ones, I aim to answer these questions.

Several tools are available for this purpose, and being familiar with them makes troubleshooting faster. In this post and the upcoming ones, I’ll cover the following objectives:

  • vSAN Skyline Health
  • vSAN Cluster Level Monitoring
  • vSAN Host Monitoring
  • vSAN VM Monitoring

Let’s start with Skyline Health.

vSAN Skyline Health

In previous posts, I mentioned Skyline Health several times, and you may have noticed that its GUI looks different before and after the update. In this post, I want to go deeper and explore this feature further.

Skyline Health is a self-service diagnostics feature designed to detect and address issues in both vSphere and vSAN environments. Although it is commonly associated with vSAN, it is not exclusive to vSAN; you can use it for vSphere as well.

Today, I want to use Skyline Health for vSAN. You can access it by navigating to the vSAN cluster, selecting the Monitor tab, and then selecting Skyline Health under the vSAN section. Here, you’ll find two cards within the Overview section: the Cluster Health Score, which is based on recent health findings, and the Health Score Trend, which shows the health score trend over the past 24 hours. This trend is customizable, allowing you to specify a particular time frame.

Under Health findings, there are four categories: Unhealthy, Healthy, Info, and Silenced. You can use them to diagnose issues, troubleshoot, and remediate problems. Let’s start with the first category of findings:

Unhealthy findings refer to important issues that need attention. For example, in my case I am not using VMware-certified storage devices. If you look at the impact area of this issue, it shows Compliance, which means my storage devices are not compliant with the VMware hardware compatibility list (HCL).

As you can see, there are three options:

  • Silence Alert: This option silences the alert and moves the card to the Silenced category.
  • Troubleshoot: This option shows a new card with instructions on how to resolve the issue.
  • View History Details: This option provides a history of the issue.

Click on View History Details.

A new card will be displayed, providing historical information on this particular issue. You’ll be able to see how many times it has occurred and on which days.

If you click Troubleshoot, a new card will be displayed, providing information about the issue and its root cause to help you resolve it. In the “Why is the issue occurring?” section, you’ll find the reasons behind the problem. In the “How to troubleshoot and fix” section, you’ll find further details about the issue (in my case, which devices are experiencing hardware compatibility issues), along with recommended actions to resolve it.

The second category is Healthy, which refers to findings with no issues that therefore require no additional attention. Everything is functioning smoothly, as indicated by the green status. The goal is to have all findings fall within this category, leaving the other categories empty.

The third category is Info. It refers to findings that may not impact the state of vSAN directly but are important for enhancing the overall health and efficiency of the vSAN cluster. This category includes best practices and recommendations aimed at optimizing the performance and stability of the vSAN cluster.

The fourth category is Silenced. If you silence a finding from any other category, it will appear here. If you have been working on an issue for a long time, or for any other reason prefer not to display it in the Unhealthy category or another category, you can click Silence Alert to move it to this category.
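
To make the relationship between the four categories concrete, here is a minimal Python sketch of how findings move between them when you use Silence Alert. This is only an illustrative model and not the vSAN or vSphere API: the names Category, Finding, silence, and group_by_category, as well as the finding names, are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The four finding categories shown in Skyline Health."""
    UNHEALTHY = "Unhealthy"
    HEALTHY = "Healthy"
    INFO = "Info"
    SILENCED = "Silenced"


@dataclass
class Finding:
    """Hypothetical representation of one health finding card."""
    name: str
    category: Category
    impact_area: str = ""


def silence(finding: Finding) -> Finding:
    """Model 'Silence Alert': return a copy placed in the Silenced category."""
    return Finding(finding.name, Category.SILENCED, finding.impact_area)


def group_by_category(findings):
    """Group finding names per category, keeping empty categories visible."""
    groups = {category: [] for category in Category}
    for finding in findings:
        groups[finding.category].append(finding.name)
    return groups


# Hypothetical findings, loosely based on the example in this post.
findings = [
    Finding("Storage devices not on HCL", Category.UNHEALTHY, "Compliance"),
    Finding("Example healthy check", Category.HEALTHY),
    Finding("Example recommendation", Category.INFO),
]

# Silencing the unhealthy finding moves it out of Unhealthy and into Silenced.
findings[0] = silence(findings[0])
groups = group_by_category(findings)

for category, names in groups.items():
    print(f"{category.value}: {names}")
```

Note that in this model silencing does not fix anything: the finding keeps its impact area and simply stops appearing under Unhealthy, which matches the idea that Silenced is for issues you already know about.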

In the next post, I will explain more about vSAN Cluster Monitoring.
