Hi, we recently experienced the above issue, and I haven't been able to find a satisfactory explanation in other people's posts about it.
Our setup is as follows:
A 2-node Windows Server 2008 R2 cluster running SQL Server 2012.
Each node is an HP BL460c running in an HP C7000 blade chassis.
We were updating the FlexFabric cards on one of the chassis. The other chassis had been patched the previous week with no problems.
During the update the FlexFabric cards, which carry the Ethernet and FC connections, reboot, so before work began all active cluster services had been failed over to the node in the chassis not being worked on. Despite this, the Cluster service shut down on this one particular cluster. All other clusters running across these 2 chassis continued to run as expected.
As other people have posted before, we saw the following errors in the System log:
1564: File share witness resource 'File Share Witness' failed to arbitrate for the file share
1069: Cluster resource 'File Share Witness' in clustered service or application 'Cluster Group' failed.
1172: The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
However, we can't understand what could cause this when the service was running on the node in the chassis not being updated, especially as the same update was performed the week before with no issues. How can both nodes lose connectivity to the File Share Witness at the same time?
Cluster Validation tests run fine and don't highlight any issues. The file share witness is accessible from both servers.
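
For reference, this is how I understand the vote arithmetic behind event 1172. It is only a minimal Python sketch, assuming the cluster uses Node and File Share Majority (one vote per node plus one vote for the witness). The function `has_quorum` and its parameters are names made up for illustration, and the sketch ignores how the Cluster service actually arbitrates for the witness.

```python
def has_quorum(nodes_up: int, total_nodes: int, witness_reachable: bool) -> bool:
    """Simplified majority check: one vote per node, plus one for the witness.

    Hypothetical helper for illustration only; not a cluster API.
    """
    total_votes = total_nodes + 1  # nodes plus the file share witness
    votes_held = nodes_up + (1 if witness_reachable else 0)
    # Quorum needs a strict majority of all configured votes.
    return votes_held * 2 > total_votes


# The three situations relevant to a 2-node cluster with a witness.
scenarios = [
    ("both nodes up, witness unreachable", 2, False),
    ("one node up, witness reachable", 1, True),
    ("one node up, witness unreachable", 1, False),
]

for label, nodes_up, witness in scenarios:
    print(f"{label}: quorum = {has_quorum(nodes_up, 2, witness)}")
```

If that model is right, the surviving node should only lose quorum if it lost sight of both the other node and the witness at the same time, which is what events 1564 and 1172 seem to suggest happened.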