Channel: High Availability (Clustering) forum

Event ID 1073 The Cluster service was halted to prevent an inconsistency within the failover cluster. The error code was '668'.


Hi everyone!

There is a 5-node SQL Server 2012 failover cluster based on Windows Server 2012 Datacenter, built on an IBM BladeCenter HS23 (type 7875). The cluster nodes use SAN boot from an IBM Storwize V3700 and LUNs from an IBM Storwize V7000.
Periodically, on different nodes of the cluster, Event ID 1073 appears ("The Cluster service was halted to prevent an inconsistency within the failover cluster. The error code was '668'."), along with Event ID 7031 ("The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service") and Event ID 7024 ("The Cluster Service service terminated with the following service-specific error: An assertion failure has occurred."). After these errors appear, the cluster node hangs in the "Joining" state, the same happens to any node that is rebooted or turned off, and every operation I try to perform on the cluster (stopping the cluster service, pause, evict, etc.) fails. The cluster returns to a normal state only after all of its nodes are rebooted. Here is the piece of the cluster log from the time the error occurred:

00000b4c.00000c7c::2014/04/21-03:32:25.939 INFO  [VSS] Backing up part of the system state [VSS] OnPrepareBackup: starting new session dfb4fbf0-db28-40d2-af3a-82e66a271267
00000b4c.00000c7c::2014/04/21-03:32:25.939 INFO  [VSS] OnPrepareBackup returning - true
00000b4c.00001194::2014/04/21-03:32:26.704 INFO  [GUM] Node 7: Processing RequestLock 4:4744
00000b4c.00001198::2014/04/21-03:32:26.704 INFO  [GUM] Node 7: Processing GrantLock to 4 (sent by 3 gumid: 11271)
00000b4c.00000e2c::2014/04/21-03:32:26.704 ERR   mscs::GumAgent::ExecuteQueuedUpdate: TransactionInProgress(5918)' because of 'Cannot restart an in-progress transaction'
00000b4c.00001194::2014/04/21-03:32:26.719 ERR   Failed type check .?AUBoxedNodeSet@mscs@@
00000b4c.00001194::2014/04/21-03:32:26.719 ERR   [CORE] mscs::ClusterCore::DeliverMessage: TypeMismatch(1629)' because of 'failed type check'
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleBackupGum - Initiating the backup
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleOnFreezeGum - Stopping the Death Timer
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleBackupGum - Completed the backup Request
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   [GUM] Node 7: sequenceNumber + 1 == payload->GumId (5129, 11272)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   mscs::GumAgent::ExecuteQueuedUpdate: AssertionFailed(668)' because of 'failed assertion'(sequenceNumber + 1 == payload->GumId is false)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   GumHandler failed (status = 668)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   GumHandler failed (status = 668), executing OnStop
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [DM]: Shutting down, so unloading the cluster database.
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [DM] Shutting down, so unloading the cluster database (waitForLock: false).
00000b4c.00000e2c::2014/04/21-03:32:26.813 ERR   FatalError is Calling Exit Process.
00000b4c.00000b50::2014/04/21-03:32:26.813 INFO  [CS] About to exit process...
000015d0.000015d4::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
00001618.0000161c::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
00001588.0000158c::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
000015f4.000015f8::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated. 

All of the recommended failover cluster updates and hotfixes are installed, and the cluster is validated.
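
Since the assertion fires on whichever node happens to process the stale GUM update, it can help to pull the cluster log from every node for the same time window and compare the GUM sequence numbers (gumid) around the failure. A minimal sketch, assuming the FailoverClusters PowerShell module and an existing C:\Temp\ClusterLogs folder (both names illustrative):

# Generate the last 30 minutes of cluster log from every node into one folder
Import-Module FailoverClusters
Get-ClusterLog -Destination "C:\Temp\ClusterLogs" -TimeSpan 30
# Find the failing assertion and the surrounding GUM activity in each node's log
Select-String -Path "C:\Temp\ClusterLogs\*.log" -Pattern "AssertionFailed|GumHandler failed|Processing GrantLock"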


Issue with storage while adding node to cluster


Greetings all,

I started with a 2-node Windows 2012 failover cluster consisting of node1 and node2. To migrate to Windows Server 2012 R2, I created a node3 and installed it as a single-node failover cluster. I used the Copy Cluster Roles wizard to migrate some VMs and their corresponding CSVs to the new cluster, and things work great. Next I moved all remaining VMs to node1 on the original cluster and evicted node2 from the original cluster. So far so good. I then installed a clean copy of Server 2012 R2 onto node2 and attempted to add it to the new cluster, and that's when I ran into trouble. It seems as though my shared storage on a Dell MD3000 (not supported, I know) is not being seen correctly, as the following is what I get when validating storage:

NODE2.adataconcepts.local
Row Disk Number Disk Signature VPD Page 83h Identifier VPD Serial Number Model Bus Type Stack Type SCSI Address Adapter Eligible for Validation Disk Characteristics
0 0 2b4cd3d6 19EBF3B700000000001517FFFF0AEB84 BOOT Intel Raid 1 Volume RAID Stor Port 1:0:1:0 Intel(R) Desktop/Workstation/Server Express Chipset SATA RAID Controller False Disk is a boot volume. Disk is a system volume. Disk is used for paging files. Disk is used for memory dump files. Disk bus type does not support clustering. Disk is on the system bus. Disk partition style is MBR. Disk type is BASIC. 
1 1 e4f31a08 60019B9000B68362000064BD5387F98E 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:0 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
2 2 5be62253 60019B9000B683620000638651D2EB2D 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:6 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
3 3 5be62245 60019B9000B6933F00000B9351D2E686 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:7 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
4 4 c1100e74 60019B9000B6933F00000BE05363B77F 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:11 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
5 5 6420dd00 60019B9000B68362000064395363C60A 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:12 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
6 6 2cf2c0bc 60019B9000B6933F00000BFF536C698E 71K002P  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:15:13 Microsoft Multi-Path Bus Driver True Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 

NODE3.adataconcepts.local
Row Disk Number Disk Signature VPD Page 83h Identifier VPD Serial Number Model Bus Type Stack Type SCSI Address Adapter Eligible for Validation Disk Characteristics
0 0 eb58d775 759EAC7A01000000001517FFFF0AEB84 Boot Intel Raid 1 Volume RAID Stor Port 1:0:1:0 Intel(R) Desktop/Workstation/Server Express Chipset SATA RAID Controller False Disk is a boot volume. Disk is a system volume. Disk is used for paging files. Disk is used for memory dump files. Disk bus type does not support clustering. Disk is on the system bus. Disk partition style is MBR. Disk type is BASIC. 
1 1 e4f31a08 60019B9000B68362000064BD5387F98E 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:0 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
2 2 5be62253 60019B9000B683620000638651D2EB2D 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:6 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
3 3 5be62245 60019B9000B6933F00000B9351D2E686 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:7 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
4 4 c1100e74 60019B9000B6933F00000BE05363B77F 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:11 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
5 5 6420dd00 60019B9000B68362000064395363C60A 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:12 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 
6 6 2cf2c0bc 60019B9000B6933F00000BFF536C698E 71K003O  DELL MD3000 Multi-Path Disk Device SAS Stor Port 0:0:2:13 Microsoft Multi-Path Bus Driver True Disk is already clustered. Disk partition style is MBR. Disk type is BASIC. Disk uses Microsoft Multipath I/O (MPIO). 

List Disks To Be Validated
Description: List disks that will be validated for cluster compatibility.
Start: 5/30/2014 5:07:08 PM.
Physical disk e4f31a08 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
Physical disk 2cf2c0bc is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk 6420dd00 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk c1100e74 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk 5be62245 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk 5be62253 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk e4f31a08 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE3.adataconcepts.local
Physical disk 2cf2c0bc is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
Physical disk 6420dd00 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
Physical disk c1100e74 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
Physical disk 5be62245 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
Physical disk 5be62253 is visible from only one node and will not be tested. Validation requires that the disk be visible from at least two nodes. The disk is reported as visible at node: NODE2.adataconcepts.local
No disks were found on which to perform cluster validation tests. To correct this, review the following possible causes:
* The disks are already clustered and currently Online in the cluster. When testing a working cluster, ensure that the disks that you want to test are Offline in the cluster.
* The disks are unsuitable for clustering. Boot volumes, system volumes, disks used for paging or dump files, etc., are examples of disks unsuitable for clustering.
* Review the "List Disks" test. Ensure that the disks you want to test are unmasked, that is, your masking or zoning does not prevent access to the disks. If the disks seem to be unmasked or zoned correctly but could not be tested, try restarting the servers before running the validation tests again.
* The cluster does not use shared storage. A cluster must use a hardware solution based either on shared storage or on replication between nodes. If your solution is based on replication between nodes, you do not need to rerun Storage tests. Instead, work with the provider of your replication solution to ensure that replicated copies of the cluster configuration database can be maintained across the nodes.
* The disks are Online in the cluster and are in maintenance mode.
No disks were found on which to perform cluster validation tests.

Can anyone shed some light on why, even though the disk signatures are the same on both nodes, the two nodes don't seem to acknowledge that they are in fact looking at the same disks? Any help is greatly appreciated!

Regards,

Scott
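
One detail that stands out in the report above: the "VPD Serial Number" column differs between the nodes (71K002P on NODE2 vs 71K003O on NODE3) even though the Page 83h identifiers and disk signatures match, so it may be worth confirming that the MD3000 host mappings place both hosts in the same host group. A hedged sketch for comparing what each node reports outside the validation wizard (node names taken from the post; requires the Storage cmdlets on 2012+):

# Compare disk identity as seen by each node
$nodes = "NODE2.adataconcepts.local", "NODE3.adataconcepts.local"
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-Disk | Where-Object BusType -eq "SAS" |
        Select-Object @{n="Node";e={$env:COMPUTERNAME}}, Number, Signature, UniqueId, SerialNumber
} | Sort-Object UniqueId | Format-Table -AutoSize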


Cluster Name Resource not coming ONLINE


Hello,

I have recently deployed a 2-node cluster based on Windows Server 2012 R2. The CNO and its corresponding IP address are online, but the Microsoft DTC role's network name is not coming online.

The error and warning log entries are pasted below for your reference:

00000bc0.0000109c::2014/06/01-19:17:07.001 WARN  [RES] Network Name: [NNLIB] LogonUserCall fails for user xxxx$: (useSecondaryPassword: 0), password length is 0
00000bc0.0000109c::2014/06/01-19:17:07.064 WARN  [RES] Network Name: [NNLIB] LogonUserEx fails for user xxxx$: 1326 (useSecondaryPassword: 1)
00000bc0.00000c20::2014/06/01-19:17:07.065 WARN  [RES] Network Name <Cluster Name>: Identity: Get Token Request, currently doesnt have a token!
00000bc0.000011c4::2014/06/01-19:17:07.065 WARN  [RES] Network Name <xxx>: AccountAD: Slow operation has exception (6)' because of '::ImpersonateLoggedOnUser( GetToken() )'

00000bc0.00001330::2014/06/01-19:17:07.071 ERR   [RES] Network Name <xxx>: Online thread Failed: (0)' because of 'Initializing netname configuration for xxx failed with error 6.'
00000bc0.00001330::2014/06/01-19:17:07.071 ERR   [RHS] Online for resource xxx failed.
000006b4.000004d8::2014/06/01-19:17:07.071 WARN  [RCM] HandleMonitorReply: ONLINERESOURCE for 'xxx', gen(58) result 5018/0.
000006b4.000004d8::2014/06/01-19:17:07.071 ERR   [RCM] rcm::RcmResource::HandleFailure: (xxxx)
00000bc0.000011c4::2014/06/01-19:17:07.072 ERR   [RES] Network Name <xxxx>: AdminShare: OnCloseBase, Error Already Closing, previous state: Closing/Ending
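
The 1326 ("unknown user name or bad password") logon failures above suggest the computer account's password for the name resource is out of sync with Active Directory; the usual fix is the Repair action on the cluster name in Failover Cluster Manager (More Actions -> Repair), run as a domain user with rights to reset the computer object's password. A sketch for inspecting the pieces first ("xxx" is left as the placeholder from the log; Get-ADComputer assumes the ActiveDirectory module is installed):

# State of all network name resources in the cluster
Get-ClusterResource | Where-Object ResourceType -eq "Network Name" |
    Format-Table Name, State, OwnerGroup, OwnerNode -AutoSize
# Check the matching computer object and when its password last changed
Get-ADComputer -Identity "xxx" -Properties PasswordLastSet |
    Select-Object Name, Enabled, PasswordLastSet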

Licensing Windows Server in a Cluster (Passive Node)

Do I need to purchase a Windows Server license for passive nodes in a failover cluster? Passive means it sits behind a VIP, but no users access the passive node as long as the active node is up and running. Both nodes are identical (running the same version and edition of Windows Server). Please tell me if this differs between versions 2003 and 2008, or between the Enterprise, Datacenter, and Standard editions.

Failover cluster manager


Hi Guys,

I am getting the below error message while validating the configuration in Failover Cluster Manager.


I have checked the services, and the Cluster service is started. I have rebooted both cluster servers but am still getting the same error message. Any further help will be appreciated.
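
Without the exact message text, one generic way to capture the failing test is to run validation from PowerShell and read the saved report. A minimal sketch (substitute your node names):

# Run full validation; the detailed HTML report lands in C:\Windows\Cluster\Reports
Import-Module FailoverClusters
Test-Cluster -Node "ClusterNode1", "ClusterNode2"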


Thanks

2012 R2 CSV "Online (No Access)" after node joins cluster


Okay, this has been going on for months, after we performed an upgrade on our 2008 R2 clusters. We upgraded our development cluster from 2008 R2 to 2012 (SP1) and had no issues, saw great performance increases, and decided to do our production clusters. At the time, 2012 R2 was becoming prominent, and we decided to hop straight over 2012, thinking the changes in this version weren't that drastic. We were wrong.

The cluster works perfectly as long as all nodes stay up and online. Live migration works great, roles (including disks) flip between machines based on load just fine, etc. When a node reboots or the cluster service restarts, as the node goes from "Down" to "Joining" and then "Online", the CSV(s) switch from Online to Online (No Access) and the mount point disappears. If you move the CSV(s) to the node that just rejoined the cluster, the mount point returns and the volume goes back to Online.

Cluster validation passes with flying colors, and Microsoft has been able to provide no help whatsoever. We have two types of FC storage, one that is being retired and one that we are switching all production machines to, and it does this with both storage units, one Sun and one Hitachi. Since we are moving to the Hitachi, we verified that the firmware is up to date (it is), our drivers are current (they are), and that the unit is fully functional (everything checks out). This did not happen before 2012 R2, and we have proven it by reverting our development cluster to 2012. We have started using features that come with 2012 R2 on our other clusters, so we would like to figure this problem out to continue using this platform.

Cluster logs show absolutely no diagnostic information that's of any help.  The normal error message is:

Cluster Shared Volume 'Volume3' ('VM Data') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

Per Microsoft, our Hitachi system with 2012 R2 and MPIO (we have two paths) is certified for use. This is happening on all three of our clusters (two production and one development). They mostly have the same setup, but we are not sure what could be causing this at this point.
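
When a CSV flips to Online (No Access) on the rejoined node, it can help to capture how each node is reaching the volume at that moment. A sketch using Get-ClusterSharedVolumeState (available in 2012 R2), which reports per node whether I/O is Direct, FileSystemRedirected, or Blocked:

# Per-node CSV state; run while a CSV shows Online (No Access)
Get-ClusterSharedVolumeState |
    Format-Table VolumeFriendlyName, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason -AutoSize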

CSV performance issue with SAS disks

We've been testing Microsoft's famous solution: storage pools, SAS, and finally CSV, for cluster IOPS.

We have SAS disks connected through JBODs and SAS HBA cards.

Before creating the CSV, I used SQLIO and got about 150,000 IOPS, which is acceptable (read-only, 4 KB, random).

After creating the CSV volume, on the coordinator node I am getting 100,000 (50,000 less right away).

On a non-coordinator node, I am getting only 25,000 IOPS.

If I change the owner (making that node the coordinator), I get 100,000 IOPS again.

That is a huge difference. We don't get the IOPS we paid for; we get 25 percent of them. Any thoughts?
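
The coordinator/non-coordinator gap usually points at redirected I/O rather than raw disk speed: anything not going Direct is funnelled through the coordinator over the cluster network. Assuming this is 2012 R2, a quick check of the I/O mode per node:

# StateInfo should read Direct on every node for block-level CSV access
Get-ClusterSharedVolumeState |
    Format-Table VolumeFriendlyName, Node, StateInfo, FileSystemRedirectedIOReason -AutoSize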

Windows Failover Cluster (Errors retrieving file shares)


I'm having an issue with Windows Failover Cluster on a Windows Server 2012 R2 machine. I have two cluster nodes (nodeA and nodeB). My issue is that when nodeA is the owner node and I open Failover Cluster Manager and go to <clusterName> >> Roles >> <fileserver role> >> Shares tab, it hangs and says it is loading, indefinitely. However, when I go to nodeB (not the owner node) and open the Shares tab, it shows me all of the shares that I have. And when I go to <clusterName> >> Nodes and click on the Roles tab, the information says "There were errors retrieving file shares."

When I switch nodeB to be the owner node, I cannot view the shares on that machine but can now view them on nodeA.

We also have a test network where I have recreated the machines, the environment, and the failover cluster as close to the production network as I can, except everything works great in the test network.
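
A hedged first check is whether the nodes can reach each other over WinRM, which Failover Cluster Manager appears to rely on for share enumeration, and whether SMB itself returns the scoped shares; "FSROLE" below stands in for your file server role's client access point:

# WinRM reachability between the nodes
Test-WSMan -ComputerName nodeA
Test-WSMan -ComputerName nodeB
# Enumerate the shares scoped to the file server role directly over CIM
Get-SmbShare -CimSession (New-CimSession -ComputerName "FSROLE") |
    Format-Table Name, Path, ScopeName -AutoSize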


How to remove two nodes from an old cluster and put them in a new one, in another domain


Hi there!
I have two servers (Server1 and Server2) in a cluster, with settings like below:

domain name: london.local
Server1 -> OS: Windows Server 2012
Server2 -> OS: Windows Server 2012

They are part of a two-node cluster:
cluster.london.local

Now I want to remove these two servers from the cluster, remove them from the current domain, and join them to a new domain, paris.local.

After that, I want to create a cluster between the two servers in the paris.local domain.

What is the best way to remove the nodes from the current cluster?

Regards!


Lasandro Lopez
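
A sketch of the teardown with the FailoverClusters cmdlets, assuming all roles have been drained or removed first:

# Evict the second node
Remove-ClusterNode -Cluster "cluster.london.local" -Name "Server2"
# With one node left, destroy the cluster and clean up its AD objects
Remove-Cluster -Cluster "cluster.london.local" -CleanupAD
# Then unjoin both servers from london.local, join them to paris.local,
# and create the new cluster there:
New-Cluster -Name "cluster" -Node "Server1", "Server2"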

Windows Server 2012 R2 clustering issue with disks


We have a two-node cluster.

We have volumes configured on both cluster nodes. We are using EVA SAN storage.

Data (C:\ClusterStorage\Volume1)

Logs (C:\ClusterStorage\Volume2)

I am able to move the clustered disk drives from owner node A to owner node B, but I cannot see the clustered drives on both the active and passive nodes.

I am also not able to view the volumes on node B in Windows Explorer. Can someone please tell me how to view the volumes once we move the drives between nodes on the cluster?

Thank you


lucky
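
For what it's worth, CSV volumes never appear as lettered drives in Windows Explorer: on every node, owner or not, they are reachable only through the C:\ClusterStorage mount points. A small sketch to confirm what each node should see:

# List each CSV, its current owner, and its mount path
Get-ClusterSharedVolume | Select-Object Name, OwnerNode,
    @{n="Path";e={$_.SharedVolumeInfo.FriendlyVolumeName}}
# The same path is valid from any node, regardless of which node owns the disk
dir C:\ClusterStorage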




Best Approach to Upgrade a 2008 R2/2012 Cluster to a Windows 2012 R2 Cluster


Hi Team,

I want to consolidate my Windows Hyper-V servers and cluster into one single Windows 2012 R2 cluster.

I have the below environment:

  1. Three Windows Server 2012 clustered Hyper-V hosts.
  2. Three Windows Server 2008 R2 standalone Hyper-V hosts. Most of the virtual machines are running with RDM disks.

Can anyone suggest the best approach to upgrade all three Windows 2008 R2 standalone hosts to 2012 R2 and then consolidate them into a single cluster along with the Windows 2012 nodes?

Can I convert the RDM disks into iSCSI volumes?

Thanks

Ravindra


Ravi
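
On the RDM question: there is no in-place conversion from a pass-through disk to a VHDX or iSCSI volume; either way the data has to be copied. A rough sketch of the VHDX route (all paths, sizes, and drive letters here are illustrative only; requires the Hyper-V and Storage modules):

# Create and prepare a new dynamic VHDX
New-VHD -Path "D:\VHDs\data01.vhdx" -SizeBytes 500GB -Dynamic
Mount-VHD -Path "D:\VHDs\data01.vhdx" -Passthru | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -UseMaximumSize -AssignDriveLetter |
    Format-Volume -FileSystem NTFS -Confirm:$false
# Copy the data from the old pass-through volume (E:) to the new VHDX volume (F:)
robocopy E:\ F:\ /MIR /COPYALL
Dismount-VHD -Path "D:\VHDs\data01.vhdx"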

Quorum disk purpose


A Windows cluster employs a "quorum" disk, which is essentially a log file used to record any changes made on the active node (so they can be pushed to the passive node if required). But I have also read that the quorum can cast a vote to determine whether the cluster remains running. A log file casting a vote? Can you please clarify this?

TIA,

edm2
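
Roughly: the witness disk is more than a log file. It holds a copy of the cluster database, and, like each node, it contributes one vote toward quorum; with two nodes plus a disk witness there are three votes, so the cluster keeps running as long as two votes remain. To see the quorum model and each node's vote:

# Current quorum configuration and per-node votes (DynamicWeight is 2012 R2)
Get-ClusterQuorum
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize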

Hyper-V Guest Cluster Node Failing Regularly


Hi,

We currently have a 4-node Server 2012 R2 cluster which hosts, among other things, a 3-node guest cluster running a single clustered file service.

Around once a week, the guest cluster node that is currently hosting the clustered file service will fail. It's as if the VM is blue screening. That in itself is fairly annoying, and I'll be doing all the updates and checking the event log for clues as to the cause.

The problem then is that whichever physical cluster node is hosting the VM when it fails will not unlock some of the VM's files. The Virtual Machine Configuration resource lists as Online Pending. This means the failed VM cannot be restarted on any other cluster node. The only fix is to drain the physical host it failed on and reboot it.

Looking for suggestions on how to fix the following:

1. Crashing guest file cluster node

2. Failed VM with shared VHDX requiring a physical host reboot

Event messages from the physical host that was hosting the failed VM, in the order they occurred:

  • Hyper-V-Worker: Event ID 18590 - 'FS-03' has encountered a fatal error.  The guest operating system reported that it failed with the following error codes: ErrorCode0: 0x9E, ErrorCode1: 0x6C2A17C0, ErrorCode2: 0x3C, ErrorCode3: 0xA, ErrorCode4: 0x0.  If the problem persists, contact Product Support for the guest operating system.  (Virtual machine ID 36166B47-D003-4E51-AFB5-7B967A3EFD2D)
  • FailoverClustering: Event ID 1069 - Cluster resource 'Virtual Machine FS-03' of type 'Virtual Machine' in clustered role 'FS-03' failed.
  • Hyper-V-High-Availability: Event ID 21128 - 'Virtual Machine FS-03' failed to shutdown the virtual machine during the resource termination. The virtual machine will be forcefully stopped.
  • Hyper-V-High-Availability: Event ID 21110 - 'Virtual Machine FS-03' failed to terminate.
  • Hyper-V-VMMS: Event ID 20108 - The Virtual Machine Management Service failed to start the virtual machine '36166B47-D003-4E51-AFB5-7B967A3EFD2D': The group or resource is not in the correct state to perform the requested operation. (0x8007139F).
  • Hyper-V-High-Availability: Event ID 21107 - 'Virtual Machine FS-03' failed to start.
  • FailoverClustering: Event ID 1205 - The Cluster service failed to bring clustered role 'FS-03' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.
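
Two hedged observations. Bugcheck 0x9E in the guest is USER_MODE_HEALTH_MONITOR, meaning the guest's own cluster service declared the node unhealthy, so the guest cluster's log (Get-ClusterLog run inside the VMs) is the place to chase cause 1. For cause 2, a sketch for seeing which resources are stuck while the VM sits in Online Pending ('FS-03' is the role name from the events above):

# Resources of the failed role and the node still holding them
Get-ClusterResource | Where-Object OwnerGroup -eq "FS-03" |
    Format-Table Name, ResourceType, State, OwnerNode -AutoSize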

Add Node to Hyper-V Cluster running Server 2012 R2


Hi All,

I am in the process of upgrading our Hyper-V cluster to Server 2012 R2, but I am not sure about the required validation test.

The situation at the moment: a 1-node cluster running Server 2012 R2 with 2 CSVs and a quorum disk, plus an additional server prepared to add to the cluster. One CSV is empty and could be used for the validation test; on the other CSV, 10 VMs are running in production. When I start the Validation wizard I can select specific CSVs to test, which makes sense ;-) But the warning message is not clear to me: "TO AVOID ROLE FAILURES, IT IS RECOMMENDED THAT ALL ROLES USING CLUSTER SHARED VOLUMES BE STOPPED BEFORE THE STORAGE IS VALIDATED". Does it mean that ALL CSVs will be tested and switched offline during the test, or just the CSV I have selected in the options? I definitely have to avoid the CSV where all the VMs are running being switched offline, and likewise any corruption of the configuration from losing that CSV.

Can someone confirm that ONLY the selected CSV will be used for the validation test?

Many thanks

Markus
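
I can't vouch for the wizard's exact behavior here, but validation can be scoped explicitly from PowerShell so the production CSV is never touched; a sketch (confirm the spare disk's resource name with Get-ClusterResource first):

# Everything except storage: touches no CSV at all
Test-Cluster -Node "Node1", "Node2" -Include "Inventory", "Network", "System Configuration"
# Storage tests restricted to one named disk only (assumes the -Disk parameter as in 2012 R2)
Test-Cluster -Node "Node1", "Node2" -Disk "Cluster Disk 2" -Include "Storage"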

ARP Storm After NLB Creation on 2012 R2 Cluster


I have a customer with an issue when creating an NLB cluster in multicast mode between two guests on a Hyper-V 2012 R2 cluster. The cluster is on a C7000 chassis with BL460c Gen8 servers and VC FlexFabric 10Gb/24-port modules (4.10 firmware).

MAC spoofing is enabled on both of the guests. When the static ARP entry is created on the core switch, we see an ARP storm on the core switch, causing poor performance on the VLAN in question.

Can anyone provide any advice on how we can get around this issue?


Please don't forget to mark posts as helpful or answers.
Inframon Blogs | All Things ConfigMgr


Pre-purchase advice on Dell VRTX cluster config


Hi, I have read quite a few posts on this box, and the more I read the more confused I am getting.

The required end result is a SQL 2012 Standard AlwaysOn 2-node cluster, a highly available general file and profile share, and a 2012 Hyper-V server.

Am I correct in thinking the following...

2 blades (4 procs, 8 cores, 32 GB RAM), each running Server 2012 Standard with Hyper-V enabled and two Server 2012 VMs: VM1 (2 cores) for the file/profile share and VM2 (4 cores) for SQL, all installed onto the blade HDD/SSD. 85 users; business critical for SQL and the file/profile share.

1 blade (2 procs, 12 cores, 32 GB RAM), Server 2012 Enterprise with Hyper-V enabled. This will run assorted Linux and Windows VMs; low-usage, non-business-critical machines.

1 blade (1 proc, 4 cores, 16 GB RAM), Server 2012 Standard, as a Scale-Out File Server using 4 of the shared drives in one RAID 10 LUN (15k drives), serving SMB 3.x for SQL, plus 12 shared drives in a separate RAID 50 LUN (7.2k NL) for the file/profile share VMs and Hyper-V. This leaves HDDs free for global spares and future expansion.

My questions are:

Is this a viable configuration, or am I looking at the box the wrong way by having a single blade controlling the LUNs? Should I be looking at putting Server 2012 Enterprise on 2 blades, upping the memory to 64 GB on them, and running the other 2 blades at a base level as a file cluster? SQL speed is the most important factor for us; we are about 50/50 read/write and use FILESTREAM to store data. This will not change, and writes are likely to increase going forward.

If I go for the 16-port internal switch using all M502P blades, will I gain a benefit from using Intel SFP+ cards for the 2 SQL blades? I have a Netgear GS752TXS stack that supports DA connections.

At the moment our SQL, file share, and profiles are all on separate physical boxes. We have moved our desktops to VDI on VMware/Citrix; expanding that with another EqualLogic and a couple of nodes would cost considerably more than a VRTX. I want to remove the remaining single points of failure. I accept the VRTX is still a single box, but it will replace a T510, a T610, and a PE2950, all out of warranty.

Many thanks for any suggestions or advice.

Kane.

Error while adding file share for File Server role


I'm getting this error when trying to add a file share on a Server 2012 R2 failover cluster:

"Unable to retrieve all data needed to run the wizard. Error details: Cannot retrieve information from server. Error occurred during enumeration of SMB shares: WinRM cannot complete the operation. Verify that the specified computer name is valid, that the computer is accessible over the network, and that a firewall exception for the WinRM service is enabled and allows access from this computer. By default, the WinRM firewall exception for public profiles limits access to remote comptuers within the same local subnet."

Despite the message, I can still create the share and it seems to work, but I'm concerned there may be an issue and I'd like to resolve it before putting the service into production. TIA!
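
A few checks matching the error text, run on the node where the wizard is launched ("OTHERNODE" is a placeholder for the partner node or the file server's access point):

# Is WinRM answering remotely?
Test-WSMan -ComputerName "OTHERNODE"
# Are the WinRM firewall rules enabled, and for which profiles?
Get-NetFirewallRule -DisplayGroup "Windows Remote Management" |
    Format-Table DisplayName, Profile, Enabled -AutoSize
# If a cluster network is classified as Public, the default WinRM rule only
# allows the local subnet, which is exactly what the message warns about.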

Node and Disk Majority


Dear All,
Is there a way to configure "Node and Disk Majority" in a Windows Server 2012 R2 failover cluster? If so, how?

Thanks in advance.
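
Yes. In the 2012 R2 wizard this is done by configuring a disk witness (the "Node and Disk Majority" wording is gone from the UI, but the PowerShell parameter remains); a sketch assuming the witness disk resource is named "Cluster Disk 1":

# Set node-and-disk-majority quorum with the named disk as witness
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"
# Verify the result
Get-ClusterQuorum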

Cannot browse Internet for Microsoft NLB in VMWare


Hi,

I have set up two Windows 2012 R2 virtual web servers that sit in a Microsoft Network Load Balancing cluster. The cluster is configured in multicast mode.

At the firewall level, the cluster IP is NATed to a public IP, which allows Internet access. I can ping either server, or any other server on the same subnet. However, I can't ping Google DNS 8.8.8.8 from either of the web servers.

Server 1: 192.168.5.2

Server 2: 192.168.5.3

Cluster IP: 192.168.5.4 NAT to a public IP

Has anyone encountered this before?

Kindly advise.

Thanks,

Shawn

Unable to move SQL instance to another node


Hi,

I'm unable to move a SQL instance from node A to node B. When I checked the cluster.log, this is the only error I see just before it failed. Can someone help me fix this?

- This was working earlier; it's not a new cluster, and no configuration changes have been made.

000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] ODBC sqldriverconnect failed
000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = ffffffff; message = [Microsoft][SQL Server Native Client 10.0]SQL Server Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF].
000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] ODBC sqldriverconnect failed
000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] checkODBCConnectError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Server Native Client 10.0]Login timeout expired
000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] ODBC sqldriverconnect failed
000015fc.000035d4::2014/06/04-07:27:05.204 ERR   [RES] SQL Server <SQL Server (SQLSHR)>: [sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = ffffffff; message = [Microsoft][SQL Server Native Client 10.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.

After a lot of googling, I ensured that the SQL Browser service is running; however, I'm still unable to fail over. Please help!
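
The resource DLL brings SQL online by connecting through the SQL network name, and "Error Locating Server/Instance Specified" is the classic symptom of the SQL Browser / instance lookup failing on that node. A few hedged checks from node B ("SQLNETNAME" stands in for the real SQL network name):

# TCP reachability to the instance (only meaningful if it uses a static port)
Test-NetConnection -ComputerName "SQLNETNAME" -Port 1433
# SQL Browser answers on UDP 1434, which Test-NetConnection cannot probe;
# at least confirm the service is running on both nodes:
Get-Service -ComputerName "NodeA", "NodeB" -Name "SQLBrowser"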
