Channel: High Availability (Clustering) forum

Event ID 1073 The Cluster service was halted to prevent an inconsistency within the failover cluster. The error code was '668'.


Hi everyone!

We have a 5-node SQL Server 2012 failover cluster running Windows Server 2012 Datacenter on IBM BladeCenter HS23 (type 7875) blades. The cluster nodes SAN-boot from an IBM Storwize V3700 and use LUNs from an IBM Storwize V7000.
Periodically, on different nodes of the cluster, the following errors appear: Event ID 1073 "The Cluster service was halted to prevent an inconsistency within the failover cluster. The error code was '668'."; Event ID 7031 "The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service."; and Event ID 7024 "The Cluster Service service terminated with the following service-specific error: An assertion failure has occurred."
After these errors appear, the affected node hangs in the "Joining" state, and the same happens to every other node that is subsequently rebooted or powered off. Every operation I try to perform on the cluster (stopping the Cluster service, pause, evict, etc.) fails. The cluster returns to a normal state only after all of its nodes have been rebooted. Here is the piece of the cluster log from the time the error occurred:

00000b4c.00000c7c::2014/04/21-03:32:25.939 INFO  [VSS] Backing up part of the system state [VSS] OnPrepareBackup: starting new session dfb4fbf0-db28-40d2-af3a-82e66a271267
00000b4c.00000c7c::2014/04/21-03:32:25.939 INFO  [VSS] OnPrepareBackup returning - true
00000b4c.00001194::2014/04/21-03:32:26.704 INFO  [GUM] Node 7: Processing RequestLock 4:4744
00000b4c.00001198::2014/04/21-03:32:26.704 INFO  [GUM] Node 7: Processing GrantLock to 4 (sent by 3 gumid: 11271)
00000b4c.00000e2c::2014/04/21-03:32:26.704 ERR   mscs::GumAgent::ExecuteQueuedUpdate: TransactionInProgress(5918)' because of 'Cannot restart an in-progress transaction'
00000b4c.00001194::2014/04/21-03:32:26.719 ERR   Failed type check .?AUBoxedNodeSet@mscs@@
00000b4c.00001194::2014/04/21-03:32:26.719 ERR   [CORE] mscs::ClusterCore::DeliverMessage: TypeMismatch(1629)' because of 'failed type check'
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleBackupGum - Initiating the backup
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleOnFreezeGum - Stopping the Death Timer
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [VSS] HandleBackupGum - Completed the backup Request
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   [GUM] Node 7: sequenceNumber + 1 == payload->GumId (5129, 11272)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   mscs::GumAgent::ExecuteQueuedUpdate: AssertionFailed(668)' because of 'failed assertion'(sequenceNumber + 1 == payload->GumId is false)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   GumHandler failed (status = 668)
00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   GumHandler failed (status = 668), executing OnStop
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [DM]: Shutting down, so unloading the cluster database.
00000b4c.00000e2c::2014/04/21-03:32:26.750 INFO  [DM] Shutting down, so unloading the cluster database (waitForLock: false).
00000b4c.00000e2c::2014/04/21-03:32:26.813 ERR   FatalError is Calling Exit Process.
00000b4c.00000b50::2014/04/21-03:32:26.813 INFO  [CS] About to exit process...
000015d0.000015d4::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
00001618.0000161c::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
00001588.0000158c::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated.
000015f4.000015f8::2014/04/21-03:32:26.828 WARN  [RHS] Cluster service has terminated. 
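
For anyone who wants to sift a longer cluster log for the same pattern, here is a minimal Python sketch that keeps only the ERR entries. It assumes every line follows the layout visible in the excerpt above (hex process ID, hex thread ID, timestamp, level, message); the names LINE_RE, LogEntry, parse_line and errors_only are made up for this illustration and are not part of any Microsoft tool.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed layout, inferred only from the excerpt above:
#   <pid hex>.<tid hex>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    pid: int        # process ID, converted from hex
    tid: int        # thread ID, converted from hex
    timestamp: str  # kept as text, exactly as logged
    level: str      # INFO, WARN, ERR, ...
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LogEntry(
        int(m.group("pid"), 16),
        int(m.group("tid"), 16),
        m.group("ts"),
        m.group("level"),
        m.group("msg"),
    )


def errors_only(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield only the entries logged at ERR level, skipping unparseable lines."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level == "ERR":
            yield entry


if __name__ == "__main__":
    sample = [
        "00000b4c.00001198::2014/04/21-03:32:26.704 INFO  [GUM] Node 7: Processing GrantLock to 4 (sent by 3 gumid: 11271)",
        "00000b4c.00000e2c::2014/04/21-03:32:26.750 ERR   [GUM] Node 7: sequenceNumber + 1 == payload->GumId (5129, 11272)",
    ]
    for entry in errors_only(sample):
        print(entry.timestamp, hex(entry.tid), entry.message)
```

Grouping the resulting entries by thread ID makes it easier to see that the failed type check and the failed GumId assertion in the excerpt were logged by different threads within the same second.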

All of the recommended failover cluster updates and hotfixes are installed, and the cluster has been validated.

