server 2012 Hyper-V CSV volume reporting incorrect space

September 1, 2016, 9:24 am

I have a 3-node Server 2012 Hyper-V cluster with 10 CSVs.

Our main CSV (3.5TB, holding the majority of our VMs) seems to be increasing in used space even though no VMs have been added and nothing on it has changed.

A month ago this same CSV ran out of space, so all the VMs ground to a halt.

This was most unexpected because, as I said above, there have been no changes to the CSV or the VMs on it.
During this space outage I moved some VMs around and rebooted one of the nodes. Suddenly the CSV free space jumped to 1.7TB, which is what I expect, but it is strange that the free space was used up in the first place.

Now, a month later, the free space is decreasing again (down from 1.7TB to 970GB). Again, there have been no changes to the VMs on the CSV.

WinDirStat puts used space for the volume at 1.8TB, but the cluster volume's properties in Windows Explorer show 2.58TB used.

I do not know why this is.
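
To put numbers on the gap, here is a rough Python sketch of the arithmetic using only the figures quoted above. It assumes 1 TB = 1000 GB (the post does not say which convention the tools use), and the function names are made up for illustration. It only quantifies the discrepancy; it does not explain where the space went.

```python
# Figures quoted in the post, converted to GB.
# ASSUMPTION: 1 TB = 1000 GB; the tools may actually report binary units.
USED_PER_WINDIRSTAT_GB = 1800   # file-level scan
USED_PER_EXPLORER_GB = 2580     # volume properties
FREE_AFTER_REBOOT_GB = 1700
FREE_NOW_GB = 970


def unaccounted_gb(volume_used_gb, file_level_used_gb):
    """Space the volume reports as used that the file-level scan did not find."""
    return volume_used_gb - file_level_used_gb


def free_space_lost_gb(free_before_gb, free_after_gb):
    """How much free space disappeared between two readings."""
    return free_before_gb - free_after_gb


if __name__ == "__main__":
    print("Not visible to the file scan (GB):",
          unaccounted_gb(USED_PER_EXPLORER_GB, USED_PER_WINDIRSTAT_GB))
    print("Free space lost in a month (GB):",
          free_space_lost_gb(FREE_AFTER_REBOOT_GB, FREE_NOW_GB))
```

Under that assumption, the space the file scan cannot see is of the same order as the free space lost over the month.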

Does anyone have any ideas, other than rebooting the nodes to see if that fixes it?

thanks

↧

Cluster Service terminated by GUM Task

July 20, 2016, 12:21 pm

I've had an issue where one of my Windows 2012 R2 Hyper-V hosts just decided to keel over and die on me. The event which I'm seeing is as follows:

Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:          20.07.2016 20:39:19
Event ID:      5377
Task Category: Global Update Mgr
Level:         Error
Keywords:
User:          SYSTEM
Computer:      mgmt45.mgmt.local
Description:
An internal Cluster service operation exceeded the defined threshold of '110' seconds. The Cluster service has been terminated to recover. Service Control Manager will restart the Cluster service and the node will rejoin the cluster.
Event Xml:<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"><System><Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" /><EventID>5377</EventID><Version>0</Version><Level>2</Level><Task>6</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2016-07-20T18:39:19.244464800Z" /><EventRecordID>347017</EventRecordID><Correlation /><Execution ProcessID="4596" ThreadID="9184" /><Channel>System</Channel><Computer>mgmt45.mgmt.local</Computer><Security UserID="S-1-5-18" /></System><EventData><Data Name="OperationName">SynchronizeState</Data><Data Name="ThresholdTimeInSec">110</Data></EventData></Event>

I'm finding very little information about event 5377 on the Internet. Apart from the standard steps of checking for the latest Windows updates and rebooting, how can I prevent this from happening again? This crash took down 64 virtual machines.
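
For anyone collecting these events from several nodes, the fields that matter (operation name and threshold) can be pulled out of the event XML programmatically. Below is a minimal Python sketch using only the standard library. The XML string is a trimmed copy of the event above, and `summarize_event` is a hypothetical helper name, not part of any Microsoft tooling.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (taken from the event above).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event from the post; only the fields used below are kept.
EVENT_XML = (
    '<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">'
    '<System><Provider Name="Microsoft-Windows-FailoverClustering" />'
    '<EventID>5377</EventID>'
    '<TimeCreated SystemTime="2016-07-20T18:39:19.244464800Z" />'
    '<Computer>mgmt45.mgmt.local</Computer></System>'
    '<EventData><Data Name="OperationName">SynchronizeState</Data>'
    '<Data Name="ThresholdTimeInSec">110</Data></EventData></Event>'
)


def summarize_event(xml_text):
    """Return the key fields of a FailoverClustering event as a dict."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    # Collect the <Data Name="...">value</Data> pairs from EventData.
    data = {d.get("Name"): d.text
            for d in root.findall("e:EventData/e:Data", NS)}
    return {
        "event_id": int(system.findtext("e:EventID", namespaces=NS)),
        "computer": system.findtext("e:Computer", namespaces=NS),
        "time_utc": system.find("e:TimeCreated", NS).get("SystemTime"),
        "operation": data.get("OperationName"),
        "threshold_sec": int(data["ThresholdTimeInSec"]),
    }


if __name__ == "__main__":
    print(summarize_event(EVENT_XML))
```

Here the operation that exceeded the 110-second threshold is SynchronizeState, which is worth quoting when searching or opening a support case.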

↧

MPIO with Windows Failover Cluster

September 21, 2016, 11:22 am

Hi Team,

I am looking for some pointers on configuring MPIO for a Windows failover cluster with shared storage. Is it mandatory to install the MPIO feature at the Windows level, before or together with Windows Server Failover Clustering, or is it something that can be taken care of at the storage level? Any article that could help with configuring the Windows Server MPIO feature will be appreciated. Thanks

Regards,

↧

Validate SCSI-3 Persistent Reservation failed during Cluster validation

September 21, 2016, 11:19 am

Hi Team,

The 'Validate SCSI-3 Persistent Reservation' check failed during cluster validation.

The cluster nodes run on a VMware virtualization platform, and the shared storage is EMC storage mapped as pass-through LUNs directly to the cluster node VMs. Does anyone have pointers on what the issue could be? Is there something that needs to be corrected or checked in Windows Server, or is this more likely a VMware or storage issue? Any pointers will be appreciated. Thanks

Regards,

↧

Team up iScsi networks in windows 2012 R2 failover cluster

September 22, 2016, 10:45 pm

Hi Team,

I have a failover cluster with 3 nodes. We are using iSCSI initiators to connect to SAN storage. Each node has two iSCSI network adapters.

I need your suggestion: is it feasible or advisable to team the two iSCSI networks? Currently they are not teamed.

Also, cluster use for the iSCSI networks is set to None. Is that right?

Regards,

↧

Cannot add a second node!

September 7, 2016, 1:36 pm

I am running a two-node Hyper-V cluster. One of my nodes failed (crashed), so I reinstalled the server and set it up as it was before the crash occurred. I tried to re-add it, but it keeps failing with the following error messages:

On the server I tried to add:

[QUORUM] An attempt to form cluster failed due to insufficient quorum votes. Try starting additional cluster node(s) with current vote or as a last resort use Force Quorum option to start the cluster. Look below for quorum information,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR [QUORUM] To achieve quorum cluster needs at least 2 of quorum votes. There is only 1 quorum votes running
00000fcc.00000204::2016/09/07-16:20:48.958 ERR [QUORUM] List of running node(s) attempting to form cluster: VM01,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR [QUORUM] List of running node(s) with current vote: VM01,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR [QUORUM] Attempt to start some or all of the following down node(s) that have current vote: EXITINGVM, EXISTINGVM0,
00000fcc.00000204::2016/09/07-16:20:48.958 ERR join/form timeout (status = 258)

Any help from you will be appreciated and thanks in advance.
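
The quorum messages in the log can be read as simple majority arithmetic. The Python sketch below reproduces it under two assumptions that the log does not confirm: that the three node names listed are the only voters (no witness vote) and that quorum is a strict majority of votes. The function name is hypothetical.

```python
def votes_needed(total_votes):
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


# Voters as listed in the log excerpt above.
# ASSUMPTION: no disk or file share witness holds an additional vote.
running_voters = ["VM01"]
down_voters = ["EXITINGVM", "EXISTINGVM0"]

total_votes = len(running_voters) + len(down_voters)
has_quorum = len(running_voters) >= votes_needed(total_votes)

if __name__ == "__main__":
    print("Total votes:", total_votes)
    print("Votes needed:", votes_needed(total_votes))
    print("Votes running:", len(running_voters))
    print("Quorum reached:", has_quorum)
```

With three votes the cluster needs two, which matches the "needs at least 2 of quorum votes" line: the rebuilt node alone cannot form the cluster while both other voters are reported as down.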

↧

Assigning Permission to Cluster

September 25, 2016, 11:30 am

Hi Team,

We have installed Windows Server Failover Clustering using an account that has domain admin rights.

Now that the failover cluster is created, we are handing cluster access over to the SQL admin. We have a SQL installation user account named 'SQL_admin'. The SQL admin logs into the node with the SQL_admin account but is not able to connect to the cluster.

The SQL_admin account has been added to the local Administrators group on each node in the cluster. I logged into one of the cluster nodes as a domain admin and tried to add the SQL_admin account under the cluster permissions, but could not do so and got an error (screenshot attached).

We need some pointers on how to give the SQL_admin account full access so that he can start installing SQL instances.

Any help would be highly appreciated. Thanks

Regards,

↧

Failover Clustering Check - Validate CSV Settings

September 25, 2016, 3:00 pm

Hello,

I have a lab environment.

While validating the failover cluster requirements, I get a warning for my CSV storage.

"Failure while setting up to run cluster shared volumes support testing on node cluster1. The password does not meet the password policy requirements. check the minimum password length, password complexity and password history requirements."

I updated the password policy in my domain to have a lower requirement. The thing is, I'm testing this in the lab before applying it to a live environment and want to minimize the errors in cluster validation, and I don't want to exempt the cluster objects from the domain policy just so they can have a lower password requirement.

Is there any way to make the cluster create a password that complies with my domain password policy?

For God, and Country.

↧

Windows 2012 R2 - Hyper-V. Shared CSV, iSCSI and SATA.

September 4, 2016, 3:19 pm

Hey Guys,

I would like to build a 2x Node Hyper-V cluster using a 3rd computer with 6x SATA HDDs as my "Shared Storage". I would like to be able to do things like live migration of one VM from one host to another etc.

I was planning on using the iSCSI Target software built into Windows 2012 R2, but I wanted to find out whether I can use that solution to create Cluster Shared Volumes (CSV) for my cluster.

Thanks,

Robert

↧

MPIO and moving the tempdb to a different disk

September 26, 2016, 9:18 pm

We need to move the tempdb database of our SQL Server cluster to a new partition on the same disk. I've found some instructions that discuss moving tempdb, but I'm trying to find out whether the change will cause any additional issues.

1. The disk resides on SAN storage, are there any additional steps required regarding MPIO?

2. Are there any steps that need to be carried out on the secondary (failover) cluster node?

(current steps identified)

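-- Point tempdb's data and log files at the new locations.
-- The new paths take effect the next time the instance starts.
USE master
GO
ALTER DATABASE tempdb MODIFY FILE
(NAME = tempdev, FILENAME = 'd:\datatempdb.mdf')
GO
ALTER DATABASE tempdb MODIFY FILE
(NAME = templog, FILENAME = 'e:\datatemplog.ldf')
GO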

stop the SQL server instance

move files to the new location

restart the instance

(Environment Details)

OS: Windows Server 2008 R2

Platform: SQL Server 2008 R2

Clustering: Windows Failover Cluster Manager

↧

Can we "suspend" a Hyper-V cluster while keeping all VM running

September 27, 2016, 6:47 am

We have multiple large Hyper-V clusters with hundreds of VMs, and the network people need to perform maintenance that will take about 10 minutes, with unpredictable network connectivity during that time. The only option I can find is to shut down or save all VMs. The cluster is already tuned to ride out the maximum allowed network failure of 45 seconds (SameSubnetThreshold=30 & SameSubnetDelay=1500). Is there a way to "suspend" the entire cluster so that the VMs continue running (all VHDX files are thick, so no VHDX expansion is needed)? The VMs will also lose network, but they would handle it the way a stand-alone server does when it loses network: it is up to the application.

I.e., temporarily configure it to take no action and ignore all failures? VMware appears to have a way to pause the entire cluster functionality (disable Host Monitoring) without affecting running VMs.
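
For reference, the 45-second figure follows from the two settings quoted above: SameSubnetDelay is the heartbeat interval in milliseconds and SameSubnetThreshold is the number of consecutive missed heartbeats tolerated. A small Python sketch of that arithmetic follows; the function name is made up for illustration.

```python
def heartbeat_tolerance_seconds(threshold, delay_ms):
    """Seconds of lost heartbeats tolerated before a node is considered down.

    threshold: number of consecutive missed heartbeats allowed
    delay_ms:  interval between heartbeats, in milliseconds
    """
    return threshold * delay_ms / 1000


# Values quoted in the post.
tolerance_s = heartbeat_tolerance_seconds(threshold=30, delay_ms=1500)

# Planned maintenance window of roughly 10 minutes, as stated in the post.
maintenance_s = 10 * 60

if __name__ == "__main__":
    print("Heartbeat tolerance (s):", tolerance_s)
    print("Maintenance window (s):", maintenance_s)
    print("Window exceeds tolerance:", maintenance_s > tolerance_s)
```

The maintenance window is more than ten times longer than what heartbeat tuning alone can absorb, which is why the question is about suspending cluster actions rather than raising the thresholds further.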

↧

Windows 2012 R2 DHCP cluster changing mode

September 27, 2016, 9:33 am

In a Windows 2012 R2 DHCP cluster in hot-standby mode, can you switch the roles between servers? I want the current standby server to become the primary, and the primary to become the standby, without losing the current zones.

↧

Problem with deleting custom resource (Other server) type in Windows 2012 R2

September 27, 2016, 8:44 am

I built a custom resource type DLL from the SDK sample (ClipbookServer.dll).

I registered the resource type on a 2-node cluster using the PowerShell cmdlet Add-CustomResourceType. After running Add-CustomResourceType, ClipbookServer.dll appears in the C:\Windows\Cluster directory on all cluster nodes.

But after running Remove-CustomResourceType, ClipbookServer.dll is not deleted from C:\Windows\Cluster.

How can I have the file deleted automatically from C:\Windows\Cluster on all cluster nodes?

↧

2012R2 SOFS with high Disk Response Times and Hyper-V VM peformance

September 27, 2016, 5:30 pm

We are running a 2012R2 cluster with SOFS serving up Hyper-V, on Dell hardware. In the event log under Application and Services Logs | Microsoft | Windows | SMBServer we're seeing hundreds of repeated SMBServer warning events with Event ID 1020:

File system operation has taken longer than expected.

Client Name: \\[fe80::d199:8860:d21:d7d7]
Client Address: [fe80::d199:8860:d21:d7d7%26]:49353
User Name: XXXXX\CLIUSR
Session ID: 0x80C0400000061
Share Name: \\*\b03a302b-1fdc-4c75-8c79-25d058749253-135266304$
File Name: SHARES\SW01DATAVOL2\ts15075a59.NNN.XXXXXX.com\Virtual Machines\C92B52CC-5739-4747-B6AE-CF4725B0505E\C92B52CC-5739-4747-B6AE-CF4725B0505E.vsv
Command: 11
Duration (in milliseconds): 208159633
Warning Threshold (in milliseconds): 120000

Guidance:

The underlying file system has taken too long to respond to an operation. This typically indicates a problem with the storage and not SMB.
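
To make the size of that duration easier to judge, here is a short Python sketch that converts the two millisecond values from the event into a readable duration and a ratio. Only the numbers quoted in the event are used; the variable names are illustrative.

```python
from datetime import timedelta

# Values copied from the SMBServer event above.
DURATION_MS = 208159633
WARNING_THRESHOLD_MS = 120000

duration = timedelta(milliseconds=DURATION_MS)
threshold = timedelta(milliseconds=WARNING_THRESHOLD_MS)

# How many times longer than the warning threshold the operation took.
ratio = DURATION_MS / WARNING_THRESHOLD_MS

if __name__ == "__main__":
    print("Operation duration:", duration)
    print("Warning threshold:", threshold)
    print("Duration / threshold: %.0fx" % ratio)
```

The reported duration works out to more than two days against a two-minute warning threshold, so this particular entry looks like an operation that was stuck for a long time rather than one that was merely slow.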

The Disk Response (ms) times are very high, as shown in the Task Manager Resource Monitor: currently in the 300-1000ms range. This is occurring on standalone 2012R2 Storage Spaces servers as well as on the clustered SOFS. Performance of the VMs is very bad, if we are able to log on at all. Most of the time the servers become inaccessible and kick the current user off the system. We previously saw this with 2012R2 clustered Storage Spaces in 2014. Is anyone else aware of this issue?

We had an MS ticket on it back in 2014 but dropped the case when it became too time-consuming, and returned the hardware. We could never get past the MS Tier 1 and Tier 2 engineers to get the ticket escalated. If I remember correctly, the issue had to do with disk cache flushing. My understanding is that MS created a patch to resolve the issue for another company, but since our ticket wasn't escalated, MS was unaware of our issue until later.

Thanks

Update to this: Back in 2014 when we went to a MS meeting the term was "excess disk cache flushes".

This blog https://blogs.msdn.microsoft.com/clustering/2014/06/05/cluster-shared-volume-performance-counters/

The perf counters for Cluster CSV File System Flushes. The values for the 4 volumes are

401,161 272,914 115,836 778,944

These seem very high, but I don't have a baseline to compare them against.

Dave Kreitel

↧

windows load balancing

September 27, 2016, 5:55 pm

Hi guys

Does Windows Server work as a load balancer, or does it only provide high availability?

I tried NLB for two web servers, but only one of the servers (the one with top priority) answers me, so is there no load balancing?

thanks

↧

Heartbeat Configuration

September 28, 2016, 8:44 am

For a Windows 2012 cluster, is it necessary to have a private network configuration (separate NIC, separate VLAN)? I see conflicting articles. What is Microsoft's stance now?

↧

Live migrations fail during drain from Cluster-Aware Updating

September 28, 2016, 8:19 am

We're trying to implement Cluster-Aware Updating, but we keep running into issues where virtual machine live migrations fail.

Our cluster(s) have plenty of memory allowing frequently for 2 nodes of a 6 node cluster to be completely devoid of roles. We've kicked off CAU, it patches and reboots the nodes with no roles and then moves onto the others. While attempting to drain one of the remaining nodes, it will kick off live migrations (no low priority roles). Since our max migration value is 2, we will continually get 21501 warnings as it works through the list. Towards the end, and only occasionally, the last few will fail with a 21502 due to not enough memory. This then hangs the drain until manual intervention.

21501
Live migration of 'SCVMM BRMWD-SPDEV02' failed.
Virtual machine migration operation for 'BRMWD-SPDEV02' failed at migration destination 'BRMWD-HYPV02'. (Virtual machine ID 2A4EC899-079C-4355-A503-F097FAF33E2B)
Failed to perform migration on virtual machine 'BRMWD-SPDEV02' because virtual machine migration limit '2' was reached, please wait for completion of an ongoing migration operation. (Virtual machine ID 2A4EC899-079C-4355-A503-F097FAF33E2B)

21502
Live migration of 'Virtual Machine BRMWT-FE01' failed.
Virtual machine migration operation for 'BRMWT-FE01' failed at migration destination 'BRMWD-HYPV02'. (Virtual machine ID 385026E5-7B2F-46EA-ADFE-EF854F76A4FE)
'BRMWT-FE01' could not initialize. (Virtual machine ID 385026E5-7B2F-46EA-ADFE-EF854F76A4FE)
Not enough memory in the system to start the virtual machine BRMWT-FE01 with ram size 2048 megabytes. (Virtual machine ID 385026E5-7B2F-46EA-ADFE-EF854F76A4FE)

I know we could likely just increase the number of live migrations to get around this or even assigning all VMs to preferred owners to keep the cluster more balanced. This is unfounded but it seems like when a CAU drain is initiated it is picking a static host to move all VMs to rather than using the best possible node on each migration.

Can someone confirm for me if this is accurate or if there is any way of changing this?

↧

Windows 2012 R2 Cluster Issue

September 28, 2016, 11:53 am

Dear All,

We are facing a cluster issue on a Windows 2012 R2 cluster. We have configured Windows guest clustering on VMware 5.5 and are getting the error below on one of the cluster disks.

"Cluster resource 'Cluster Disk 4' of type 'Physical Disk' in clustered role 'KMPRODDCTMCSSRV' failed. The error code was '0xaa' ('The requested resource is in use.').

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet."

Regards,

Hakim. B

Hakim.B Sr.System Administrator

↧

Duplicate Services in Cluster

September 9, 2016, 6:43 am

I have an issue where, if a service is started manually on a cluster node, the Cluster Service will not bring it down, and I end up with duplicate instances of the service running across cluster nodes. Is there no functionality in a cluster that checks that a service only runs on one node at a time?

↧

setting up Network cards for Failover Clustering

September 29, 2016, 12:13 pm

I’m hoping I can get some advice on setting up physical network cards in a failover cluster environment. We have 2 nodes that use SAS to connect to the storage controller. Each node has a 4-port 10G network adapter card. My plan is to set up a 2-node failover cluster, teaming NICs 1-2 and creating a converged network to be used for Live Migration, Host Management, and CSV. NICs 3-4 will be teamed for the virtual networks used by the guest VMs.

The other option is to team all 4 network cards and create converged networks for all 5 networks.

Please advise.

↧