Highly available VM has same drive letter on different nodes of failover cluster 2012.
Event ID 1212 - Cluster network name resource 'Cluster Name' cannot be brought online.
Hello,
We implemented a Windows Server 2012 Hyper-V cluster recently and all was working correctly. However, we decommissioned one of our 2003 domain controllers recently, and since then, whenever I try to move the core cluster resources to another node, the following error is displayed in the Event Viewer.
Cluster network name resource 'Cluster Name' cannot be brought online. Attempt to locate a writeable domain controller (in domain\\nameofdecomissionedomaincontroller.domainname.local) in order to create or update a computer object associated with the resource failed for the following reason:
The server is not operational.
The error code was '8250'. Ensure that a writeable domain controller is accessible to this node within the configured domain. Also ensure that the DNS server is running in order to resolve the name of the domain controller.
 But the core resource is brought online without any issue and the cluster is working correctly.
I searched for nameofdecomissionedomaincontroller.domainname.local in the registry; the only entry I found is below.
I guess this is where failover clustering caches this setting and tries to contact the demoted DC every time I move a resource. I have already tried restarting each cluster node and checked that the DC was decommissioned correctly.
Is it safe to edit the registry with an existing DC name? Any other solution is most welcome.
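For reference, the checks I'm running from the node before touching anything (a sketch; the domain name below is a placeholder):

# Confirm the node can locate a writable DC right now
nltest /dsgetdc:domainname.local /writable

# Reproduce on demand: moving the core cluster group is what triggers the event
Import-Module FailoverClusters
Move-ClusterGroup "Cluster Group"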
Irfan Goolab SALES ENGINEER (Microsoft UC) MCP, MCSA, MCTS, MCITP, MCT
Failed to write data to VHD
I have an 8-blade cluster running Windows 2008 R2 SP1. We are running about 350 virtual Windows 7 workstations that users RDP into via Wyse Win7e thin clients. Each server runs dual 12-core Intel processors with about 96 GB of memory. Each has dual active/active 8 Gbps Fibre Channel paths to separate Fibre Channel switches, each of which has four 8 Gbps paths to an EMC CX4-120 SAN. I have 4 RAID 5 disk groups on the EMC set up to give the cluster its clustered storage disks: Volume1 through Volume4. The VHDs and XML files live on the same CSV, and for the most part the virtual machine files are spread across these 4 CSVs pretty evenly; utilization is pretty low on the drives. The SAN also has 100 GB of FAST Cache SSD to further improve performance.
The issue is that occasionally I get an error on a specific blade where a virtual machine will blue screen and reboot. The event log "Microsoft-Windows-VHDMP/Operational" will log 4 events:
Error; Event ID 6 "Failed to write data to VHD c:\clusterstorage\VolumeX\Win7WorkstationXXX.VHD. Error status 0xC000009A"
Error; Event ID 4 "Failed to surface VHD c:\clusterstorage\VolumeX\Win7WorkstationXXX.VHD. Surface attempt was cancelled"
Error; Event ID 6 "Failed to write data to VHD c:\clusterstorage\VolumeX\Win7WorkstationXXX.VHD. Error status 0xC000009A"
Informational; Event ID 2 "The VHD c:\clusterstorage\VolumeX\Win7WorkstationXXX.VHD has been removed (unsurfaced) as disk number 0."
If I migrate the affected virtual machine to another blade, the error stops and the machine runs fine until the next random virtual machine starts doing the same thing. There does not seem to be any rhyme or reason to when the next VM will have issues; it could be an hour, it could be days. This can happen on any virtual machine running on any of the 8 blades.
Any ideas what causes this?
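For what it's worth, error status 0xC000009A is STATUS_INSUFFICIENT_RESOURCES. A sketch of pulling these events from a node to correlate occurrences across blades (the log name is as it appears in Event Viewer):

# Collect the VHDMP write/surface failures (IDs 2, 4, 6) from the operational log
Get-WinEvent -FilterHashtable @{ LogName = 'Microsoft-Windows-VHDMP/Operational'; Id = 2, 4, 6 } |
    Select-Object TimeCreated, Id, Message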
Windows Server 2008 R2 - Cluster Name IP address issue / Cluster down
We have two Windows Server 2008 R2 nodes in a cluster with a Node and Disk Majority setup.
We run all of our production Hyper-V cluster shared volumes / VMs on this cluster.
One of our nodes can no longer communicate with the CSV.
Looking in the Cluster MMC, the following error is displayed:
"The Cluster Network name is not online"
Expanding "Cluster Core Resources", the cluster name shows a "FAILED" status.
On examination I can see it has no IP address; however, "Add" and "Edit" are greyed out on the General tab, so I can't add a valid IP. I tried setting it from the command line instead:

cluster res "Cluster IP Address" /priv address=192.168.3.252

Even running from an elevated administrator command prompt, I receive:

System error 5007 has occurred (0x0000138f).
The cluster resource could not be found.

This has all occurred since we migrated our subnet; all NICs are up and pinging on both nodes. The cluster services and even the nodes have been restarted several times today.
Here's the output of cluster res:
Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

Listing status for all available resources:

Resource              Group                                  Node            Status
--------------------  -------------------------------------  --------------  ------
Cluster Disk 1        Cluster Group                          HyperVLon01     Online
Cluster Disk 2        7f39802a-5786-40ec-bd1c-af2a62b6db8a   HyperVLon02     Online
Cluster Disk 3        4d0b5ec9-4445-4899-bece-714231080057   HyperVLon02     Online
Cluster Name          Cluster Group                          HyperVLon01     Failed

When I run cluster res "Cluster IP Address" /priv, I get the following error:

System error 5007 has occurred (0x0000138f).
The cluster resource could not be found.
The DNS entry is valid and the Active Directory records are intact; this issue is purely about assigning an IP.
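For reference, since the error says the resource could not be found by that name, a sketch of finding the real IP resource name and setting the address (the resource-name placeholder and subnet mask are assumptions):

Import-Module FailoverClusters
# List the IP Address resources to get the exact name the cluster uses
Get-ClusterResource | Where-Object { "$($_.ResourceType)" -eq "IP Address" }
# Set the address and mask on that resource (values from the new subnet)
Get-ClusterResource "<actual IP resource name>" |
    Set-ClusterParameter -Multiple @{ Address = "192.168.3.252"; SubnetMask = "255.255.255.0" }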
Any help gratefully appreciated.
Thanks
John
New design for SQL Server ERP system
Hello, everybody:
I'm looking for feedback on a proposed HA solution for an ERP system that runs on SQL Server 2012.
Server1 - Windows 2012 R2 Hyper-V with SQL Server 2014 with SSD RAID 5, clustered as a failover pair with Server2
Server2 - Windows 2012 R2 Hyper-V with SQL Server 2014 with SSD RAID 5, clustered as a failover pair with Server1
Server3 - Windows 2012 R2 Hyper-V running the ERP application, Hyper-V Replica with Server4
Server4 - Windows 2012 R2 Hyper-V running the ERP application
Questions: Does it make sense to have four Hyper-V host machines in this configuration, and won't this require a total of 6 Windows 2012 R2 licenses? I prefer using Hyper-V since you can move the machines between hosts for maintenance.
Also, with SQL Server it was recommended not to use a replica, since it would defeat the built-in SQL Server clustering. Can you add a RAID 5 on-board controller to a shared-disk cluster? My other options would be iSCSI, SAN, or NAS.
Thanks!
Disabling CAU Schedule
Hello Everyone,
After disabling the schedule for Cluster-Aware Updating, I receive some warnings in the Cluster Validation Report because the CAU role and its resources are offline in the cluster.
Question: Is it supposed to be like that? Can I ignore these warnings?
Background is: We want to use CAU, but start it manually for each cluster.
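For reference, this is how I intend to kick off a run by hand (a sketch, assuming the standard Windows Update plug-in; the cluster name is a placeholder):

# Trigger one CAU updating run on demand instead of on a schedule
Invoke-CauRun -ClusterName "CLUSTER01" -CauPluginName Microsoft.WindowsUpdatePlugin -MaxFailedNodes 0 -RequireAllNodesOnline -Force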
Thanks,
Jens
jensit.wordpress.com
Cluster-aware updating - Self updating not working
Hi,
I have a Windows Server 2012 failover cluster with 2 nodes, and I am having problems getting self-updating to work properly.
The Analyze CAU Readiness check does not report any issues, and I have been able to run a remote update with no problems. I don't get any errors or failure messages in the CAU client, only this message: "WARNING: The Updating Run has been triggered, but it has not yet started and might take a long time or might fail. You can use Get-CauRun to monitor an Updating Run in progress."
In the Event Viewer I see 2 errors and 1 warning for each run: Events 1015, 1007, and 1022.
1015: Failed to acquire lock on node "node2". This could be due to a different instance of orchestrator that owns the lock on this node.
1007: Error Message: There was a failure in a Common Information Model (CIM) operation, that is, an operation performed by software that Cluster-Aware Updating depends on.
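In the meantime, this is what I run to watch a run and check the self-updating role (a sketch; the cluster name is a placeholder):

# Monitor an updating run in progress, as the warning message suggests
Get-CauRun -ClusterName "CLUSTER01"
# Check the state of the CAU clustered role used for self-updating
Get-CauClusterRole -ClusterName "CLUSTER01"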
Does anyone have any idea what is causing this to fail?
Thanks!
Network Validation freezes
Hi,
I have Exchange Server 2010 SP1 running in my environment, with 3 Mailbox servers in a single DAG. A few days back I found the servers failing over without any apparent issues on them. Viewing the logs, I found Event IDs 1177 and 1135, which suggested validating the clustering of the servers, and while checking the server cluster I found the validation freezing at the network validation step.
I even tried to check network stability by transferring a single 6 GB file from one Mailbox server to another, and it failed. Below is the snap of my cluster validation. Requesting your help on this.
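For reference, this is how I re-ran just the network portion of validation to isolate the hang (a sketch; the node names are placeholders):

# Run only the network tests of cluster validation against the DAG members
Test-Cluster -Node MBX01, MBX02, MBX03 -Include "Network"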
CAU and WSUS
Networking hyper-v cluster
Hi,
I am coming from a VMware environment to Hyper-V, building a two-node cluster with shared storage, and I have some questions regarding NIC configuration. My hosts have 8 NICs each. I was planning on configuring them as follows...
NICs 1 & 2, Team - Management network (production LAN), e.g. 192.168.100.0/22
NICs 3 & 4, Team - VM traffic network (production LAN), e.g. 192.168.100.0/22
NICs 5 & 6, Team - iSCSI traffic to shared storage, e.g. 10.10.10.0/29
NICs 7 & 8, Team - Live migration (crossover cable)
Q1. In lots of places I read recommendations about separating the management traffic, but the hosts have to be part of the domain, so how can I achieve separation of management traffic? Should I just team NICs 1-4? NICs 1 and 3 will go into one switch and 2 and 4 into a different switch.
Q2. When creating the cluster and assigning a VIP, how should I set that up? Which of the above networks does it go in?
I have read lots of articles, and everyone seems to be doing it differently and using different terms to describe their network segregation.
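For context, this is how I was planning to script the teams on each host (a sketch, assuming Server 2012 LBFO teaming and that Get-NetAdapter lists the NICs as NIC1 through NIC8):

# Management team (NICs 1 & 2)
New-NetLbfoTeam -Name "MgmtTeam" -TeamMembers NIC1, NIC2 -TeamingMode SwitchIndependent
# VM traffic team (NICs 3 & 4), bound to an external virtual switch with no host vNIC
New-NetLbfoTeam -Name "VMTeam" -TeamMembers NIC3, NIC4 -TeamingMode SwitchIndependent
New-VMSwitch -Name "VMSwitch" -NetAdapterName "VMTeam" -AllowManagementOS $false

I left NICs 5-8 out of the sketch; for iSCSI my understanding is that MPIO, rather than teaming, is the usual approach.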
Thanks in advance
2-node SQL Failover Cluster (Crossover cables)
Hello,
I have 2 HP servers and an HP MSA. The goal is to cluster these 2 machines together and have them share the MSA storage for use with SQL Server. The HP MSA has dual controllers, so it is connected to each server twice using crossover cables. In the current configuration, both servers are connected to a management switch, as is the management port on the MSA.
Each server is then connected to each iSCSI controller on the MSA via a crossover cable.
Server1:
management: 172.16.100.1
MSA A1: 10.0.0.1
MSA B1: 10.0.0.3
Server2:
management: 172.16.100.2
MSA A2: 10.0.0.2
MSA B2: 10.0.0.4
MSA:
management: 172.16.100.3 / 172.16.100.4
A1: 10.0.0.11   A2: 10.0.0.12
B1: 10.0.0.13   B2: 10.0.0.14
I then created an iSCSI connection from each server to the MSA using both MSA addresses, so Server1 can discover the MSA using 10.0.0.11 or 10.0.0.13. After I made these connections and configured the disks, I enabled Windows failover clustering. After putting both servers into the failover cluster, I got past almost ALL validation checks, except for it complaining that the A1 NIC cannot ping the B1 NIC, an obvious issue due to them being isolated crossover networks. I confirmed that my cluster was functioning and even tested disk failover multiple times. When Server1 goes down, Server2 then owns all the data on the MSA. GREAT.
Now that I have explained the architecture: our goal is to install SQL Server 2014 on these machines and use the MSA storage for the database so it can have a cold/hot backup. While this configuration could lose some data during failover, it would be minimal.
The problem is that when installing SQL Server in a failover configuration, it re-runs the cluster validation and fails it because these networks cannot communicate. I have tried teaming the NICs together without any great outcome. I am really stuck on how to get past this validation section. In most online instructions, everyone is using WSFC for virtual machines, when we really just want to use it to share storage.
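For reference, a sketch of what I understand should mark the crossover links as cluster-excluded networks (the network names are whatever Get-ClusterNetwork reports):

Import-Module FailoverClusters
Get-ClusterNetwork    # list the networks as the cluster sees them
# Role 0 = None: exclude the iSCSI crossover networks from cluster communication
(Get-ClusterNetwork "MSA Path A").Role = 0
(Get-ClusterNetwork "MSA Path B").Role = 0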
Thanks for any and all help; I truly appreciate it.
Failover Cluster WMI Provider detected an invalid character
Hi,
Another thread on the event log warning 6230: Failover Cluster WMI Provider detected an invalid character.
"The private property name 'Volume ID' had an invalid character and has been changed to 'Volume_ID'. Valid characters for WMI property names are A-Z, a-z, 0-9, and '_'."
I get the warnings in blocks of 10, one for each private property, every 15 minutes:
Staging_Path
Replication Group Name
Replication Group ID
Replicated Folder Root Path
Replicated Folder Name
Replicated Folder ID
Replicated Folder Flags
Member ID
Conflict Path
Volume ID
This is on a new Server 2012 R2 cluster for a file server running on VMWare ESXi 5.5 U3b.
DFS-R is also set up for this cluster.
I have Server 2008 R2 clusters with DFS-R that do not spray this into the event log.
Most threads on this topic are a few years old and specifically show resolutions for 2008/2008 R2 pre-SP1 cluster setups, mostly KB974930.
I have created another cluster being careful with naming of objects and it did exactly the same.
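For reference, a sketch of enumerating which resources carry private properties with spaces in their names (what event 6230 complains about):

Import-Module FailoverClusters
Get-ClusterResource | ForEach-Object {
    $res = $_
    $res | Get-ClusterParameter |
        Where-Object { $_.Name -match " " } |
        ForEach-Object { "{0} : {1}" -f $res.Name, $_.Name }
}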
Any advice on how to track this one down?
Windows 2012 R2 CSV reports entering pause, c0130021
Hi,
I use a Windows 2012 R2 cluster with 2 nodes, fully patched, with IBM Storwize V3700 storage connected over FC, and for backup I use Veeam Backup & Replication. I get this error a few times a day. I must say that the VMs do not enter a paused state, but the CSV changes host when I get the error. I've checked and updated the BIOS, drivers, and firmware with IBM, and the servers and storage are fully patched.
How do I fix this problem?
THX
Cluster Shared Volume 'Volume2' ('CSV2') has entered a paused state because of '(c0130021)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Software snapshot creation on Cluster Shared Volume(s) ('\\?\Volume{74937dcb-1bd3-4af1-865e-94b24a509a86}\') with snapshot set id 'b428bb9b-a027-41cd-802a-552cc047ddc8' failed with error 'HrError(0x80042306)(2147754758)'. Please check the state of the CSV resources and the system events of the resource owner nodes.
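For reference, a sketch of what can be checked when the event fires (the volume name is taken from the event text):

# Show whether the CSV is direct, file-system redirected, or blocked on each node
Get-ClusterSharedVolumeState -Name "Volume2"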
Driver developer question. How to save metadata inside the CSV volume before system restart?
I have a CSV-volume problem on which I am stuck. I am attaching my minifilter below CSVFS, and during the InstanceTeardownStart callback my attempt to save (write) a metadata file fails with STATUS_FILE_INVALID, so the files are already invalidated at that point. I also can't use the pre-IRP_MJ_SHUTDOWN callback, which comes later, and the same applies to query teardown. So how can I save metadata inside the CSV volume during restart? I see no possibilities.
I know that only the MS CSV developers can help with this, and I don't know a better place to ask.
Monitoring server (OpManager) shows clear/online status for one of the MS SQL Server 2012 instances on Windows 2012 R2 virtual machines
Environment:
- OpManager monitoring server, monitoring across multiple WAN connections, installed on subnet 10.250.1.xx
- 3 x MS SQL Server 2012 Enterprise Edition instances installed on two MS Windows 2012 R2 virtual machines in a clustered environment
- 1 x MS SQL Server is on the 10.15.16.xx subnet
- 2 x MS SQL Servers are on the 10.15.18.xx subnet
No issue:
- No issue from the monitoring application to the SQL Server on the 10.15.16.xx subnet; the status shows "Online"
- No issue from the monitoring application to any other server on the 10.15.18.xx subnet
Issue:
- One of the SQL servers on the 10.15.18.xx subnet is affected
- The monitoring application shows "Online" status for one of the SQL servers. If I restart, the SQL server with "Critical" status updates to "Online" after the restart, and the other SQL server with "Online" status changes to "Critical"
- Basically, one of the SQL Servers in the clustered environment on subnet 10.15.18.xx is always showing "Critical" status in the monitoring application
- I can ping from the server with "Online" status in both directions
- I cannot ping from the server with "Critical" status in both directions
- I can trace from the server with "Online" status in both directions
- I cannot trace from the server with "Critical" status in both directions
- No errors in the event logs
I have done my troubleshooting and also posted on the OpManager forums, with no luck on a resolution.
I believe it is more of a cluster issue when the service restarts.
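For reference, a cmdlet form of the ping/trace comparison I've been doing from the monitoring subnet (a sketch; server names are placeholders, and Test-NetConnection assumes a 2012 R2 or later box to run it from):

# Compare basic reachability and the SQL listener port on both cluster nodes
Test-NetConnection -ComputerName SQLNODE1 -Port 1433
Test-NetConnection -ComputerName SQLNODE2 -Port 1433
# On the cluster, confirm every network interface is Up
Get-ClusterNetworkInterface | Format-Table Node, Network, State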
Any idea on resolution please?
Muhammad Mehdi
Cluster node starts and stops constantly
One node of a two-node cluster constantly stops and starts with the following errors. The nodes are virtualized in VMware; both are on the same host and on the same subnet. The file share witness is accessible from both nodes, and permissions are full control. I can watch the NICs fail in the cluster manager one at a time until the node goes down. This failure only occurs with node 1. This node has been rebuilt once by cloning the base 2012 R2 server. Any help is greatly appreciated.
File share witness resource 'File Share Witness' failed to arbitrate for the file share . Please ensure that file share exists and is accessible by the cluster.
Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed.
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
The cluster Resource Hosting Subsystem (RHS) process was terminated and will be restarted. This is typically associated with cluster health detection and recovery of a resource. Refer to the System event log to determine which resource and resource DLL is causing the issue.
The Cluster Service service terminated with the following service-specific error:
A quorum of cluster nodes was not present to form a cluster.
The Cluster Service service terminated unexpectedly. It has done this 16 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.
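For reference, a sketch of what I'm collecting next to dig into the arbitration failure (the share path is a placeholder):

# Verify the witness share is reachable from the failing node
Test-Path \\FILESERVER\Witness
# Dump the cluster debug log from all nodes for the last 30 minutes
Get-ClusterLog -Destination C:\Temp -TimeSpan 30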
2012 R2 Cluster - Active Node ejects all other nodes - random times
ISSUE
We have a 4-node 2012 R2 cluster: active, passive, file share witness, and a passive DR server.
Our issue is that our active node appears to be losing all cluster communication and ejecting all other nodes, and we cannot find any System event log items indicating loss of local area connection or issues with the network dropping. We have a third-party monitoring tool that has never lost a ping to this system during these events or shown it as down.
Our current Band-Aid fix is to set the Cluster service to restart automatically after failure. This gets the cluster back online after 60 seconds, but we are still down for those 60 seconds. We have not enabled automatic failover because not all applications have been tested on node 2 of production yet.
Here are the variables for our environment.
The cluster is physical, on Dell hardware. The network team shows no errors within OpenManage Server Administrator and no indication of flapping on the switch.
Systems:
Active - SQL-CL02 - 1 Vote (Active Cluster Owner)
Passive- SQL-CL03 - 1 Vote
File share - WIN2012-FS01 - 1 Vote
PassiveDR- SQL-CL01 - 0 Vote
Cluster Networking Info:
Production - Network in use for cluster communications.
10.100.1.7/26
Backup Network - Disabled for cluster communications.
DR - Network in use for cluster communications.
10.200.1.7/26
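For reference, how we confirm the vote assignment and which networks the cluster will actually use (a sketch using the standard cmdlets):

# PassiveDR (SQL-CL01) should show NodeWeight 0
Get-ClusterNode | Format-Table Name, State, NodeWeight
# Role 3 = cluster and client, 1 = cluster only, 0 = excluded from cluster use
Get-ClusterNetwork | Format-Table Name, Role, Address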
Failure events, in time order, from the cluster event logs:
1135 - Cluster node 'SQL-CL03' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
*** (No network drops identified; we have a 3rd-party monitoring tool that showed active pings throughout this event.)
1135 - Cluster node 'SQL-CL01' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
1564 - File share witness resource 'File Share Witness' failed to arbitrate for the file share '\\WIN2012-FS01\Witness'. Please ensure that file share '\\WIN2012-FS01\Witness' exists and is accessible by the cluster.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
1069 - Cluster resource 'File Share Witness' of type 'File Share Witness' in clustered role 'Cluster Group' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
1177 - The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
1561 - The cluster service has determined that this node does not have the latest copy of cluster configuration data. Therefore, the cluster service has prevented itself from starting on this node.
Try starting the cluster service on all nodes in the cluster. If the cluster service can be started on other nodes with the latest copy of the cluster configuration data, this node will be able to subsequently join the started cluster successfully.
If there are no nodes available with the latest copy of the cluster configuration data, please consult the documentation for 'Force Cluster Start' in the failover cluster manager snapin, or the 'forcequorum' startup option. Note that this action of forcing quorum should be considered a last resort, since some cluster configuration changes may well be lost.
1069 - Cluster resource 'WIN2012-SQLAG-01_10.100.1.7' of type 'IP Address' in clustered role 'WIN2012-SQLAG-01' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
Thanks for your consideration of this issue. Where else might we search for more information?
-D
Cluster fail over issue
Hi
I have a cluster with three nodes
2 Nodes are in one datacenter ( 1.0.1.1 , 1.0.1.2)
1 Node is in secondary data center (2.0.1.1)
The cluster name is CBCLUS01 and there are a few roles in there.
Whenever a server (e.g., 1.0.1.1) in the primary datacenter restarts, the cluster role (CBCLUS01) moves to the node (2.0.1.1) in the secondary datacenter, whereas it should remain in the same datacenter and move to the second server in the primary datacenter.
Any thoughts on why it's behaving like this? Quorum is configured as Node and File Share Majority, and the node in the secondary datacenter does not have a vote.
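For reference, a sketch of setting preferred owners, which as I understand it controls the order in which nodes are tried during failover (the node names are placeholders):

# Prefer the two primary-datacenter nodes for the core cluster group
Set-ClusterOwnerNode -Group "Cluster Group" -Owners "NODE1", "NODE2"
Get-ClusterOwnerNode -Group "Cluster Group"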
Validate configuration failed... used to work
I'm starting to test failover clustering with Hyper-V and Server 2008 R2. Several weeks ago, I successfully validated 2 nodes with 1 error: the username being used didn't have the privilege to create computer objects in the domain. I then requested a new username with the appropriate permissions as listed in the docs. Now I've tried the validate configuration again, but it won't even let me add a node. I'm getting this error:
Unable to determine if you have administrator privileges on server 'servername'. Please ensure that the Server service and Remote Registry service are enabled, and that the firewall is properly configured for remote access.
I disabled the firewall and all the services are enabled, but I'm still getting this error. The only thing that has changed since the last time I validated these 2 nodes is Windows updates.
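For reference, a quick remote check that the services named in the error are actually running (a sketch; 'servername' is the placeholder from the error text):

# Server service (LanmanServer) and Remote Registry, queried remotely
Get-Service -ComputerName servername -Name LanmanServer, RemoteRegistry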
Does anyone know how I should fix this? I'm trying to uninstall all the updates installed over the last several weeks, but this is going to take a while because I need to restart every time I uninstall an update.
Maximum LUNs in a Hyper-V 2012 Cluster
Hi,
In Hyper-V 2008 R2, I was told that "Windows does not have a great track record managing many LUNs. From experience we see scale issues anywhere at 150 LUNs and above."
I was also told "MSFT is currently fully engaged in the next revision of the platform that all this is built on....."
I am wondering if 2012 can support more LUNs.
Does anyone have an idea of how many LUNs will work well on a Hyper-V 2012 cluster?
Please don't answer this question unless you have solid knowledge of this matter.
(Sorry, I asked this recently but wasn't given a direct answer, so I am asking again more directly.)
Thanks
Daniel