CSV errors STATUS_IO_TIMEOUT Windows 2012 Hyper-V Failover cluster

November 6, 2012, 6:14 pm

Hi there,

I'm occasionally seeing this error on a Windows Server 2012 Hyper-V failover cluster:

Cluster Shared Volume 'Volume1' ('Cluster Disk 1') is no longer available on this node because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.

There are three nodes, each with six NICs. The first two NICs are teamed and connected to a VM virtual switch. The second two are teamed (one active) and used for cluster communication. The third two are teamed (one active) and connected to a virtual switch that carries the management OS and another cluster NIC.

Redirected access seems to work fine (tested by removing the CSVs from one node), but it's odd that we keep seeing these errors in the cluster logs.

We also see STATUS_CONNECTION_DISCONNECTED sometimes.

Does anyone know what this could be?
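For reference, the hexadecimal value in the message above (c00000b5) is an NTSTATUS code, and its fields can be unpacked with a few bit shifts. The following is a minimal Python sketch, assuming the standard NTSTATUS layout (2 severity bits, a customer bit, a reserved bit, 12 facility bits, 16 code bits); the function name decode_ntstatus is made up for illustration.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its standard bit fields."""
    severities = {0: "success", 1: "informational", 2: "warning", 3: "error"}
    return {
        "severity": severities[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),        # bit 29
        "facility": (value >> 16) & 0xFFF,            # bits 27-16
        "code": value & 0xFFFF,                       # bits 15-0
    }


# Value taken from the cluster event text: STATUS_IO_TIMEOUT(c00000b5)
status = decode_ntstatus(int("c00000b5", 16))
print(status)
```

The decoded fields only confirm that this is an error-severity status with a system-defined (non-customer) code; they say nothing about which layer (storage path, network, or SMB redirection) timed out.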

↧

Adding Clustered Storage, Why do I need to create a windows volume first?

November 11, 2012, 9:54 am

I have two drives located on an iSCSI target. I am adding one as a quorum drive and the other as clustered storage.

I have created the iSCSI target and initiator and linked the two. I can see the disks in Disk Management and have brought them online.
In this state, I can add them as cluster storage, but they won't work as a quorum drive. To make one work as a quorum drive, I need to create a volume on the disk in Disk Management.

My questions are:

1. Why can I add the disks in Failover Cluster Manager without a volume created?

2. Why do I need to add a volume before I can use the drive as a quorum drive?

3. Do I need to add a volume via Disk Management for any storage I'm using in the cluster, such as for SQL Server files?

Thanks!

↧

Mpclaim.exe is not running in windows server 2008 SP2 x_64

November 5, 2012, 1:07 am

I added the Multipath I/O feature on Windows Server 2008 SP2 x64 using the "Add Features" wizard in Server Manager.

After that, I added my storage and devices under MPIO Devices.

Then I ran the "mpclaim.exe -s -d" command, but it failed to execute. It always gives the following message:

"Install MPIO OC then ensures msdsm claims pass in devices"

I also applied KB 2522766, as suggested on the Microsoft site, but it still shows the same message.

Then I also tried the following commands:

    ocsetup MultipathIo /norestart
    mpclaim -r -i -a ""

but every mpclaim command option fails and gives the same message.

I tried the same settings on Windows Server 2008 R2 x64, and they work properly there.

Is there an issue with the OS, or something else?

Please help me.

Any help is highly appreciated.

Thanks & Regards,

Ankit

↧

Unable to get Computer Object using GUID.

October 29, 2012, 6:34 am

I have a NAS cluster set up with HP servers and LeftHand storage on the back end. The system was set up on one domain. We have since migrated to a new domain and need to migrate the cluster to the new domain as well. However, when trying to migrate the cluster, we get the following error:

Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'mcg.local' for the following reason:

Unable to get Computer Object using GUID.

The text for the associated error code is: There is no such object on the server.

The cluster identity 'NASCLUSTER$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.

I have checked DNS and permissions on both domains. I am trying to figure out what needs to be done to resolve the issue.

Paul Boespflug

↧

upgrade path from 2008 hyper V cluster to 2012

November 19, 2012, 10:06 am

Dear friends,

I am not sure if I am on the right forum, and my apologies if this has been discussed before.

I have a Windows Server 2008 R2 cluster on two servers hosting production Hyper-V VMs. Looking at the features provided by Windows Server 2012 clustering and Hyper-V, I would like to upgrade, but I am not sure of the right way to do it without disturbing my current clients, because I can't afford any downtime. Please advise whether this is possible and how. Thanks for the help and time. :)

Thanks
Happiness Always
Jatin

↧

Continuous Availability in Windows Server 2012 Storage Spaces write performance drop

November 12, 2012, 4:44 pm

Hi there!

We built a test environment for Storage Spaces and Hyper-V with SMB 3.0. We have 8 SAS disks in a JBOD enclosure, connected to two Supermicro servers. We get the expected performance out of the disks until we activate continuous availability for the file server or Scale-Out File Server. Before we activate it, we get write performance (simply copying large files) of about 200-400 MB/sec. Once continuous availability is activated, it drops to 38 MB/sec.

Is this "normal behavior"? Did we configure something wrong? Again, this is reproducible: switching the option on or off decreases or increases the speed accordingly. I think the write cache is switched off under continuous availability, but is the impact really that high?
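To put the figures above in proportion, here is a small Python sketch that turns them into a slowdown factor. The only inputs are the numbers quoted in the post (200-400 MB/sec before, 38 MB/sec after); the helper name slowdown_factor is made up for illustration.

```python
def slowdown_factor(baseline_mb_s: float, degraded_mb_s: float) -> float:
    """How many times slower the degraded throughput is than the baseline."""
    return baseline_mb_s / degraded_mb_s


CA_ENABLED_MB_S = 38  # write throughput with continuous availability on

for baseline in (200, 400):  # range reported with continuous availability off
    factor = slowdown_factor(baseline, CA_ENABLED_MB_S)
    print(f"{baseline} MB/sec -> {CA_ENABLED_MB_S} MB/sec: {factor:.1f}x slower")
```

In other words, the reported drop is somewhere between roughly a fivefold and a tenfold reduction in copy throughput.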

Thanks for any help!

Florian

↧

Redo 4 Node Cluster - MirrorView replicated Disks

November 19, 2012, 2:07 am

I currently have two clusters, A and B, on two sites. Each cluster runs two instances (Inst1 and Inst2) of SQL Server 2008 R2 Standard on Windows Server 2008 R2 Enterprise. After changes to the SAN, I now have synchronously replicated disks between sites/rooms. All nodes are in the same subnet.

My question is: what is the best way to create a single Node and/or Disk Majority cluster so that one SQL instance (Inst1, which needs HA but no DR) is not failed over to the second site, while the other (Inst2, which needs HA and DR) can be recovered from the DR site? HA within each site is a must for both instances, but I can live with no automatic start-up after failover to the second site.

I don't plan to use expensive Cluster Enabler software. I have to do this carefully, without too much downtime. I am thinking along the lines of migrating all instances (applications) to Cluster A and then decommissioning Cluster B. Is that possible? Any suggestions are much appreciated.

↧

The Cluster service is shutting down because quorum was lost

November 15, 2012, 11:20 am

Hi,

We have an Active-Passive-Active SCC Exchange 2007 cluster running. Every day around 11 AM we get the event error below, after which the cluster fails over to the passive node.

"The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges."
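As background to the message above, quorum in a majority-based model comes down to counting votes: the cluster keeps running only while more than half of all configured votes are reachable. The Python sketch below illustrates that arithmetic only. The vote counts are hypothetical (the post does not state the quorum configuration), and this is a simplified model, not the Cluster service's actual logic.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def has_quorum(votes_online: int, total_votes: int) -> bool:
    """True while the reachable votes still form a strict majority."""
    return votes_online >= votes_needed(total_votes)


# Hypothetical example: 2 nodes + 1 witness disk = 3 votes in total.
TOTAL_VOTES = 3
print(votes_needed(TOTAL_VOTES))   # 2
print(has_quorum(2, TOTAL_VOTES))  # True: one vote lost, majority kept
print(has_quorum(1, TOTAL_VOTES))  # False: node isolated and witness lost
```

Under this model, a node that loses contact with its peers and the witness at the same moment drops below the majority and stops, which is consistent with the two causes the event text lists (network connectivity loss or a witness disk failover).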

We checked all network components, and they look fine.

Please help if there is an issue at the cluster level.

Thanks

↧

Failover cluster repeating warning 4694 on physical servers?

January 24, 2011, 1:01 pm

Can anyone tell me why I'm getting this: http://support.microsoft.com/kb/2014304 on a pair of machines that are physical boxes?

The KB article seems to indicate that repeated occurrences of the 4694 error are due to not having Hyper-V tools on a virtualized server. However, in my case the servers in my failover cluster are physical boxes and I'm still seeing this issue. Can anyone advise on why this might be happening or how to fix it? I'm hoping there is a solution other than installing the Hyper-V tools (because I'd rather not have someone using those tools on these boxes).

↧

Mirror Storage via Storage Pool?

November 19, 2012, 11:07 pm

Is it possible to mirror storage using storage pools?

Idea:

Take LUN 1 from storage A...

Take LUN 2 from storage B...

Bring them into a storage pool, create a virtual disk on it, and use that to hold a CSV.

Is this possible?

I've read in the blog post "How to Configure a Clustered Storage Space in Windows Server 2012" that the disks have to be SAS-connected, but I don't know if I understood correctly that Fibre Channel storage (as in the example) is not supported for use with Storage Spaces?

↧

Windows Clustering -ownership issue

November 14, 2012, 11:19 am

We recently set up two Dell PowerVault NX3000 units running Windows Storage Server 2008 R2 Enterprise SP1 as a cluster. Let's call them NAS1 and NAS2. We have three nodes: two of them have NAS1 set as the preferred owner and one has NAS2. For some reason, the current owner of that one node keeps changing and flipping over to the other node, NAS1. We have reviewed our network engineering, and the network setup and configuration are fine. Based on our testing so far, the only way to keep the current owner of that one node consistent is to connect both NAS1 and NAS2 to one core switch (either the A or the B side), but this is only a temporary solution. If that core switch goes down, we need to physically move the cables to the other working switch.

If anyone has experienced this problem and knows the fix, could you kindly shed some light on it for me? Also, in Windows clustering, is there such a thing as tuning the sensitivity/timing of some sort?

Much appreciated.

bwong

↧

Cluster heartbeat network and cluster IP to AD and DNS communication

November 4, 2012, 7:58 pm

Hi Experts,

I have the following two questions, which hopefully are straightforward. Here goes:

1. Is there any problem with using a common private layer 2 VLAN / network for each of the heartbeat and live migration networks across multiple separate clusters (as shown below)? AFAIK each individual cluster should have its own separate VLAN / network ID for heartbeat traffic, though I'm not aware of what exactly could go wrong.

Cluster 1

Node1 - VLAN100 - 10.192.168.1.x - Heartbeat network
      - VLAN101 - 10.192.168.2.x - LM network
Node2 - VLAN100 - 10.192.168.1.x - Heartbeat network
      - VLAN101 - 10.192.168.2.x - LM network

Cluster 2

Node1 - VLAN100 - 10.192.168.1.x - Heartbeat network
      - VLAN101 - 10.192.168.2.x - LM network
Node2 - VLAN100 - 10.192.168.1.x - Heartbeat network
      - VLAN101 - 10.192.168.2.x - LM network

2. When the DCs and DNS servers are separated from the cluster nodes by a firewall, which ports and services would the cluster IP / name need access to? I know the cluster name and IP need to be registered in DNS and the cluster object (pre-created in this case) created in AD, etc., so I'm assuming that whatever ports we allow for the node names and IPs have to be allowed for the cluster name and IP as well.

Thank you for your help in advance :)

↧

Cluster Validation fails despite installed 2531907 Hotfix

November 15, 2012, 9:13 am

I have a 2008 R2 SP1 cluster with three nodes and am trying to add a fourth node. I have installed hotfix 2531907 on all four nodes.

When running cluster validation, I'm getting the error "Failed to get SCSI page 83h VPD descriptors for cluster disk.." on node one. But node one is in production with a lot of running VMs and has no visible problems. There aren't any errors in Failover Cluster Manager.

Does anyone have advice on how to troubleshoot this problem further? I would prefer to add the fourth node without selecting the option "will never need Microsoft support for this cluster..."

Thank you all in advance for any advice

Franz

↧

100 % failover with WSS 2012

November 19, 2012, 10:19 pm

Hello Technet

Is there a way to use Windows Storage Server 2012 to create redundant storage space for iSCSI targets? I've managed to create an iSCSI cluster, but it still has a single point of failure because of the disks that are shared.

I've been looking at the storage pool option, but how would you set up that cluster with a 100% 'safe' quorum disk, which also has to be shared?

best regards,

jesper vindum, denmark

↧

Use of IP addresses

November 8, 2012, 12:02 am

Hey everybody,

I'll first sketch our setup so this makes a bit of sense:

We have a 2-node virtual cluster (Windows 2008 R2) that is used to host some services.
I'm just going to pick some random IPs:

clusterhost1 has the IP 10.255.59.2
clusterhost2 has the IP 10.255.59.3

The cluster has the IP 10.255.59.1 assigned to it.
So far so good.

Now we have grouped the services into one group (using the option "create empty service or application") and created an access point for it with the IP 10.255.59.4.

Now I have received a question from one of the programmers about which IP to use for one of his processes.
He needs to know which IP is used when receiving or sending.

I have put a small test process on the cluster that just accepts a telnet session on a certain port.
I've tried both the IP of the access point and the IP of the cluster itself, and both seem to work. I think it's safe to presume that the IP of the access point is the better one to send to.

Now, let's say that my service sends some data back to an application: which IP will it use? I would think the IP of the access point, but I'm not sure.
Could somebody help me with the answer please?
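One way to take the guesswork out of the sending side is for the process to bind its socket to a chosen local address before connecting, instead of leaving source address selection to the OS. The Python sketch below shows that general technique. It does not describe how the cluster itself picks a source address. ACCESS_POINT_IP is the address from this post, while connect_from and loopback_demo are hypothetical helper names, and the demo runs over loopback so it works on any machine.

```python
import socket

ACCESS_POINT_IP = "10.255.59.4"  # from the post; only valid on that cluster


def connect_from(source_ip: str, dest_host: str, dest_port: int,
                 timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection whose local (source) address is source_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.bind((source_ip, 0))  # port 0 lets the OS pick an ephemeral port
        sock.connect((dest_host, dest_port))
    except OSError:
        sock.close()
        raise
    return sock


def loopback_demo() -> str:
    """Return the source IP that a local listener sees for a bound client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    client = connect_from("127.0.0.1", "127.0.0.1", port)
    server_side, peer = listener.accept()
    for s in (client, server_side, listener):
        s.close()
    return peer[0]


print(loopback_demo())
```

On the cluster, the service would call something like connect_from(ACCESS_POINT_IP, ...) while the access point is online on that node. The bind fails with an OSError on a node that does not currently own that address, which is itself a useful signal for the programmer.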

Thanks in advance

↧

Impossible to use Failover Cluster management tools on Server 2012 to manage 2008 R2 Cluster?

November 19, 2012, 12:54 am

I just wanted to make sure that once again the Failover Clustering tools aren't backwards compatible.

Example: we have Server 2012 management servers that would run the Failover Cluster Manager console against a 2008 R2 Hyper-V cluster.

↧

feedback on cluster errors & related windows events - sql failover cluster

November 18, 2012, 3:32 pm

Background:

HP ProLiant DL380 G5 servers connected to an HP EVA 4400 SAN.
Windows Server 2008 R2 and SQL Server 2008.
SQL failover clustering on two nodes.

Problem summary:

The failover cluster for a SQL cluster has been failing for the past few days. I found cluster errors in the event logs on both servers. I ran cluster validation 10 times; it failed 2 out of 10.

Messages seen in cluster validation wizard:

Failed to validate file data on cluster disk 4 partition 1, failure reason: The system cannot find the file specified.
An error occurred while executing the test.
There was an error getting information about the running processes on the nodes.
There was an error retrieving information about the Processes from node 'SQL02'.
Not found

================================

Ran chkdsk on the C: drive of both servers - no errors

Ran sfc /scannow on both servers - no errors

Increased the size of the SQL FILESTREAM drive yesterday - still had errors early this morning

================================

highlights of windows events seen:

sql01 event log highlights:

11-14 1:34PM:

event 1055
Health check for file share resource 'SQL Server FILESTREAM share (MSSQLSERVER)' failed. Retrieving information for share 'FSData' (scoped to network name ...SQL) indicated that the share does not exist (error code '53'). Please ensure the share exists and is accessible.

event 1069
Cluster resource 'SQL Server FILESTREAM share (MSSQLSERVER)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

event 1077
Health check for IP interface 'IP Address ....30' (address '....30') failed (status is '1168'). Run the Validate a Configuration wizard to ensure that the network adapter is functioning properly.

event 1069
Cluster resource 'IP Address ....30' in clustered service or application '...SQDtc' failed.

event 7034
The Distributed Transaction Coordinator (6589ecf4-6303-422b-9de3-f90653f68a14) service terminated unexpectedly. It has done this 1 time(s).

more related events, then

event 1135
Cluster node '...SQL02' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

11-17 4:53AM

event 1215
Cluster network name resource 'SQL Network Name (...SQL)' failed a health check. Network name '...SQL' is no longer registered on this node. The error code was '1453'. Check for hardware or software errors related to the network adapter. Also, you can run the Validate a Configuration wizard to check your network configuration.

11-18 1:14AM

event 1215
Cluster network name resource '...SQDtc' failed a health check. Network name '...SQDTC' is no longer registered on this node. The error code was '1453'. Check for hardware or software errors related to the network adapter. Also, you can run the Validate a Configuration wizard to check your network configuration.

1:46AM

event 1055
Health check for file share resource 'SQL Server FILESTREAM share (MSSQLSERVER)' failed. Retrieving information for share 'FSData' (scoped to network name ...SQL) indicated that the share does not exist (error code '1726'). Please ensure the share exists and is accessible.

3:16am

event 6
An I/O operation initiated by the Registry failed unrecoverably.The Registry could not flush hive (file): '\SystemRoot\System32\Config\SOFTWARE'.

4:14am

event 137
The default transaction resource manager on volume G: encountered a non-retryable error and could not start. The data contains the error code.

event 7024
The Cluster Service service terminated with service-specific error Insufficient quota to complete the requested service..

sql02 event log highlights:

11-14 1:34pm

event 1135
Cluster node '...sql01' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

11-18 4:14am

event 1135
Cluster node '...sql01' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

4:15am

event 1069
Cluster resource 'SQL Server Agent' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.
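With this many entries, a quick tally of event IDs makes repeats easier to spot. Below is a minimal Python sketch that counts "event NNNN" lines in pasted text. The sample lists only the event IDs from the sql01 highlights above, in the order they appear; the names SQL01_EVENT_LINES and tally_event_ids are made up for illustration, and the regex assumes each ID sits on its own "event NNNN" line as it does in this post.

```python
import re
from collections import Counter

# Event ID lines copied from the sql01 highlights, in order of appearance.
SQL01_EVENT_LINES = """\
event 1055
event 1069
event 1077
event 1069
event 7034
event 1135
event 1215
event 1215
event 1055
event 6
event 137
event 7024
"""

# Matches a line consisting only of the word "event" and a numeric ID.
EVENT_LINE = re.compile(r"^event[ \t]+(\d+)[ \t]*$", re.IGNORECASE | re.MULTILINE)


def tally_event_ids(log_text: str) -> Counter:
    """Count how often each event ID appears in the pasted log text."""
    return Counter(int(event_id) for event_id in EVENT_LINE.findall(log_text))


counts = tally_event_ids(SQL01_EVENT_LINES)
for event_id, occurrences in counts.most_common():
    print(f"event {event_id}: {occurrences}")
```

A tally like this only summarizes what was pasted. The full event logs from both nodes would be needed to correlate timestamps between sql01 and sql02.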

================================

HP EVA 4400 log excerpts:

01:17:34
29-Oct-2012 Yes 31101 SCell:SAM
SC Event Code: 06324e13 - An HSV300 controller has detected only one port of all Fibre Channel devices on a loop.

01:17:34
29-Oct-2012 Yes 31260 SCell:SAM
SC Event Code: 09cdc305 - A Fibre Channel port has transitioned to the FAILED state.

01:17:34
29-Oct-2012 Yes 3028 SCell:SAM
Cannot find description for SC Event Code: 066a0028

01:17:34
29-Oct-2012 Yes 31101 SCell:SAM
SC Event Code: 06324e13 - An HSV300 controller has detected only one port of all Fibre Channel devices on a loop.

01:17:34
29-Oct-2012 Yes 31260 SCell:SAM
SC Event Code: 09cdc305 - A Fibre Channel port has transitioned to the FAILED state.

02:08:03:545
29-Oct-2012
Controller 2 066a0028 #12857
Corrective action code: 00

02:08:03:531
29-Oct-2012
Controller 2 0319000a #12856
An HSV300 controller has begun discovering devices on the backend loops.
Corrective action code: 00

02:08:03:531
29-Oct-2012
Controller 2 06324e13 #12855
An HSV300 controller has detected only one port of all Fibre Channel devices on a loop.
Corrective action code: 4e

02:08:03:531
29-Oct-2012
Controller 2 09cdc305 #12854
A Fibre Channel port has transitioned to the FAILED state.
Corrective action code: c3

02:08:27:643
29-Oct-2012
Controller 1 031a000a #12853
An HSV300 controller has completed discovering devices on the backend loops.
Corrective action code: 00

02:08:25:329
29-Oct-2012
Controller 1 066a0028 #12852
Corrective action code: 00
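One pattern is visible in the controller entries above: in each one, the "Corrective action code" equals the third byte of the eight-digit event code (for example, 06324e13 goes with 4e and 09cdc305 with c3). The Python sketch below pulls that byte out. This is only an observation drawn from the entries in this excerpt, not a documented HP format, and the function name corrective_action_byte is made up for illustration.

```python
def corrective_action_byte(event_code_hex: str) -> str:
    """Return the third byte (bits 15-8) of an 8-hex-digit event code."""
    value = int(event_code_hex, 16)
    return f"{(value >> 8) & 0xFF:02x}"


# (event code, corrective action code) pairs as they appear in the excerpt.
OBSERVED = [
    ("066a0028", "00"),
    ("0319000a", "00"),
    ("06324e13", "4e"),
    ("09cdc305", "c3"),
    ("031a000a", "00"),
]

for event_code, logged_action in OBSERVED:
    extracted = corrective_action_byte(event_code)
    print(event_code, extracted, extracted == logged_action)
```

If the pattern holds generally, it would let the corrective action code be read straight from the "SC Event Code" lines in the first part of the excerpt, which do not print it separately.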

↧

New to Clusters -Member Server Roles -> Cluster Roles or leave on Member?

November 20, 2012, 8:01 am

We purchased a 60-drive DAS unit with 30 2 TB drives. A maximum of 4 hosts can be connected redundantly to the DAS, and we have 4 hosts connected.

The cluster was originally intended to run Hyper-V VMs. The more I read about clustering and Windows 2012, the more it seems I might want to rethink my member server roles.

The two that I'm thinking about at the moment are File Servers and DHCP. These have

We have 12 TB of file server storage. Should I add the File Server role to the cluster, or stay with my two VM member servers that replicate data?

Same with DHCP?

Now that I have the cluster, I want to be sure I use it to the fullest and not just for Hyper-V VMs. We have 15 VMs that are going to be spread across the 4-host cluster. Five of them get decent usage; the other 10 are pre-production test units or low-usage member or DC servers.

Any comments or suggestions would be great.

Thank you,

Scott

↧

Clustering Resources

September 25, 2009, 6:09 pm

Hi Cluster Fans,

We've published a comprehensive list of cluster resources, which you may find helpful, on our team's blog: http://blogs.msdn.com/clustering/archive/2009/08/21/9878286.aspx

Categories include:

-Useful Sources

-Windows Server 2008 R2

-Core

-Deployment, Migration & Upgrades

-Exchange Server

-File Server, DFS-R, DFS-N & NFS

-Hyper-V

-Miscellaneous

-Multi-Site Clustering

-Network Load Balancing

-Other Resources & Workloads

-PowerShell, Cluster.exe & Scripting

-Print Clustering

-SQL Server

-Utilities

Thanks!
Symon Perriman
Program Manager II
Clustering & High-Availability
Microsoft

SymonP_MSFT

↧

One VM spanning several CSVs?

November 16, 2012, 1:19 pm

I have a client who is looking to get the best performance for heavy-duty virtual machines. The configuration of the virtual machine in the cluster is like this:

a. Operating system on a fixed-size VHD - placed on CSV1

b. SQL database with transaction logs on a second fixed-size VHD - placed on CSV2

c. Other data files and application files on a third fixed-size VHD - placed on CSV3

Although this works, I am having problems with migration between nodes (it fails). My general questions are:

1. Is spreading one VM across several VHDs (each VHD on its own CSV) supported?

2. Is there anything specific that should be taken care of? For example, I am unable to alter dependencies so that all CSVs depend on this VM, i.e. so that moving the VM between nodes automatically moves all dependent CSVs as well. Is it possible somehow to have this dependency?

3. Where should I keep VM configuration files? What is best practice for placing configuration files?

Thank you,

↧