Channel: High Availability (Clustering) forum
Viewing all 6672 articles

DR node 3 showing as down with its network unavailable. Any threshold

Hi Team,

We have a WSFC spanning the primary and DR sites: 2 nodes in primary and 1 node in DR. Unfortunately, node 3 goes down very frequently and we have not been able to find the root cause. Even when the primary nodes are up, node 3 never rejoins by itself; we have to evict it and then rejoin it to the cluster. We ran a test to see how node 3 behaves: we disabled the NICs on node 3 for 5 minutes and then re-enabled them, and it reconnected and rejoined the cluster by itself. As a second test, we again disabled the NICs and kept node 3 in DR down for 45 minutes. When we then re-enabled the network (NICs), node 3 remained in 'down' status.

Is there any threshold after which a down node stops trying to reach the other nodes in the cluster? We tried starting the cluster service on node 3 manually, but even then it never comes online.

The cluster hosts a SQL AG, but I am not sure whether SQL could bring the node down and keep it from coming back online.

Any pointers will be appreciated.

Regards,


Failover cluster server - File Server role is clustered - Shadow copies do not seem to travel to other node when failing over

Hi,

New to 2012 and implementing a clustered environment for our File Services role.  Have got to a point where I have successfully configured the Shadow copy settings.

Have a large (15 TB) disk, S:

Have a VSS drive (volume shadow copy drive), V:

Have successfully configured through Windows Explorer the Shadow copy settings.

Created dependencies in the Failover Cluster Manager console whereby S: depends on V:

However, when I failover the resource and browse the Client Access Point share there are no entries under the "Previous Versions" tab. 

When I visit the S: drive in Windows Explorer and open the Shadow copy dialogue box, there are entries showing the times and dates of the shadow copies that were run on the original node.  So the disk knows about the shadow copies that were run on the original node, but the "Previous Versions" tab has no entries to display.

This is in a 2012 server (NOT R2 version).

Can anyone explain what might be the reason?  Do I have an "issue" or is this by design?

All help appreciated!

Kathy


Kathleen Hayhurst Senior IT Support Analyst



Clustering 2 Windows 2003 Server machines

Hi, I have a Win 2003 machine running 2 third-party applications, each with its own database. For now I have one machine configured with RAID 1, but in case something fails I want to add another node and set up some kind of database synchronization between the 2 servers. Is this possible? I'll need another server for the storage (for example, a NAS4Free NAS machine I can configure), and of course I'll use a newer version of Windows Server. Thank you.

convert 2 nodes cluster to standalone

Hi,

I'd like to convert a 2-node cluster (Server 2008) to a standalone server:

- move all services to one node (node1)

- stop node 2

- delete node2 from cluster

- delete cluster role from node 2

After that, I do not know what to do: convert node1 to standalone, or migrate the disks from node1 to the new standalone server (node2)?

I've searched the forums for help, but I could not find any example like mine. Thanks for your help.


hatem

Load Balancing

Question about Windows Server 2012 Network Load Balancing

My port range is set from 2000 to 2050. When I send a file to the cluster through port 2008, will the file be reallocated to another port, or will it still go through port 2008?

If it stays on 2008, how can I reallocate the transmit port while traffic is flowing?

Best regards

WSFC Virtual IPs and GARP

Where I work we are setting up a new system and I am experiencing a problem I am hoping others have experienced and can offer some recommendations on.

I have setup a new Windows Server 2012 R2 2-node WSFC that is hosting a SQL Server AlwaysOn Availability Group. I have setup AGs before and have never really had problems with them, however now I am starting to see an issue. Our WSFC nodes are on one VLAN and subnet and the clients that connect to SQL (SharePoint servers in this case) are in another VLAN and subnet. Whenever a failover of the AG occurs the SharePoint servers lose connectivity to the AG for approximately 13-15 minutes before connectivity is restored. It is not a routing issue as continuous pings to the WSFC nodes and the WSFC IP never fail, just pings and other connection attempts to the AG listener. It is also not a problem with SQL as I can verify connectivity to a specific AG node.

After looking at this problem with our networking guys, the problem appears to be related to gratuitous ARP (GARP). When failover of the AG occurs, a GARP request is supposed to notify devices on the local network of the updated IP/MAC mapping so that requests to that IP are sent to the right network interface. However, I work in a federal government environment and DISA STIGs mandate that infrastructure routers disable GARP (V-5618, for those that are interested: https://www.stigviewer.com/stig/infrastructure_router__cisco/2013-10-08/finding/V-5618). This seems to have the effect of traffic from outside the local network timing out after a failover until the entry in the VLAN's ARP table expires and gets updated. Disabling that setting for testing showed AG failovers being near instantaneous.
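To make the reasoning above concrete, here is a minimal Python sketch of a router ARP cache that either accepts or ignores gratuitous ARP. It is a toy model, not a description of how any particular router behaves: the class and function names, the listener IP, the MAC strings, and the 900-second timeout are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArpEntry:
    mac: str
    learned_at: float  # seconds since the start of the simulation


class RouterArpCache:
    """Toy ARP cache that ages entries out after `timeout` seconds."""

    def __init__(self, timeout, accept_garp):
        self.timeout = timeout
        self.accept_garp = accept_garp
        self.entries = {}  # maps IP string -> ArpEntry

    def resolve(self, ip, now, owner_mac):
        """Return the MAC the router would forward to for `ip` at time `now`.

        `owner_mac` is the interface that really owns the IP right now; the
        router only learns it when the cached entry is missing or expired.
        """
        entry = self.entries.get(ip)
        if entry is None or now - entry.learned_at >= self.timeout:
            entry = ArpEntry(owner_mac, now)
            self.entries[ip] = entry
        return entry.mac

    def gratuitous_arp(self, ip, mac, now):
        """Unsolicited announcement sent by the new owner after a failover."""
        if self.accept_garp:
            self.entries[ip] = ArpEntry(mac, now)


def outage_seconds(timeout, accept_garp, failover_at):
    """Seconds during which the router still forwards to the old node."""
    listener, old_mac, new_mac = "10.0.0.50", "aa:aa", "bb:bb"  # hypothetical
    cache = RouterArpCache(timeout, accept_garp)
    cache.resolve(listener, now=0.0, owner_mac=old_mac)  # learned pre-failover
    cache.gratuitous_arp(listener, new_mac, now=failover_at)
    t = failover_at
    while cache.resolve(listener, now=t, owner_mac=new_mac) != new_mac:
        t += 1.0  # one-second steps until the stale entry ages out
    return t - failover_at
```

In this model the outage with GARP ignored equals the timeout minus the age of the stale entry at failover, so `outage_seconds(900.0, False, 100.0)` gives 800.0 while the same call with `accept_garp=True` gives 0.0. That matches the behaviour described in the post only qualitatively; the real timeout on the routers in question is not stated.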

Has anyone encountered this or have any recommendations? I am trying to make the case for multiple NICs on the SharePoint servers so that they can communicate with the SQL servers on the same VLAN, but I am getting push back and am trying to see if there is an alternate solution that can be researched.

Joie Andrew "Since 1982"

Cluster Aware Updating Corruption

Hi everyone!

We're getting the below error when a CAU run occurs for one cluster in particular (all the other clusters are fine).

CAU run {00000000-0000-0000-0000-000000000000} on cluster vc11 failed.
Error Message: The data stored by the Cluster-Aware Updating cluster resource is corrupt. The name of the corrupt data is: CauRunState.
Error Code: -2146233088
Stack:
   at MS.Internal.ClusterAwareUpdating.CrmRunStateManager._LoadState()
   at MS.Internal.ClusterAwareUpdating.CrmRunStateManager.<InitSessionStateAsync>d__3b.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at MS.Internal.ClusterAwareUpdating.RunStateImplBase.<InitSessionAsync>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ClusterAwareUpdating.Commands.InvokeCauRunCommand.<_ProcessCluster>d__78.MoveNext()


We have tried removing the cluster role and re-creating it, but the same thing occurs.

I can't find anything else online relating to this corruption error.

Any ideas?


Thanks!

Pter

SQL Server 2008 R2 failed on Windows Server 2012

Hi All

I have a Windows Server 2012 machine and I want to install a clustered SQL Server 2008 R2 instance.

I validated my configuration without any errors and created my Windows cluster without any problems.

I also created the DTC role, and everything is running and online.

But when I try to install the SQL Server failover cluster, these messages appear:

  • Cluster Service verification failed
  • Cluster shared disk available check failed

I tried to install SQL from the command line, but I got the same problem.


Storage Spaces Direct Windows Server 2016 Lab Testing physical disks issue

Alright folks, first of all, thanks for the help if I get an answer.

We are always on the lookout for new stuff to test, and this time we want to test Storage Spaces Direct.

If it proves to be good, we can implement it in production.

For testing purposes I am using 2 HP DL380 Gen9 servers. I had some problems getting the disks recognized as SAS in Windows Server, but after a bit of Google time I found that it is possible after deleting all of the RAID configs on the HP P440ar and putting it into HBA mode.

Then, after upgrading the server with the latest SPP, I am now able to see the drives come up as SAS. So I thought I was finally able to start using S2D, but no, that's not the case.

At this point the 2 servers are identical, and they both have 2 SAS HDD 10K 146GB drives and 4 SAS SSD 460GB drives.

I installed Windows Server 2016 in UEFI mode on one of the 460GB drives. You could ask me why, but I had some issues installing and I just wanted a working system to test out S2D. I didn't care about the storage at that point because after the lab I'm erasing the configuration.

At this point, all I am trying to do is execute the command Enable-ClusterStorageSpacesDirect, but I get an error straight away saying S2D is not supported on my system. I verified that the drives are shown as SAS in Server Manager, but when I run the cluster validation test, and specifically the Storage Spaces Direct tests, I get the following error on one of the drives:

Disk is a boot volume. Disk is a system volume. Disk is used for paging files. Disk partition style is GPT. Disk has a System Partition. Cannot cluster a disk with a System Partition. Disk has a Microsoft Reserved Partition. Disk has a Basic Data Partition. Cannot cluster a disk with a Basic Data Partition. Disk has a Microsoft Recovery Partition. Disk type is DYNAMIC.

This error makes sense, since I had to install my OS on one of the drives, but what does it mean for the configuration? Did I make a mistake, or can I exclude one of the disks? I could use some help understanding this a bit better.
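When reading a long validation message like the one quoted above, it can help to separate the descriptive sentences from the ones that actually state a blocker. The following Python sketch does only that text split; the function name is hypothetical, and the assumption that blockers are exactly the sentences beginning with "Cannot cluster" is drawn from this one message, not from any documented format.

```python
from __future__ import annotations


def blocking_reasons(validation_message: str) -> list[str]:
    """Return the sentences that say why the disk cannot be clustered.

    Assumption: sentences are separated by periods, and a blocker is any
    sentence starting with "Cannot cluster" (as in the message quoted above).
    """
    sentences = [s.strip() for s in validation_message.split(".") if s.strip()]
    return [s for s in sentences if s.startswith("Cannot cluster")]


if __name__ == "__main__":
    message = (
        "Disk is a boot volume. Disk is a system volume. "
        "Disk is used for paging files. Disk partition style is GPT. "
        "Disk has a System Partition. "
        "Cannot cluster a disk with a System Partition. "
        "Disk has a Microsoft Reserved Partition. "
        "Disk has a Basic Data Partition. "
        "Cannot cluster a disk with a Basic Data Partition. "
        "Disk has a Microsoft Recovery Partition. Disk type is DYNAMIC."
    )
    for reason in blocking_reasons(message):
        print(reason)
```

For the message in this post, the two blockers are the System Partition and the Basic Data Partition, both of which are consistent with the drive being the one that holds the operating system.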

Cluster disks showing reserved on both nodes.

Hello Folks,

We have configured file services on a Windows Storage Server 2012 R2 cluster. The team recently restarted both nodes at the same time, and after that my cluster went down. I am not able to open the cluster because the cluster service is not stable; it keeps restarting due to loss of the quorum disk. When I checked Disk Management on both nodes, I found that all cluster disks are reserved. I tried to remove the reservation with the following:

1. Clear-ClusterDiskReservation: the command executed successfully, but the disk still shows as reserved.

2. Cluster node Server name /clear

3. Tried to remove the attacheddisk registry settings; after a server reboot that entry comes back automatically.

 

Please help me to remove cluster disk reservation. 



Thanks, Chinmay.

Random Cluster Failures

Hey guys, 

Really need a hand here. I have a production cluster with 2 R630s (256 GB RAM) and 3 R610s (192 GB RAM), 1 of which is a hot spare, on 2012 R2 Datacenter. Recently I updated the NICs with Microsoft drivers (Intel Ethernet Server Adapter X520-2 driver, 2012 R2 Datacenter) and shortly afterwards started having a lot of VMs randomly failing on random hosts, a few at a time.

160 VMs, averaging 30-40 VMs per host.

After updates, reinstalls of the actual Intel drivers, and pushing out VM hardware reconfigurations, I finally found a huge issue: the driver update cut the VMQ ports back to the default 32.

I reconfigured all of them back to 64, and for a few days I had no issues and was sure I had found the cause.

Came in this morning to find that over the weekend there were another 15 reboots.

So far the only commonality I've found is that this has only happened to our Gen 1 systems (we have 90; so far 51 have had reboots).

Here's a snippet from the cluster log around a VM failure:

0000117c.00002cfc::2016/12/19-07:34:57.705 INFO  [RHS] Resource Virtual Machine Configuration <VM NAME> called SetResourceLockedMode. LockedModeEnabled0, LockedModeReason0.
00000d8c.000029d0::2016/12/19-07:34:57.705 INFO  [RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine Configuration <VM NAME>', gen(0) result 0/0.
00000d8c.000029d0::2016/12/19-07:34:57.705 INFO  [RCM] Virtual Machine Configuration epcr-harvardil: Flags 1 removed from StatusInformation. New StatusInformation 0
0000117c.00002cfc::2016/12/19-07:34:57.705 INFO  [RHS] Resource Virtual Machine <VM NAME> called SetResourceLockedMode. LockedModeEnabled0, LockedModeReason0.
00000d8c.000029d0::2016/12/19-07:34:57.705 INFO  [RCM] <VM NAME>: Removed Flags 1 from StatusInformation. New StatusInformation 0
0000117c.00002cfc::2016/12/19-07:34:57.705 INFO  [RES] Virtual Machine <Virtual Machine <VM NAME>>: Current state 'Terminated', event 'VmStopped'
00000d8c.000029d0::2016/12/19-07:34:57.705 INFO  [RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine <VM NAME>', gen(3) result 0/0.
00000d8c.00000944::2016/12/19-07:34:57.705 INFO  [GUM] Node 3: executing request locally, gumId:71035, my action: /dm/update, # of updates: 1
00000d8c.00000f90::2016/12/19-07:34:57.705 INFO  [DM] Starting replica transaction, paxos: 460:460:576650, smartPtr: HDL( 2c83f5f2b0 ), internalPtr: HDL( 2c85294340 )
00000d8c.00000f90::2016/12/19-07:34:57.720 INFO  [DM] Finished replica transaction, paxos: 460:460:576650, smartPtr: HDL( 2c83f5f2b0 ), internalPtr: HDL( 2c85294340 ), status: 0
00000d8c.00000944::2016/12/19-07:34:57.720 INFO  [RCM] HandleMonitorReply: INMEMORY_NODELOCAL_PROPERTIES for 'Virtual Machine <VM NAME>', gen(3) result 0/0.

Logs are also littered with these SQL errors, which is what eventually led me to update the hardware configurations of the VMs:

00000af0.0000239c::2016/12/19-03:22:41.113 ERR   [RHS] s_RhsRpcCreateResType: (126)' because of 'Error loading resource DLL fssres.dll.'
00000cec.000006f8::2016/12/19-03:22:41.113 INFO  [RCM] result of first load attempt for type SQL Server FILESTREAM Share: 126
000014e0.000026c0::2016/12/19-03:22:41.129 INFO  [RES] Physical Disk: HarddiskpIsPartitionHidden: device \Device\Harddisk2\ClusterPartition2 0
00000af0.0000239c::2016/12/19-03:22:41.238 ERR   [RHS] s_RhsRpcCreateResType: (126)' because of 'Error loading resource DLL hadrres.dll.'
00000cec.00001e88::2016/12/19-03:22:41.238 INFO  [RCM] result of first load attempt for type SQL Server Availability Group: 126
00000af0.0000239c::2016/12/19-03:22:41.254 ERR   [RHS] s_RhsRpcCreateResType: (126)' because of 'Error loading resource DLL fssres.dll.'
00000cec.00001e88::2016/12/19-03:22:41.254 INFO  [RCM] result of first load attempt for type SQL Server FILESTREAM Share: 126
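For triage, the ERR lines above can be tallied by the resource DLL that failed to load. Win32 error 126 is ERROR_MOD_NOT_FOUND ("The specified module could not be found"), so a count per DLL shows which resource types the node cannot load at all. The Python sketch below assumes the line layout seen in this excerpt (hex process and thread IDs, timestamp, level, bracketed component); the function name and regular expressions are illustrative, not part of any cluster tooling.

```python
from __future__ import annotations

import re
from collections import Counter

# Layout assumed from the excerpt above:
# <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.mmm> ERR   [COMPONENT] message
ERR_LINE = re.compile(
    r"^(?P<pid>[0-9a-f]+)\.(?P<tid>[0-9a-f]+)::"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"ERR\s+\[(?P<component>\w+)\]\s+(?P<message>.*)$"
)
DLL_NAME = re.compile(r"Error loading resource DLL (?P<dll>[\w.-]+?\.dll)")


def failed_dll_counts(log_text: str) -> Counter:
    """Count ERR lines per resource DLL that could not be loaded."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = ERR_LINE.match(line.strip())
        if not match:
            continue  # INFO/WARN lines and anything not in the assumed layout
        dll = DLL_NAME.search(match.group("message"))
        if dll:
            counts[dll.group("dll")] += 1
    return counts
```

Run against the seven lines quoted above, this counts fssres.dll twice and hadrres.dll once. Both are SQL Server resource DLLs, and failing to load them on a node without SQL Server installed may be unrelated to the VM reboots; the tally only shows where the noise comes from.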

Any Ideas???

Can't configure Cluster Aware Updating

I'm trying to install the Cluster-Aware Updating service, but I'm not able to fix this error:

“Unable to create the CAU clustered role because a Network Name resource could not be created. This can occur if a computer account (virtual computer object) for the role could not be created in the domain. Check the event log for more information. If the cluster name account does not have permissions to create the object, you can pre-stage a computer account in Active Directory. Then, use the Add-CauClusterRole Windows PowerShell cmdlet with the VirtualComputerObjectName parameter to create the CAU clustered role. For more information about pre-staging computer accounts, see http://go.microsoft.com/fwlink/p/?LinkId=237624.”

I have pre-staged a computer account: CAU-ATC
The computer account of the cluster is: ATC-CLUSTER

I gave the ATC-CLUSTER account permission to create computer accounts in the OU of the cluster.

But still, I get this error.

The error.

The OU with the accounts.

Enter the computer object...

The permissions.

Error: The computer is joined to cluster when creating the Cluster

Hello Guys,

I created a 2-node cluster for Hyper-V. Everything was great and worked perfectly. The next day the storage hung and the cluster stopped working. I destroyed the cluster, removed the cluster feature from both nodes, deleted the cluster computer object from AD, and deleted the storage. After we fixed the storage, I reconnected it, installed the cluster feature on both nodes, and validated the configuration; everything came back 100% green.

While creating the cluster, I hit an "Unable to successfully cleanup" error. I kept trying, removed the antivirus, and restarted the servers many times, and then ended up with another error: as soon as I add the server name in the Create Cluster wizard, it tells me that the computer I'm adding is joined to a cluster.

I think I need to do some cleanup of the previous cluster. Can I have some help here?

Regards..

Nour



Sharing entire cluster volume

Hi, 

I am trying to share the entire cluster volume in one shot instead of creating a share for each subfolder under the volume individually. Is this possible?



Hyper-V Server 2016 with Intel Core2 Duo E7500 CPU

I just performed a rolling cluster upgrade and everything seemed to go well. All the nodes were successfully upgraded after passing "SLAT" testing per the documentation. Everything is working except that 2 of my nodes will not start any VMs.  Storage and VMs can be moved onto the nodes, but when you try to start a VM you receive this message:

'Virtual Machine MYMACHINE' failed to start.
'MYMACHINE' failed to start. (Virtual machine ID 5FBD1590-7B64-4972-ADAD-E3D578B35349)
Virtual machine 'MYMACHINE' could not be started because the hypervisor is not running (Virtual machine ID 5FBD1590-7B64-4972-ADAD-E3D578B35349). The following actions may help you resolve the problem: 1) Verify that the processor of the physical computer has a supported version of hardware-assisted virtualization. 2) Verify that hardware-assisted virtualization and hardware-assisted data execution protection are enabled in the BIOS of the physical computer.  (If you edit the BIOS to enable either setting, you must turn off the power to the physical computer and then turn it back on.  Resetting the physical computer is not sufficient.) 3) If you have made changes to the Boot Configuration Data store, review these changes to ensure that the hypervisor is configured to launch automatically.

I have verified that all of these options are enabled in the BIOS.

Has anyone else had any success getting a VM to fire up on a Core 2 Duo PC with Hyper-V Server 2016 installed on it?

These machines worked fine with Hyper-V Server 2012 R2


Node failure in S2D Hyperconverged cluster

The data within an S2D cluster is resilient to a node failure, but what happens to the VMs that were running on the failed node?
Are they relaunched automatically on the remaining nodes?

Windows Failover cluster between Physical and Virtual nodes

Dear Team,

One of our customers wants to build a SQL cluster where the 1st node is a physical server and the 2nd node is a VM running on VMware ESX 6.0.

As per my understanding, MS supports this type of configuration in Windows 2012.

During Failover Cluster Validation, it failed at validating “MS MPIO based disks”.

We were able to continue implementing the cluster by skipping this test, since we know these 2 nodes have different versions/types of MPIO (physical and virtual).

We would like to know whether this will be a major issue or whether we can just continue using this setup, as this is going to be one of the critical database servers in production.

Please point us to the Microsoft document that describes the best supported configuration.

Thanks,

ABUL

Setup New Windows 2012 R2/2016 Server As Domain Controller and Clustering

Current Setup

1. Windows Server 2008 R2 servers in a workgroup, running Remote Desktop Services (RDS, formerly Terminal Services) to give remote offices access to a medical billing app.

2. SQL Server 2008 R2 Database

3. Application (1 Main Medical Billing Application)

4. About 100 users with 70 workstations

5. (3) Remote offices (remote in to the RDP server using Remote Desktop Services(RDS) to access the Main medical billing application)

6. The 2 servers are located in the main office and the rest of the users are located in 3 different remote offices

7. Workstations in the remote office and main office running Windows 7 Pro and Windows 10 Pro

Issues: users have been complaining about system slowness, and the Win2k8 R2 servers need to be retired

Proposed New Scenario with New Servers

1. Dell Servers ( 2 PowerEdge T330 Servers and 2 PowerEdge T630 Servers)

2. OS: Windows Server 2016

3. Database: SQL Server 2014/2016

New Planning For the Project

1. Use 1 of T330 server as a primary domain controller

2. Use the other T330 as a backup domain controller

3. Set up the 2 T630s as cluster hosts for Hyper-V VMs to host the Medical Billing App and SQL Server database

4. Use a VM or 2 VMs as RDS/Terminal Server for Medical Billing Apps for the remote offices to access the Medical Billing Apps

I need some help with the proposed setup above. Money is tight and I need to do this in the most efficient and economical way possible to save time and money.

a.) The servers come with onboard SATA RAID, should I use the onboard RAID or should I purchase external RAID hardware Controller?

b.) What is the most efficient way to setup these servers that provide flawless remote connection for the remote office users? NOTES: Remember, the Medical Billing Software is very expensive, therefore, it must be installed the same way as in the previous (actual environment) settings above on 1 Server and share via terminal services (Remote Desktop Services, RDS)

c.) How about DirectAccess vs. VPN for the remote offices? Is DirectAccess a feasible solution over VPN?

d.) What is the best way possible to setup this new system as mentioned above?

I look forward to reading your input soon!

Thanks 

Cluster Storage Disks vs. Pools

I'm setting up a Hyper-V failover cluster for the first time and am unsure of when and why to put my disks into a pool rather than just create disks. I have 2 LUNs on my DAS: one RAID 10 with 15K drives for SQL and another RAID 10 with 7.2K drives for general storage. I think creating two disks and not using a pool makes the most sense. However, I'm unsure under what circumstances using a pool would be better.

Delete and recreate BitLocker-encrypted clustered file shares?

Hi all - I have a Server 2012 Cluster connecting to a SAN with Basic Disks shared out to 2 different Cluster Shared Volumes.

However, I would like to extend one of the 4TB drives to 6TB (only supported with Dynamic Disks, AFAIK, which in Server 2012 are only supported with an add-on from Symantec which I won't be able to purchase, AFAIK), and I would like to install iSCSI on the Cluster.  The CSVs are BitLocker-protected.

What's the best way to go about this?  Is it as simple as removing the cluster roles, installing the iSCSI services (which can only be set for Server 2012 Clusters at creation - it just won't work trying to install and configure after creating the Cluster), and reinstalling the Cluster roles with the same names on the same nodes, with all the sub-shares/permissions left intact?  I think that I can add the storage to the CSV Disk pool if it doesn't pick up the extra 2TB that have been provisioned on the SAN side, but even that seems like it could go sideways with BitLocker.

Thanks!


-Ken
