Channel: High Availability (Clustering) forum
Viewing all 6672 articles

Move cluster to another location

Hi

I have built a Windows Server 2012 R2 cluster and it is all working fine. However, both nodes in the cluster have to be physically moved to another location.

I have never done anything like this.

The location I am supposed to move the servers to is just a few meters away. Would it be a proper solution to shut down one node first, move it to the new location, bring it back online, and then shut down the other node?

At the moment the servers don't have any roles or even a disk attached, so basically it is just a clean cluster with nothing on it. Is it OK to just shut down and move? As I said, there is not even a shared disk or quorum; it is just a bare cluster.


Dalibor Bosic


File share witness on a NAS -- permission problem

My research has not yet yielded any results.  I'm trying to configure a file share witness on a Windows 2012 failover cluster.  The file share is on a NAS.  My login has full control on the file share.  When I try to configure the witness using the Configure Cluster Quorum Wizard, I get an error message:

"You do not have sufficient privilege to get the information for the file share '[share]'."

The cluster service on all nodes is running as the Local System account.  We tried giving the machine accounts full control on the file share, and that didn't work.

I would appreciate any suggestions.  Thanks.

SMB Access denied for Cluster Role Resource

Dear All,

I have a Windows Server 2008 R2 file server failover cluster in production. As part of a DR failover test, I have created a standalone Windows Server 2008 R2 server with the File Server role enabled.

Currently the file server disk is replicated to DR with a third-party product. During failover, the production cluster role is taken offline and the production disk is attached to the standalone DR machine.

Once the disk is attached to the DR host, the cluster role's DNS "A" record is changed to point to the IP address of the DR server.

When users try to access their home folder or a shared folder, they get an access denied error. We tried both the \\DNS name and the FQDN and got the same access denied error.

When I log in to any workstation or server as a local administrator and access the same SMB share using the \\DNS name and the FQDN, it works fine.

Any ideas?

Failover Cluster Manager in Windows 10

I am trying to figure out whether there is a way to connect from Failover Cluster Manager on Windows 10 to a failover cluster running on Server 2008 R2.

I know the versions aren't compatible, with Windows 10 being version 10.0 and 2008 R2 being version 6.1. Has anyone tried this, or is anyone having the same issue?

Thanks.




how to remove the Client Access point from cluster?

hi, experts

On a Windows Server 2012 cluster, I added an unneeded Client Access Point to my role. How do I remove it?

Regards

Garey

active-active to active-passive

hi
I want to know how I can change an active-active cluster role to active-passive.

In other words, I want to change an active server to a passive server.

How to rebalance a Cluster Virtual Disk in Windows Server 2016

Hi,

I have Windows Server 2016 with Storage Spaces Direct configured.

After one of the disks failed (and was successfully repaired), the virtual disk now reports the status "needs rebalance".

Any idea on how to fix this?

Repair-VirtualDisk and Repair-StoragePool didn't work, and Optimize-StoragePool did not work either.


How to import sched task XML into cluster

I have a 2-node Windows Server 2012 cluster with a SQL Server 2008 R2 resource.  I want to create a cluster-aware scheduled task which runs a batch file on weekdays at 8am and every hour for 12 hours thereafter.  I already have a scheduled task which I imported from a standalone SQL server into both of my cluster nodes before I realized that this does not create a cluster-aware task.  I can export the existing task as an XML file.  I can use this PowerShell to import it into the cluster to make it cluster-aware...

Parameter Set: Xml
Register-ClusteredScheduledTask [-TaskName] <String> [[-TaskType] <ClusterTaskTypeEnum> ] [-Xml] <String> [[-Cluster] <String> ] [[-Resource] <String> ] [-AsJob] [-CimSession <CimSession[]> ] [-ThrottleLimit <Int32> ] [ <CommonParameters>]
comes from here...

https://technet.microsoft.com/en-us/library/jj649806(v=wps.630).aspx

Are there any examples of using that syntax?  For example, what does the -Xml string parameter look like?  The above article doesn't have an example of using -Xml.  Thanks.




Is it possible to increase LUN sizes once it is mapped as cluster resource for SQL instance?

Hi Team,

We have a SQL Server failover cluster running multiple SQL instances on Windows failover clustering. We are using mount points, with volumes mounted as folders. We have a mount point of around 3 GB and are wondering whether it is possible to increase the LUN size for mount points without breaking the cluster or the instance. If I take the mount point and its mounted volumes offline, would clustering allow me to increase its size? Or is that not possible unless we first uninstall the SQL instance that is using this mount point, then reformat and reassign the mounted volumes?

Could I just stop the SQL named instance in the cluster console, bring the disk offline, and then increase the mount point LUN size? Or is there another way, such as presenting another LUN and merging it with the existing one (like we sometimes extend a drive on a machine with unallocated space)?

Any pointers pls. Thanks

Regards,

Event 1207, Source: Microsoft-Windows-FailoverClustering-Cluster network name resource 'Network Name' cannot be brought online

Hi,

The following event is logged daily on my SCC mailbox server.

Log Name:      System

Source:        Microsoft-Windows-FailoverClustering
Date:          8/26/2012 11:17:14 AM
Event ID:      1207
Task Category: Network Name Resource
Level:         Error
Keywords:      
User:          SYSTEM
Computer:      MAILBOX01.DOMAIN.LOCAL
Description:
Cluster network name resource 'Network Name (MAIL01)' cannot be brought online. The computer object associated with the resource could not be updated in domain 'DOMAIN.LOCAL' for the following reason:
Unable to update password for computer account.

The text for the associated error code is: Access is denied.

 
The cluster identity 'CLUSTER01$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{baf908ea-3421-4ca9-9b84-6689b8c6f85f}" />
    <EventID>1207</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>19</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2012-08-26T08:17:14.773Z" />
    <EventRecordID>541802</EventRecordID>
    <Correlation />
    <Execution ProcessID="6364" ThreadID="8036" />
    <Channel>System</Channel>
    <Computer>MAILBOX01.DOMAIN.LOCAL</Computer>
    <Security UserID="S-1-5-18" />
  </System>
  <EventData>
    <Data Name="ResourceName">Network Name (MAIL01)</Data>
    <Data Name="DomainName">DOMAIN.LOCAL</Data>
    <Data Name="FailureString">Unable to update password for computer account</Data>
    <Data Name="Status">Access is denied.
</Data>
    <Data Name="ClusterIdentity">CLUSTER01$</Data>
    <Data Name="BinaryParameterLength">4</Data>
    <Data Name="BinaryData">05000000</Data>
  </EventData>
</Event>

The thing is my Cluster name resource is in fact ONLINE. The Cluster identity does have the necessary permission to update the object. Basically everything is working as it's supposed to. Still I am getting this event once daily at a particular time.

I would like to know why I am getting this event and any way to get rid of it? 

Thanks in advance
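
For anyone tracking when and for which resource this event recurs, here is a minimal Python sketch (not part of the original post) that pulls the key fields out of the Event Xml shown above. The function name summarize_event and the trimmed EVENT_XML sample are assumptions made for illustration; only the standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace used by the event XML shown in this post.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event from this post (only the fields read below).
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID>1207</EventID>
    <TimeCreated SystemTime="2012-08-26T08:17:14.773Z" />
    <Computer>MAILBOX01.DOMAIN.LOCAL</Computer>
  </System>
  <EventData>
    <Data Name="ResourceName">Network Name (MAIL01)</Data>
    <Data Name="DomainName">DOMAIN.LOCAL</Data>
    <Data Name="FailureString">Unable to update password for computer account</Data>
    <Data Name="Status">Access is denied.
</Data>
    <Data Name="ClusterIdentity">CLUSTER01$</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text):
    """Return event ID, timestamp, computer, and named EventData fields as a dict."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    summary = {
        "EventID": system.findtext("e:EventID", namespaces=NS),
        "TimeCreated": system.find("e:TimeCreated", NS).get("SystemTime"),
        "Computer": system.findtext("e:Computer", namespaces=NS),
    }
    # Each <Data Name="..."> element becomes one key; strip the trailing newline
    # that the Status field carries in the original XML.
    for data in root.findall("e:EventData/e:Data", NS):
        summary[data.get("Name")] = (data.text or "").strip()
    return summary


summary = summarize_event(EVENT_XML)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that the SystemTime attribute carries a Z suffix (UTC), while the Date line in the event header shows local time, so the two differ by the server's UTC offset.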

Shadow Copies on 2012 R2 File Server Cluster

Hello all!

I've inherited a two node physical 2012 R2 file server cluster that contains a few SMB shares on a single clustered disk.  I'd like to enable shadow copies for this shared disk but want to store the shadow copy data on its own shared disk as per shadow copy best practices (at least on non-clustered file servers).

I created a second cluster disk, assigned it to the same resources as the SMB clustered disk. 

Now historically I've enabled shadow copies through computer management but I want to ensure the shadow copies are cluster aware so in the Failover Cluster Manager I open Storage | Disks and right click the SMB cluster disk and click the Shadow Copies tab. From the tab I can Enable shadow copies, this sounds like what I'm looking for, unfortunately it does not give me an option to choose a disk/volume to store my shadow copy data, so this can't be right.

My next step was to connect computer management to the cluster virtual server name (NOCFS4) and through the System Tools | Shared Folders | All Tasks | Configure Shadow Copies it shows me the correct number of shares on the SMB cluster disk plus I can see the Settings button for configuring the location and size limit for shadow copies, however once I tweak the shadow copy location and size settings, click Ok and click Enable to turn on the shadow copies I get a long pause and an error about not being able to create a schedule. So it seems connecting to the cluster virtual object is not the answer either.

That leaves using computer management to connect to one of the physical server nodes of the cluster. When I open the shadow copy interface on the physical node I note that it shows a 0 for the number of shares it detects on the SMB clustered disk. This doesn't surprise me since this interface isn't cluster resource aware.

So I'm stuck. Does anyone know the "Microsoft way" to enable shadow copies on a clustered disk while storing the shadow copy data on a second cluster disk both attached to the same cluster resource?

Storage Spaces Direct

Hi,

Does the -CacheMode Disabled parameter for Enable-ClusterS2D still exist in the release version of Windows Server 2016?

I am sure it worked in TP5 but I am getting the following message

Enable-ClusterStorageSpacesDirect : A parameter cannot be found that matches parameter name 'CacheMode'.

Cheers,

Iain

Server 2012R2 Failover Clustering - Clusterstorage folder - Empty Folders

We have 8 virtual hosts connected to an HP P2000 SAN. We use separate networks for iSCSI, Live Migration, CSV and the main 'Production' network.

We are finding that VMs drop off the network with a "Critical error". What has in fact happened is that the host has lost its connection to one of the Cluster Storage folders under C:\ClusterStorage. If you then look at another host, it can see these folders and their contents. So we have to import the VM onto that host to get it functional again.

We have checked iSCSI initiator and all connections are there. Also MPIO appears to be correct. Rebooting the host seems to fix the issue and the folders contain VMs again. However when you try and move VMs back onto that host, some of the folders go offline again.  Cluster validation has been run on all hosts and no issues are reported.

We have various event log entries including this:

Cluster Shared Volume 'CSV19' ('CSV19') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

Has anybody seen this issue before?

Thanks in advance

Shaun

Windows 2012 Fail-Over cluster losing cluster name

Hello

We are running a two-node failover cluster on Windows Server 2012 R2 with four SQL Server instances, connected to the network via an HP CN1200E converged network adapter for LAN communications and iSCSI LUN access.

A couple of days ago the cluster service failed on node 2 while all instances were running on it.

Event logs show that issues occurred with some Windows cluster resources:

Event 1215 was raised on node 2.
Cluster network name resource 'Cluster Name' failed a health check. Network name 'PROD-SQL-BU' is no longer registered on this node.  The error code was '-1073741663'. Check for hardware or software errors related to the network adapter. Also, you can run the Validate a Configuration wizard to check your network configuration.

All cluster resources then failed and all SQL Server instances were activated on node 1.

I have to admit I am a bit puzzled.
I am NOT a beginner regarding failover clusters; however, I do not know what to do to solve the issue with this cluster.

This cluster runs many resources as you can see in the summary below:

> Role "SQL Server (AX)"

  • Role:Analysis Services (AX)
  • Storage:3 LUN
  • Server name :PSBU(IP address x.x.x.36)
  • Other ress. :SQL Server (AX), SQL Server Agent (AX)

> Role "SQL Server (DWH)"

  • Role:Analysis Services (DWH)
  • Storage:3 LUN
  • Server name :PSBU-DWH(IP address x.x.x.196)
  • Other ress. :SQL Server (DWH), SQL Server Agent (DWH)
  • File server :\\PSBU-DWH

> Role "SQL Server (M1)"

  • File server :\\PSBU-M1
  • Storage:3 LUN
  • Server name :PSBU-M1(IP address x.x.x.37)
  • Other ress. :SQL Server (MISC1), SQL Server Agent (MISC1)

> Role "SQL Server (SP)"

  • Storage:3 LUN
  • Server name :PSBU-SP(IP address x.x.x.127)
  • Other ress. :SQL Server (SP), SQL Server Agent (SP)

As I said earlier, network access is made using an HP CN1200E converged network adapter.
The card has an "iSCSI" personality allowing both classic LAN communications and iSCSI communications.

Windows Server sees both 10 Gbit ports, and a team is created using both 10 Gbit CN1200E ports in switch independent mode.

Another 1 Gbit adapter connection on each node allows cluster communications between nodes using a non-routed IP address.
(This is old-fashioned but was set up to prevent complete loss of communications between nodes if the CN1200E card fails.)

All monitoring software and probes showed that the network interfaces of the CN1200E adapter did not fail.
The Cisco logs also showed that network communication was never lost with the 10 Gbit adapter.

CLUSTER VALIDATION REPORT

I ran the report this morning.
It showed that the configuration was not correct for one specific setting of each cluster role: for each role, the report advised that I should set the parameter "RegisterAllProvidersIP" to 0 to prevent issues.

I changed that setting and it will be applied the next time the roles move from one node to another.

QUESTIONS

What should I also check?

Florent




Add Node to SQL Server failover Cluster failed with invalid SKU error

Dear Team,

After creating a new SQL Server failover cluster node, I am trying to add the 2nd node to the SQL failover cluster. After entering the product key (the same key used for the 1st node installation) and hitting Next, I get an error: SQL Server Setup encountered the following error: The current SKU is invalid.

I would appreciate your support on this as soon as possible.

Regards,

Hakim. B


Hakim.B Sr.System Administrator


Migrate 2008 R2 File Server Cluster to a 2012 R2 Standalone File Server

I need to migrate a Win 2008 R2 File Server Cluster to a 2012 R2 Standalone File Server. The migration includes all files, permissions and shares to a standalone server. After the migration I want the server to have the same name as the cluster service name. 

Unable to connect to cluster

Hi Team,

I am not able to connect to the cluster on Windows Server 2012 and am getting the error message below.




Sivakumar Thayumanavan

Any better way to do it?

We have a 2-node Windows Server 2008 R2 cluster with a node and file share witness quorum.
The cluster resource (DHCP) is on cluster disk F.
Now we need to decommission the storage which holds the file share and the cluster disk.
We need to switch the file share and the cluster disk to new storage.
My plan is to take the cluster disk offline and add a new one, change the new one to drive F,
and copy the files from the old disk to the new one.
Then change the quorum from node and file share witness
to node majority, and then change it back to node and file share witness
(with the share on the new storage).

Any better way to do it?

Thank you!

NLB - Duplicate packets

Hi All,

I am running Server 2012 R2 and I have configured NLB between two of my webservers.

When I ping the NLB IP address from a Mac or Linux box I notice that I get duplicate packets. I send 10 packets and get 20 returns.  Is this normal?  I would have thought the NLB would send the ping to one address to answer not both.

10 packets transmitted, 10 packets received, +9 duplicates, 0.0% packet loss
round-trip min/avg/max/stddev = 0.222/1.353/10.805/3.238 ms

Thanks.
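
To compare duplicate counts across several test runs or clients, here is a minimal Python sketch (not part of the original post) that reads the ping summary line quoted above. The function name parse_ping_summary is an assumption made for illustration, and the sketch assumes the client reports duplicates separately from "packets received", as the summary line suggests.

```python
import re

# Summary line as printed by the ping client in this post.
SUMMARY = "10 packets transmitted, 10 packets received, +9 duplicates, 0.0% packet loss"

# The "+N duplicates" part is optional, so lines without duplicates also parse.
PATTERN = re.compile(
    r"(?P<sent>\d+) packets transmitted, "
    r"(?P<received>\d+) packets received, "
    r"(?:\+(?P<duplicates>\d+) duplicates, )?"
    r"(?P<loss>[\d.]+)% packet loss"
)


def parse_ping_summary(line):
    """Return sent, received, duplicate, and total reply counts from a ping summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a ping summary line: {line!r}")
    sent = int(match.group("sent"))
    received = int(match.group("received"))
    duplicates = int(match.group("duplicates") or 0)
    return {
        "sent": sent,
        "received": received,
        "duplicates": duplicates,
        # Assumption: duplicates are counted in addition to "received".
        "total_replies": received + duplicates,
        "loss_percent": float(match.group("loss")),
    }


result = parse_ping_summary(SUMMARY)
print(result)
```

Under that assumption, the summary quoted above corresponds to 19 replies for 10 requests.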

Scale Out File Server SMB redirection locking up CSVs

Problem - Physical hosts have HyperV running and a vhdx located in a SOFS CSV (HyperV hosts different than SOFS cluster nodes).  The CSV locks up during startup of the VM when SMB redirection occurs, or when trying to move a CSV with an active SMB connection between cluster nodes.

All physical hosts and VMs are Windows 2012 R2 with updates to ~July 2016
All physical hosts are Cisco C220s with latest OS updates and 1 update behind on firmware
SOFS is a two physical node cluster with SAS connected JBOD
4 CSVs exist, all exhibiting the same issue
SOFS cluster nodes have the below networks:
Mgmt - teamed 10G - no cluster use
cluster0 - single 10G nic - cluster only
cluster1 - single 10G nic - cluster only
SOFS0 - single 10G nic - cluster/client
SOFS1 - single 10G nic - cluster/client (currently set to none for troubleshooting)
Backup - Teamed 10G - no cluster use
LiveMigration - Teamed 10G no cluster use/only network for live migrations
Cluster validation runs clean
When nothing is connected to the CSV shares I can fail CSVs and SOFS role without any errors
Currently each CSV is used by a single HyperV server and has a single vhdx in it.

HyperV host networks
SOFS0 - single 10g nic
SOFS1 - single 10g nic
Backup Team
Mgmt Team
Customer Network Team

I believe both problems are related;
Problem 1)
CSV share is owned by SOFSA
When I boot a VM with a secondary vhdx located in SOFS (OS is in local RAID disk), checking the SMBClient logs on HyperV host and SMBServer logs on SOFS hosts I can see:
HyperV host hits SOFSB.  
HyperV host connects and share is seen as asymmetric/continuous availability transfer.  Witness registration completes.  
SOFSB issues redirect to SOFSA.  
HyperV host gets redirection request and establishes connection to SOFSA (4 event log messages, SMB client reconnect, session reconnect, share reconnect and witness registration). 
In the same second as the previous 4 SMB reconnect messages, but last in sequence (so the 5th message), a message is received to redirect to another cluster node.
HyperV loses the session and share during reconnect; the SMB client is successfully moved, but there are no messages on session or share reconnect.
After 59 seconds, on SOFSA I have errors that the re-open failed (event ID 1016), client session expired
After 60 seconds HyperV registers a request timeout due to no response from server.  Server is responding to TCP but not SMB (event id 30809)
HyperV host then immediately registers a connections to SOFSB for the share, goes through the same redirection sequence to SOFSA (who owns the share).  SMB Client, session reconnect, share reconnect, witness registration successful.
2 seconds later on SOFSA I have a re-open failed error, the file is temporarily unavailable (event ID 1016).  I can see the source/destination/share that matches with what is occurring.  The error just continues every 5 seconds.
If I go and try to 'inspect' the drive from HyperV it times out and on SOFSA I get a warning (event ID 30805) client lost its session - Error {Network Name Not Found} - The specified share name cannot be found, share name \SOFSClusterName\$IPC
From here the errors just repeat: client established session to server, then lost session to server (network name not found, server \SOFSClusterName), with the same session ID in each connect/disconnect pair

Now the great part - 
If I go into failover cluster (FOC) and I try to move the CSV to the other node, the CSV gets stuck in pending offline.  After a few minutes any other CSVs owned by the same node go into pending offline and hang.  I can reboot and wait 10 minutes for it to finally die and fail over, or wait 20 for FOC to completely die on both nodes of the cluster.  In the cluster logs, the SOFS node never fully releases the CSV to move.  The last messages you will see related to the volume are:
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 4 to 2.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 2. Reson 7; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 2 to 1.

Normally you see :
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 4 to 2.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 2. Reson 7; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 2 to 1.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 1. Reson 5; Status 0x0.
Volume4; Volume target path \??\GLOBALROOT\Device\Harddisk39\ClusterPartition1; File System target path \??\GLOBALROOT\Device\Harddisk39\ClusterPartition1.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 1 to SetDownlevel. Local true; Flags 0x1; CountersName
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 3. Reson 3; Status 0x0.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} transitioning from 3 to 4.
Volume {c7cdc2d5-e1f9-40c5-b36d-43523e2996f1} moved to state 4. Reson 4; Status 0x0.
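
To make the difference between the two sequences explicit, here is a minimal Python sketch (not part of the original post) that reduces the log lines quoted above to state steps and reports the first step of the normal sequence that the stuck sequence never reaches. The names steps and first_missing_step are assumptions made for illustration; the log lines are copied as posted.

```python
import re

GUID = "{c7cdc2d5-e1f9-40c5-b36d-43523e2996f1}"

# Log lines from the stuck move, as quoted in this post.
STUCK = f"""Volume {GUID} transitioning from 4 to 2.
Volume {GUID} moved to state 2. Reson 7; Status 0x0.
Volume {GUID} transitioning from 2 to 1."""

# Log lines from a normal move, as quoted in this post.
NORMAL = f"""Volume {GUID} transitioning from 4 to 2.
Volume {GUID} moved to state 2. Reson 7; Status 0x0.
Volume {GUID} transitioning from 2 to 1.
Volume {GUID} moved to state 1. Reson 5; Status 0x0.
Volume4; Volume target path \\??\\GLOBALROOT\\Device\\Harddisk39\\ClusterPartition1; File System target path \\??\\GLOBALROOT\\Device\\Harddisk39\\ClusterPartition1.
Volume {GUID} transitioning from 1 to SetDownlevel. Local true; Flags 0x1; CountersName
Volume {GUID} moved to state 3. Reson 3; Status 0x0.
Volume {GUID} transitioning from 3 to 4.
Volume {GUID} moved to state 4. Reson 4; Status 0x0."""

STEP = re.compile(r"transitioning from (\S+) to (\S+?)\.|moved to state (\d+)")


def steps(log_text):
    """Reduce log lines to 'transition A->B' / 'reached N' steps, skipping other lines."""
    result = []
    for line in log_text.splitlines():
        match = STEP.search(line)
        if match is None:
            continue
        if match.group(1) is not None:
            result.append(f"transition {match.group(1)}->{match.group(2)}")
        else:
            result.append(f"reached {match.group(3)}")
    return result


def first_missing_step(stuck, normal):
    """Return the first step of the normal sequence that the stuck sequence lacks."""
    for index, expected in enumerate(normal):
        if index >= len(stuck) or stuck[index] != expected:
            return expected
    return None


print(first_missing_step(steps(STUCK), steps(NORMAL)))
```

On the lines quoted here, the stuck move starts the transition from 2 to 1 but never logs that state 1 was reached.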

The issue is consistent across all 4 CSVs I have.  I believe the issue has always existed.  If I get the HyperV hosts lined up right to initially hit the SOFS server that owns the CSV, everything boots up fine.  When they don't, the VMs and FOC hang, I have to go through reboots, the VMs lose their drives, and I have to reboot those as well. It is only when a host gets redirected to a different SOFS server that the issue comes up, which leads me to the next problem.

Problem2: 
Assume all the VMs connected to the right SOFS CSV owner on boot and everything has been running fine for days/weeks/months (yes, this has been sitting around for a while as an unresolved problem).  If I try to move a CSV for SOFS maintenance purposes, the CSV hangs in offline pending.  Eventually the FOC hangs and I have to spend 2 hours getting things lined up right (after I do whatever I was planning on doing) so the VMs boot.

Things done/verified
Windows firewall is off
I've turned off IPv6
Removed Teaming from all nodes using SOFS0/1 network and cluster0/1 (used to be windows team vs individual networks)
Turned off client/network access from SOFS1 network
Turned off CSV balancer - in hindsight, it doesn't work without it due to redirection of CSVs due to asymmetric storage
updated permissions for SOFS share to include HyperV host, SOFS cluster nodes - didn't make any difference/never see access denied errors

One item I see that I don't understand: on the SOFS cluster nodes, in the SMBClient/Connectivity logs, I see that the network connection failed to the cluster addresses:

The network connection failed.
Error: {Device Timeout}
The specified I/O operation on %hs was not completed before the time-out period expired.
Server name: fe80::98f9:c138:xxxxx%32
Server address: x.x.x.x:445
Connection type: Wsk
Guidance:
This indicates a problem with the underlying network or transport, such as with TCP/IP, and not with SMB. A firewall that blocks port 445 or 5445 can also cause this issue.

The server name is the 'Tunnel adapter Local Area Connection* 12:' on the other SOFS cluster node.  So SOFSA is generating errors connecting to SOFSB, and SOFSB is generating errors connecting to SOFSA.   This was occurring before and after the cluster0/1 network interfaces were teamed



Thanks-







