Hi Mates,
Today we are going to continue from part 1.
In the previous post we explored some scenarios with and without quorum. However, the solution I provided in Case 4 is not practical in real life, because we can't wait for the situation to return to normal.
As DBAs we want to ensure business continuity, so here I will give you a solution for exactly that.
Frankly speaking, I have seen people hit the panic button a number of times. I mean, instead of fixing the problem they either start looking for the RCA or sometimes simply run out of ideas.
From now on, if you are in the same situation as in Case 4 (only one node is up and the cluster service is down on that node), just follow the steps below.
Step 1: Ensure the FailoverClusters module is available. You can check this by running the command Get-Module -ListAvailable. If the module is listed but not yet loaded in your session, import it by running Import-Module FailoverClusters
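Step 1 can be sketched as a small check. This is only an illustration: it assumes you are in an elevated PowerShell session on the surviving node, and the warning text is my own wording.

```powershell
# Import the FailoverClusters module only if it is installed on this machine.
if (Get-Module -ListAvailable -Name FailoverClusters) {
    Import-Module FailoverClusters
    # Confirm that the cmdlets used in the next steps are now visible.
    Get-Command -Module FailoverClusters -Name Start-ClusterNode, Get-ClusterNode
} else {
    # The module ships with the Failover Clustering tools; it cannot be imported if they are missing.
    Write-Warning "FailoverClusters module not found on this machine."
}
```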
Step 2: Once the cmdlets are available, run the command below:
Start-ClusterNode -Name "NODE10" -FixQuorum
The Start-ClusterNode PowerShell cmdlet starts the Cluster Service on the named node (here NODE10, the surviving node). The -FixQuorum parameter forces the cluster node to start even though quorum has not been achieved. In this case quorum cannot be achieved because you have only 1 of the 3 possible votes in the cluster.
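To see the effect of the forced start, you can look at the service before and the node state after. A rough sketch, assuming the node name NODE10 from this post and that the commands are run locally on that node (ClusSvc is the Windows service name of the Cluster Service):

```powershell
# Check the Cluster Service on the local node; it is expected to be Stopped at this point.
Get-Service -Name ClusSvc

# Force the node to start without quorum (same command as in Step 2).
Start-ClusterNode -Name "NODE10" -FixQuorum

# The node should now report a State of Up.
Get-ClusterNode -Name "NODE10"
```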
Step 3: Set the NodeWeight property of the cluster node to guarantee that it is a voting member of the quorum. Once the WSFC has been brought online, retrieve the node with the Get-ClusterNode PowerShell cmdlet and set its NodeWeight property to 1:
(Get-ClusterNode -Name "NODE10").NodeWeight = 1
Step 4: We can check the node weight as shown in the screenshot above. Lastly, whether the AG is running in synchronous or asynchronous commit mode, when the cluster is in forced quorum mode the command below is the only one that will bring the AG online. If we run a plain FAILOVER, we get an error.
ALTER AVAILABILITY GROUP SQLAGMUL FORCE_FAILOVER_ALLOW_DATA_LOSS;
These steps bring the AG back online. Once the node is back and your witness issue is fixed, everything should work as expected.
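If you want to confirm from PowerShell that the local replica is now primary, something like the following could work. It is a sketch under assumptions: Invoke-Sqlcmd must be available (it comes with the SqlServer or SQLPS module), and the instance name NODE10 is assumed to be the default instance on the surviving node.

```powershell
# Query the local replica's role and health for each availability group on this instance.
$query = @"
SELECT ag.name AS ag_name,
       ars.role_desc,
       ars.operational_state_desc,
       ars.synchronization_health_desc
FROM sys.dm_hadr_availability_replica_states AS ars
JOIN sys.availability_groups AS ag
    ON ars.group_id = ag.group_id
WHERE ars.is_local = 1;
"@

# Assumed instance name; replace it with your own server or server\instance.
Invoke-Sqlcmd -ServerInstance "NODE10" -Query $query
```

After the forced failover, role_desc for SQLAGMUL should read PRIMARY.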
Hope this is useful for you :) Thanks for reading.