Hi Friends,
This is very simple post and explores the new setting of what we have on
SQL Server 2017. I would like to thank David Fowler as it is inspiration from
his blog
Okay, let us begin by understanding what Synchronous really means?
Hi Friends,
This is very simple post and explore the new setting of what we have on
SQL Server 2017.
Okay, let us begin by understanding what Synchronous really means?
Normally, if we have got a secondary replica in synchronous commit mode,
our primary cannot commit any transactions until that transaction has been
hardened to disk on all the synchronous secondaries.
When you commit a transaction on the primary, SQL Server will send all
changes over to its secondary nodes. It will then wait for confirmation from
the synchronous nodes that the changes have been hardened before it can commit
locally.
This ensures that those synchronous secondaries are always up to date
and will mean that in the event of a fail over, there is no risk of data loss.
Any delay in the secondary writing down those changes will mean a delay on
committing on the primary (you will see this in the form of HADR_SYNC_COMMIT
waits on the primary).
The primary replica is constantly monitoring the state of its
secondaries. With the use of continuous ping, the primary node always knows if
the secondaries are up or down.
It is when SQL detects that one of its synchronous replicas goes offline
is when interesting things can happen.
So, here is the discussion that came up, if a synchronous replica goes
offline for whatever reason, SQL won’t be able to commit any transactions and
that means we can be confident that the secondary is up to date, right?
Well, not quite. Let us try running the following code to simply update
a table with a timestamp every second.
We have 3 replicas running on Microsoft SQL Server 2017 (RTM-CU18) and
all of them are in synchronous Mode.
I started testing
create table synctest(sname datetime)
TRUNCATE TABLE SyncTest
WHILE 1=1
BEGIN
INSERT INTO synctest VALUES (GETDATE())
WAITFOR DELAY '00:00:01'
END
select * from synctest
After a short while I Stopped SQL Services on one of the secondary
replicas and here are the results
There were no records
for 10 seconds
Interesting, we can see that there was a delay when the synchronous
secondary went offline as we would expect. But then after a short while, it
looks like SQL Server decided to continue committing transactions. Is that
right?
Session-Timeout
And here is where the
misinterpretation came in. Availability groups have a
setting, session-timeout. This timeout controls how long a
primary replica will wait to receive a ping from a secondary (by default, this
is 10 seconds). Once that timeout period is exceeded, the primary will no
longer wait for that replica when committing a transaction, effectively
switching it to asynchronous commit.
This is the reason why, after 10 seconds of the secondary going down, we
start to see transactions being committed again.
So, My Question is how will we ensure that we can avoid this
situation?
SQL Server 2017 introduced REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT cluster
resource setting.
This setting guarantees the specified number of secondary replicas write
the transaction data to log before the primary replica commits each transaction
The default setting is Zero and I changed to one and carried out the
same exercise as above and still the results are same where we got 10 seconds
lag and the transactions continued post that.
Finally, after changing this setting to 2 as expected the transactions
didn't run as the primary can't continue because the transactions can't get
hardened on to secondary. Below is the error message
(1 row affected)
Remote harden of transaction 'INSERT' (ID 0x0000000001677d4d
0000:0000081c) started at Sep 14 2020 2:03PM in database 'india' at LSN
(36:2213:3) failed.
(1
row affected)
Msg
596, Level 21, State 1, Line 3
Cannot
continue the execution because the session is in the kill state.
Msg
0, Level 20, State 0, Line 3
A
severe error occurred on the current command. The results, if any, should
be discarded.
So Now think about this setting and implement it as per your needs.
Below is the link which explains about the setting
REQUIRED_SYNCHRONIZED_SECONDARIES_TO_COMMIT
Comments