Always ON Failover History

Hi Friends,

Today I am going to share an article which I think would be useful for many of us. Many times we need to find when the availability group fail over has happened. As you know we can find that information from Always ON health session extended events. The problem is we don't have history to track them and it is little difficult to gather that information across the estate.

In my organization we have many Availability groups being deployed hence I thought this can be handy.

The logic is very simple get the data from your central monitoring server using your favorite means (powershell or SSIS) across the estate and load them in to a table.

Below is the output



As you can see from the above image it contains the current replica and the corresponding Listener Name followed by the fail over date and the most important previous replica information. I added collection date as well which can be handy if you want to filter on respective dates.


Below is the code that I made use of to capture this information.


DECLARE @FileName NVARCHAR(4000)
SELECT @FileName = target_data.value('(EventFileTarget/File/@name)[1]', 'nvarchar(4000)')
    FROM (
           SELECT CAST(target_data AS XML) target_data
            FROM sys.dm_xe_sessions s
            JOIN sys.dm_xe_session_targets t
                ON s.address = t.event_session_address
            WHERE s.name = N'AlwaysOn_health'
         ) ft;

WITH    base
          AS (
               SELECT XEData.value('(event/@timestamp)[1]', 'datetime2(3)') AS event_timestamp
                   ,XEData.value('(event/data/text)[1]', 'VARCHAR(255)') AS previous_state
                   ,XEData.value('(event/data/text)[2]', 'VARCHAR(255)') AS current_state
                   ,ar.replica_server_name
   ,ar.group_id
                FROM (
                       SELECT CAST(event_data AS XML) XEData
                           ,*
                        FROM sys.fn_xe_file_target_read_file(@FileName, NULL, NULL, NULL)
                        WHERE object_name = 'availability_replica_state_change'
                     ) event_data
                JOIN sys.availability_replicas ar
                    ON ar.replica_id = XEData.value('(event/data/value)[5]', 'VARCHAR(255)')
             )
    SELECT DATEADD(HOUR, DATEDIFF(HOUR, GETUTCDATE(), GETDATE()), event_timestamp) AS event_timestamp
           
          
   ,convert(varchar,getdate(),20) as [current date],
   dns_name as [listner_name]
        FROM base  b inner join sys.availability_group_listeners sag 
on b.group_id=sag.group_id where current_state='PRIMARY_NORMAL'
        ORDER BY event_timestamp DESC;

The above query has been obtained from this LINK and I modified little bit.


once you get this information we can make use of LAG and Partition functions to obtain the data as per our needs. Something like below will help you


create table AlwaysON_servers
(
   servername varchar(20)
)

insert AlwaysON_servers values('XXXXXXX')
insert AlwaysON_servers values('YYYYYYY')

create table AlwaysON_servers_history
(
   hostname varchar(30),
   ListenerName varchar(30),
   restartdate datetime primary key,
   currentdate datetime
)

select hostname as current_replica,listenername,restartdate,currentdate,
lag(hostname) over(partition by listenername order by restartdate)as previous_replica from  AlwaysON_servers_history


In case if you need any further information please feel free to comment.

Comments