04 May Automating DNS Zone Failover
Intro
In cloud and hybrid environments, DNS failover automation is important for maintaining high availability. By automatically switching DNS records to healthy endpoints during an outage, you can minimize downtime and ensure continuous access to applications. Azure Private DNS zones provide internal name resolution for resources within virtual networks, but by design their records aren’t reachable from the public internet. This isolation means that if an Azure environment becomes unreachable (e.g. due to a region outage or network failure), you cannot resolve critical domain names via the Private DNS zone. Automating DNS failover addresses this gap by detecting failures and dynamically updating DNS records (in Azure or an alternate DNS provider) so that clients are redirected to backup systems. In essence, a DNS failover system continuously monitors service health and, upon detecting an issue, automatically modifies DNS entries to point to a secondary location. This ensures that whether you’re dealing with cloud services or on-premises resources, name resolution will seamlessly redirect to a recovery environment, helping maintain application uptime.
DNS failover scenarios
In hybrid scenarios, Azure Private DNS often works in tandem with on-premises DNS. For example, an on-premises environment might use a conditional forwarder to direct queries for an Azure Private DNS zone (like cloud.azrtestbench.com) to an Azure DNS Private Resolver in the cloud. If the site-to-site VPN or ExpressRoute link goes down, those DNS queries will fail because the Azure Private DNS zone can’t be reached. This is one of several failure scenarios where automated DNS failover is essential.

Contingency planning
To achieve seamless DNS failover, you need a solid contingency plan that covers both DNS configuration and monitoring. The diagram below illustrates the general idea: an active primary environment and a passive secondary environment. A DNS-based failover mechanism (such as Azure Traffic Manager for public DNS) constantly monitors the primary’s health and automatically directs clients to the secondary when the primary degrades. We can implement a similar concept using Azure Private DNS and an alternate provider through custom automation.

PowerShell
Automating DNS failover involves three main tasks: health checks, DNS record updates and decision logic. PowerShell is a convenient tool for this in Azure environments, thanks to the Azure PowerShell modules. Below is an example approach using PowerShell scripts:
Define endpoints and variables
Begin by specifying the primary and secondary IP addresses (or endpoints) for the service, the DNS record to update and any other parameters. You’ll also need the resource group and Private DNS zone name for Azure. For example:
$resourceGroup = "MyDNSResourceGroup"
$zoneName = "contoso.internal"
$recordName = "app"
$primaryIP = "10.1.2.5" # Primary private endpoint IP
$secondaryIP = "52.4.6.8" # Secondary public/alternate endpoint IP
$recordType = "A"
$ttl = 30 # desired TTL for the record (in seconds)
In a real script, these values could come from a config file or arguments.
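As a minimal sketch of the config-file option, the function below loads the same settings from a JSON file. The function name Get-FailoverConfig, the file layout and the key names are assumptions made for illustration; they simply mirror the variables defined above.

```powershell
# Hypothetical helper: loads failover settings from a JSON file whose keys
# mirror the variables above (resourceGroup, zoneName, recordName, ...).
function Get-FailoverConfig {
    param([Parameter(Mandatory)][string]$Path)

    # Read the whole file as one string and parse it as JSON.
    $config = Get-Content -Path $Path -Raw -ErrorAction Stop | ConvertFrom-Json

    # Fail early if a required setting is missing or empty.
    $required = 'resourceGroup', 'zoneName', 'recordName', 'primaryIP', 'secondaryIP'
    foreach ($key in $required) {
        $prop = $config.PSObject.Properties[$key]
        if (-not $prop -or -not $prop.Value) {
            throw "Missing required setting '$key' in $Path"
        }
    }

    # Apply defaults for the optional settings (assumed defaults: A record, 30 s TTL).
    $defaults = @{ recordType = 'A'; ttl = 30 }
    foreach ($key in $defaults.Keys) {
        $prop = $config.PSObject.Properties[$key]
        if (-not $prop -or -not $prop.Value) {
            $config | Add-Member -NotePropertyName $key -NotePropertyValue $defaults[$key] -Force
        }
    }
    return $config
}
```

Validating the file before any health check runs means a typo in the configuration surfaces immediately rather than in the middle of an outage.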
Health check the primary service
Use a cmdlet like Test-Connection or a web request to determine if the primary endpoint is healthy. For instance, if a ping or TCP connection to the primary IP fails, we consider the service down. Alternatively, use an HTTP probe if the service is a web application (e.g., Invoke-WebRequest -Uri http://app.contoso.com/health). Example with ping:
$primaryAlive = Test-Connection -ComputerName $primaryIP -Count 2 -Quiet
if ($primaryAlive) {
    Write-Host "Primary endpoint is reachable."
} else {
    Write-Warning "Primary endpoint is down. Initiating DNS failover..."
}
In this snippet, -Quiet returns a simple Boolean indicating success or failure of the ping.
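For a web application, the HTTP probe mentioned above could look roughly like the following. This is a sketch, not a hardened probe: the function name Test-HttpHealth and the 5-second default timeout are assumptions, and it treats any 2xx response as healthy.

```powershell
# Hypothetical helper: returns $true only when the endpoint answers with a 2xx status.
function Test-HttpHealth {
    param(
        [Parameter(Mandatory)][string]$Uri,
        [int]$TimeoutSec = 5   # assumed default; tune to the service's normal latency
    )
    try {
        $response = Invoke-WebRequest -Uri $Uri -UseBasicParsing `
            -TimeoutSec $TimeoutSec -ErrorAction Stop
        return ($response.StatusCode -ge 200 -and $response.StatusCode -lt 300)
    } catch {
        # Timeouts, DNS failures, refused connections and HTTP error
        # statuses all raise an exception and are treated as unhealthy.
        return $false
    }
}

# Usage (not executed here):
# $primaryAlive = Test-HttpHealth -Uri 'http://app.contoso.com/health'
```

The result can be assigned to $primaryAlive in place of the Test-Connection call, so the rest of the script stays unchanged. An HTTP probe catches cases where the host still answers pings but the application itself has stopped responding.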
Failover Logic – update DNS record
If the health check indicates failure, the script will update the DNS record in the Private DNS zone to point to the secondary IP. This uses Azure PowerShell cmdlets to modify the DNS record set. A straightforward way is to retrieve the existing record set, modify it, and apply the update:
if (-not $primaryAlive) {
    # Get the current DNS record set
    $recordSet = Get-AzPrivateDnsRecordSet -ResourceGroupName $resourceGroup `
        -ZoneName $zoneName -Name $recordName -RecordType $recordType
    # Remove any existing A records and add the secondary IP
    $recordSet.Records.Clear() # Clear current IPs
    # Add-AzPrivateDnsRecordConfig only changes the local object; discard its output
    Add-AzPrivateDnsRecordConfig -RecordSet $recordSet -Ipv4Address $secondaryIP | Out-Null
    # Update TTL if needed
    $recordSet.Ttl = $ttl
    # Push the changes to Azure
    Set-AzPrivateDnsRecordSet -RecordSet $recordSet -Overwrite
    Write-Host "DNS record updated to secondary IP $secondaryIP"
}
The above snippet uses Get-AzPrivateDnsRecordSet to fetch the record set (e.g., the A record for “app.contoso.internal”), clears out the old IP, adds the new one and finally applies the update with Set-AzPrivateDnsRecordSet. (In Azure PowerShell, the Private DNS cmdlets let you modify a record set object locally and then commit it.) The script also sets the TTL to a low value so that resolvers discard the cached old address quickly. The -Overwrite flag skips the Etag concurrency check, so the update goes through even if the record has changed since we fetched it.
Note: In production, include error handling (try/catch) around these commands and possibly send notifications when a failover occurs.
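One way to follow that advice is to route the update through a small wrapper that catches failures and reports the outcome. The sketch below is an assumption-laden illustration: Invoke-GuardedDnsUpdate is a hypothetical name, and the default notifier only writes a warning, where a real deployment might post to email or a chat webhook instead.

```powershell
# Hypothetical wrapper: runs a DNS update script block, reports the outcome
# through a notifier script block, and returns $true/$false for success.
function Invoke-GuardedDnsUpdate {
    param(
        [Parameter(Mandatory)][scriptblock]$Update,
        [scriptblock]$Notify = { param($Message) Write-Warning $Message }
    )
    # Turn non-terminating cmdlet errors into exceptions inside this call
    # (adding -ErrorAction Stop on the individual cmdlets is still sensible).
    $ErrorActionPreference = 'Stop'

    $ok = $false
    try {
        & $Update | Out-Null
        $ok = $true
        $message = 'DNS failover update succeeded.'
    } catch {
        $message = "DNS failover update FAILED: $($_.Exception.Message)"
    }
    & $Notify $message | Out-Null
    return $ok
}

# Usage (not executed here), wrapping the update step shown earlier:
# Invoke-GuardedDnsUpdate -Update {
#     Set-AzPrivateDnsRecordSet -RecordSet $recordSet -Overwrite -ErrorAction Stop
# }
```

Returning a Boolean lets the calling script decide whether to retry, escalate, or simply wait for the next scheduled run.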
(Optional) Update alternate provider
If you are also maintaining a secondary DNS provider (for example, a public Azure DNS zone or a third-party DNS), you would include API calls or cmdlets for that provider here. For Azure public DNS, you could use similar Azure PowerShell commands (Get-AzDnsRecordSet / Set-AzDnsRecordSet from the Az.Dns module) to update the public DNS zone. For third-party services, you might call their REST API or use CLI tools.
The principle is the same: switch the A record (or alias) to the backup endpoint.
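For a public Azure DNS zone, the same pattern can be sketched with the Az.Dns cmdlets. The function name Switch-PublicARecord, the zone name and the IP address in the usage comment are placeholders chosen for illustration, not values from a real environment.

```powershell
# Hypothetical helper: points an A record in a PUBLIC Azure DNS zone at a
# single target IP, using the Az.Dns cmdlets.
function Switch-PublicARecord {
    param(
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$ZoneName,
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][string]$TargetIp,
        [int]$Ttl = 30   # assumed default, matching the low TTL used earlier
    )
    # Fetch the existing A record set from the public zone.
    $recordSet = Get-AzDnsRecordSet -ResourceGroupName $ResourceGroupName `
        -ZoneName $ZoneName -Name $Name -RecordType A -ErrorAction Stop

    # Replace all current addresses with the target address (local change only).
    $recordSet.Records.Clear()
    Add-AzDnsRecordConfig -RecordSet $recordSet -Ipv4Address $TargetIp | Out-Null
    $recordSet.Ttl = $Ttl

    # Commit the change and return the updated record set.
    Set-AzDnsRecordSet -RecordSet $recordSet -Overwrite -ErrorAction Stop
}

# Usage (not executed here; zone and IP are placeholders):
# Switch-PublicARecord -ResourceGroupName 'MyDNSResourceGroup' `
#     -ZoneName 'contoso.com' -Name 'app' -TargetIp '52.4.6.8'
```

Keeping the private and public updates in separate functions makes it easier to fail over one zone even when the other provider is unreachable.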
Failback monitoring
The script can loop (perhaps running on a schedule via Azure Automation or a Windows Scheduled Task). After failing over, it should keep checking the primary. Once the primary service is back up, the script can reverse the process: update the DNS record back to the primary IP. Be careful to only fail back when the primary is fully confirmed healthy to avoid flapping. You might implement a threshold (e.g. require 5 consecutive successful checks before failing back).
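The threshold idea can be sketched as a small state function: fail over on the first failed check, but fail back only after a run of consecutive successes. The function name Get-NextDnsTarget and the shape of the state hashtable are assumptions; if the script runs on a schedule rather than in a loop, the state would also need to be persisted between runs (for example in a file).

```powershell
# Hypothetical helper: decides which endpoint DNS should point to.
# $State is a hashtable with keys Active ('primary' or 'secondary') and
# Successes (consecutive healthy checks seen while failed over).
function Get-NextDnsTarget {
    param(
        [Parameter(Mandatory)][hashtable]$State,
        [Parameter(Mandatory)][bool]$PrimaryHealthy,
        [int]$FailbackThreshold = 5
    )
    if (-not $PrimaryHealthy) {
        # Any failed check resets the streak and keeps (or moves) DNS on the secondary.
        $State.Successes = 0
        $State.Active = 'secondary'
    } elseif ($State.Active -eq 'secondary') {
        # Primary looks healthy again: count the streak before failing back.
        $State.Successes++
        if ($State.Successes -ge $FailbackThreshold) {
            $State.Active = 'primary'
            $State.Successes = 0
        }
    }
    return $State.Active
}
```

The script would update the DNS record only when the returned target differs from the one currently published, which avoids rewriting the record on every check and keeps a briefly recovering primary from causing flapping.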
Wrapping it all up
Automating DNS failover between Azure Private DNS zones and alternate providers helps your internal and hybrid applications remain available even when disasters strike. We began by recognizing the importance of DNS failover and how Azure Private DNS, while powerful for internal name resolution, requires thoughtful planning to integrate into a failover strategy. Examining common failure scenarios highlighted why a dual-provider DNS setup (or backup DNS zone) can prevent costly downtime.
Planning is the backbone of successful DNS failover: adjust TTLs to balance rapid failover with stability, set up rigorous health monitoring and choose a secondary DNS solution that will be reachable when the primary is not. Keep DNS records in sync across primary and secondary providers, and rehearse the failover process through drills.
With a plan in place, we can use automation tools to do the heavy lifting. PowerShell scripts can interface directly with Azure Private DNS to update records on the fly, triggered by health check logic. We showed how to use Azure PowerShell cmdlets to detect a failure and modify a Private DNS record set with a new IP address in real time. For those preferring Azure CLI or working in Linux environments, a Bash script with az network private-dns commands achieves the same outcome, removing the failed endpoint’s DNS record and adding a new one that directs clients to the backup service. Both approaches rely on Azure’s APIs to apply changes quickly, and they can be extended to update other DNS providers’ records as well.