Hi All,
In this post I would like to demonstrate how we achieved high availability and disaster recovery for Azure services (Function App, Web API, Logic App, etc.) deployed in a primary and a secondary region.
Traffic Manager -> Traffic Manager sits on top of DNS and acts as a router: it just passes the API call on to the API Gateway.
API Gateway -> It verifies the API endpoint and can require a specific host format to validate that external calls are legitimate. The gateway then forwards all calls to the public Azure API Management endpoint.
Azure API Management -> APIM is a multi-region Premium tier deployment with a primary and a secondary instance. It is one entity with two copies, one running in each region. Requests are sent to the closest APIM region; for example, if the request is routed from the primary region, the API Gateway forwards it to the public IP of the APIM instance hosted in the primary region. We always send traffic to one side of APIM.
We have an App Service Environment in each region, under VNET1 (primary) and VNET2 (secondary), so a copy of our Azure services (Function App, Web API, Logic App) runs in both the primary and the secondary App Service Environment.
On a regular day everything is routed to the primary region. If components in the primary region's App Service Environment are unavailable, the call is routed to the Azure services in the secondary region's App Service Environment. This is how the services are replicated across the primary and disaster recovery sides.
For the secondary side to be used, we implemented the APIM policy so that if the Azure services in the primary region's App Service Environment fail, the call is redirected to the Azure services in the secondary region's App Service Environment.
This gives us high availability and disaster recovery. Here is how to achieve it in an APIM policy:
<policies>
    <inbound>
        <base />
        <authentication-managed-identity resource="{{clientid}}" ignore-error="false" output-token-variable-name="" />
        <choose>
            <when condition="@(context.Deployment.Region.Equals("region name", System.StringComparison.InvariantCultureIgnoreCase))">
                <set-backend-service base-url="https://{{webapi-primaryhostname}}" />
                <cache-lookup-value key="is-primary-down-cache-key" default-value="@((bool)false)" variable-name="is-primary-down" />
                <choose>
                    <when condition="@((bool)context.Variables["is-primary-down"])">
                        <set-backend-service base-url="https://{{webapi-secondaryhostname}}" />
                    </when>
                </choose>
            </when>
            <when condition="@(context.Deployment.Region.Equals("region name", System.StringComparison.InvariantCultureIgnoreCase))">
                <set-backend-service base-url="https://{{webapi-secondaryhostname}}" />
                <cache-lookup-value key="is-secondary-down-cache-key" default-value="@((bool)false)" variable-name="is-secondary-down" />
                <choose>
                    <when condition="@((bool)context.Variables["is-secondary-down"])">
                        <set-backend-service base-url="https://{{webapi-primaryhostname}}" />
                    </when>
                </choose>
            </when>
        </choose>
    </inbound>
    <backend>
        <retry condition="@(context.Response.StatusCode == 404 || context.Response.StatusCode == 403 || context.Response.StatusCode == 503)" count="2" interval="1" first-fast-retry="true">
            <choose>
                <when condition="@(context.Response != null && (context.Response.StatusCode == 404 || context.Response.StatusCode == 403 || context.Response.StatusCode == 503) )">
                    <choose>
                        <when condition="@(context.Request.Url.Host.Equals("{{webapi-primaryhostname}}"))">
                            <set-backend-service base-url="https://{{secondaryhostname}}" />
                            <cache-store-value key="is-primary-down-cache-key" value="@(true)" duration="{{drroutinginterval}}" />
                        </when>
                        <when condition="@(context.Request.Url.Host.Equals("{{webapi-secondaryhostname}}"))">
                            <set-backend-service base-url="https://{{webapi-primaryhostname}}" />
                            <cache-store-value key="is-secondary-down-cache-key" value="@(true)" duration="{{drroutinginterval}}" />
                        </when>
                    </choose>
                </when>
            </choose>
            <forward-request />
            <choose>
                <when condition="@(context.Response != null && (context.Response.StatusCode == 200))">
                    <choose>
                        <when condition="@(context.Request.Url.Host.Equals("{{webapi-primaryhostname}}"))">
                            <cache-remove-value key="is-primary-down-cache-key" />
                        </when>
                        <when condition="@(context.Request.Url.Host.Equals("{{webapi-secondaryhostname}}"))">
                            <cache-remove-value key="is-secondary-down-cache-key" />
                        </when>
                    </choose>
                </when>
            </choose>
        </retry>
    </backend>
    <outbound>
        <base />
    </outbound>
</policies>
Here, if a call to the primary host (<set-backend-service base-url="https://{{webapi-primaryhostname}}" />) fails with HTTP status code 404, 403 or 503, the retry condition is met: the policy switches to the secondary host (<set-backend-service base-url="https://{{secondaryhostname}}" />) and stores a flag in the cache under the key is-primary-down-cache-key. Every subsequent API call looks up that flag in the inbound section and, if it is set, goes straight to the secondary host. The cache holds the flag for the number of seconds given in the named value {{drroutinginterval}}.
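To make the flow easier to follow, here is a small Python model of the primary-region branch of the policy. It is only a sketch and not APIM code: the class FailoverRouter, the send callback and the injectable clock are hypothetical names made up for illustration, and it does a single failover attempt instead of the policy's count="2" retries.

```python
import time

# Same status codes the policy's retry condition treats as a failure.
FAILURE_CODES = {403, 404, 503}


class FailoverRouter:
    """Toy model of the primary-region branch of the policy (hypothetical helper)."""

    def __init__(self, primary, secondary, dr_routing_interval, clock=time.monotonic):
        self.primary = primary
        self.secondary = secondary
        self.ttl = dr_routing_interval    # stands in for {{drroutinginterval}}
        self.clock = clock
        self.primary_down_until = None    # stands in for is-primary-down-cache-key

    def _primary_flagged_down(self):
        # The flag counts only while its cache duration has not expired.
        return self.primary_down_until is not None and self.clock() < self.primary_down_until

    def handle(self, send):
        """send(host) returns an HTTP status code; returns (host_used, status)."""
        # Inbound: pick the backend based on the cached flag.
        host = self.secondary if self._primary_flagged_down() else self.primary
        status = send(host)
        if status in FAILURE_CODES and host == self.primary:
            # Backend: flag the primary as down, then retry against the secondary.
            self.primary_down_until = self.clock() + self.ttl
            host = self.secondary
            status = send(host)
        elif status == 200 and host == self.primary:
            # A healthy primary clears the flag (cache-remove-value).
            self.primary_down_until = None
        return host, status
```

While the flag is set, the primary host is not called at all, so a recovered primary is only picked up again once the flag expires. That makes {{drroutinginterval}} the knob that trades faster fail-back against extra failed calls to a primary that is still down.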
Thanks.