Health Check Endpoint Setup for Azure Apps

A VM running at 5% CPU is not a healthy application. A container that started successfully is not a healthy service. Health checks are the difference between Azure knowing whether your infrastructure is alive and Azure knowing whether your application is actually working. Without them, the load balancer routes traffic to broken instances, autoscaling replaces the wrong things, and on-call gets paged about symptoms because the root cause was undetectable.

This guide covers implementing health check endpoints in Azure App Service and Azure Functions: what to check, how to configure the integration, and the practices that separate a useful health check from one that fires false alarms.

What health checks actually do

Azure App Service's health check feature pings a path you define on every instance at one-minute intervals. If an instance returns a status code outside the 200-299 range for a configurable number of consecutive pings (between 2 and 10, default 10), App Service marks it unhealthy and removes it from the load balancer rotation. With multiple instances, traffic shifts to the healthy ones automatically. A single instance cannot be taken out of rotation, so App Service replaces it after one hour of continuous failure.

The implication: a health check endpoint that returns 200 when the application is broken is worse than no health check. It gives Azure a false signal and keeps broken instances in rotation. The endpoint needs to reflect the actual health of the service, which means checking the things users depend on, not just whether the process is running.

Design the endpoint first

Plan what the health check needs to verify before writing any code. The right checks depend on your application, but the framework is consistent:

Liveness vs readiness. A liveness check answers: is the application process alive and capable of handling requests? A readiness check answers: are all the dependencies this application needs currently available? These are different questions. A liveness check failing should trigger instance replacement. A readiness check failing might just mean a downstream dependency is temporarily unavailable, which should pull the instance from rotation but not destroy it. Kubernetes popularised this separation; Azure App Service does not enforce it natively, but designing separate paths (/health/live and /health/ready) gives you the flexibility to use them differently in your monitoring.

Critical dependencies only. Check database connectivity, the message queue the application reads from, and any external APIs that would prevent the service from functioning if unavailable. Do not check nice-to-have integrations that do not break the core user journey. Every dependency you add to a health check is another thing that can cause a false positive.

Keep it fast. Azure's default timeout is one minute, but your endpoint should return in under 10 seconds. Use simple connectivity checks: a SELECT 1 against the database rather than a complex query, a ping to a queue endpoint rather than a full message round-trip. Use short timeouts on each dependency check (3-5 seconds) and return unhealthy immediately if any critical dependency times out rather than waiting for all of them. Cache results for 30-60 seconds on expensive checks to reduce load.
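The timeout and caching advice can be sketched as a wrapper around any dependency probe. This is a minimal illustration, not a library API: the class name CachedTimeoutHealthCheck and its constructor parameters are hypothetical, and it assumes .NET 6 or later (for Task.WaitAsync) plus the Microsoft.Extensions.Diagnostics.HealthChecks package.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical wrapper: applies a short timeout to a probe and caches the
// last result so frequent polls do not hammer the dependency.
public sealed class CachedTimeoutHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task> _probe; // e.g. runs SELECT 1
    private readonly TimeSpan _timeout;
    private readonly TimeSpan _cacheFor;
    private readonly object _gate = new();
    private HealthCheckResult _last;
    private DateTimeOffset _lastAt = DateTimeOffset.MinValue;

    public CachedTimeoutHealthCheck(
        Func<CancellationToken, Task> probe, TimeSpan timeout, TimeSpan cacheFor)
    {
        _probe = probe;
        _timeout = timeout;
        _cacheFor = cacheFor;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        // Serve the cached result while it is still fresh.
        lock (_gate)
        {
            if (DateTimeOffset.UtcNow - _lastAt < _cacheFor) return _last;
        }

        // Cancel the probe after the timeout, and stop waiting even if the
        // probe ignores the token.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(_timeout);

        HealthCheckResult result;
        try
        {
            await _probe(cts.Token).WaitAsync(_timeout, cancellationToken);
            result = HealthCheckResult.Healthy();
        }
        catch (Exception ex)
        {
            result = HealthCheckResult.Unhealthy("probe failed or timed out", ex);
        }

        lock (_gate)
        {
            _last = result;
            _lastAt = DateTimeOffset.UtcNow;
        }
        return result;
    }
}
```

Note that this sketch caches unhealthy results as well, so recovery is reported up to one cache period late; use a shorter cache period if that delay matters.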

Configure health checks in App Service

In the Azure Portal, navigate to your App Service resource. Under Monitoring in the left menu, select Health check and toggle it on.

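Set the Path to the endpoint your application exposes (e.g., /health or /api/health). App Service pings this path on every instance at one-minute intervals. What you tune is the load balancing threshold: the number of consecutive failed pings (between 2 and 10, default 10) before an instance is marked unhealthy. A lower threshold detects failures faster but makes the check more sensitive to transient blips.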

Enable Load balancing behaviour. This is the setting that actually removes unhealthy instances from rotation. Without it, the health check monitors but does not act.

Automated recovery needs the Basic tier or above with at least two instances. Free and Shared plans cannot scale out, so unhealthy instances are never replaced; health checks on those tiers provide monitoring only.

If your health check endpoint requires authentication, you need to exempt it. A health check that gets blocked by your authentication layer will appear perpetually unhealthy. Add the health check path to your authentication exclusion list, and restrict its visibility through IP allowlisting to Azure's monitoring infrastructure and your own tooling rather than leaving it open on the public internet.
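For an ASP.NET Core app that handles authentication in its own middleware, the exemption might look like the sketch below. Cookie authentication and the "/" route are stand-ins chosen only to keep the example self-contained; substitute whatever scheme and routes your application actually uses.

```csharp
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Assumption: cookie authentication stands in for your real scheme.
builder.Services
    .AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme)
    .AddCookie();
builder.Services.AddAuthorization();
builder.Services.AddHealthChecks();

var app = builder.Build();
app.UseAuthentication();
app.UseAuthorization();

// Application routes require an authenticated user...
app.MapGet("/", () => "hello").RequireAuthorization();

// ...but the health check path is exempt, so the platform's pings are not
// rejected or redirected to a login page.
app.MapHealthChecks("/health").AllowAnonymous();

app.Run();
```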

Implement the endpoint in ASP.NET Core

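ASP.NET Core has health check middleware built in. The SQL Server and URL checks used here are not part of the framework; they come from the community AspNetCore.HealthChecks.SqlServer and AspNetCore.HealthChecks.Uris packages. Add the services in Program.cs:

// connectionString is assumed to have been read from configuration earlier in Program.cs
builder.Services.AddHealthChecks()
    .AddSqlServer(connectionString, name: "database")
    .AddUrlGroup(new Uri("https://api.dependency.com/health"), name: "external-api");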

Map the endpoint:

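// Full check: runs every registered check and writes a JSON body with per-dependency status.
// UIResponseWriter comes from the AspNetCore.HealthChecks.UI.Client package.
app.MapHealthChecks("/health", new HealthCheckOptions
{
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});

// Liveness: the predicate excludes every registered check, so this returns 200 whenever the process is up.
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
});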

The liveness endpoint (/health/live) returns 200 if the application started correctly, with no dependency checks. The full health endpoint checks everything. Map them to different paths and configure Azure to poll whichever matches your recovery intention.
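If you want a dedicated readiness path alongside liveness, tags are one way to select which checks each path runs. The sketch below is illustrative: the "ready" tag name is an arbitrary choice, and the inline check is a placeholder standing in for a real dependency probe.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    // Placeholder check (hypothetical); a real app would register its
    // database or queue check here and give it the same tag.
    .AddCheck("database", () => HealthCheckResult.Healthy(), tags: new[] { "ready" });

var app = builder.Build();

// Liveness: no checks run, so the response is 200 whenever the process is up.
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
});

// Readiness: only checks tagged "ready" run.
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready")
});

app.Run();
```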

For Entity Framework Core, using the Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore package:

builder.Services.AddHealthChecks()
    .AddDbContextCheck<ApplicationDbContext>();

For Azure Blob Storage, Service Bus, or other Azure services, the AspNetCore.Diagnostics.HealthChecks community library has pre-built checks that handle connection validation cleanly.

Implement the endpoint in Azure Functions

Azure Functions does not get the same built-in health check middleware. Create an HTTP-triggered function for this purpose:

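using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Data.SqlClient; // assumption: Microsoft.Data.SqlClient is the SQL client in use
using Microsoft.Extensions.Logging;

// In-process model function. With the default route prefix it is served at /api/health.
public static class HealthCheckFunction
{
    [FunctionName("HealthCheck")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "health")] HttpRequest req,
        ILogger log)
    {
        var checks = new Dictionary<string, string>();
        var healthy = true;

        // Check database, giving up after 5 seconds so a slow server cannot stall the response
        try
        {
            using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
            using var conn = new SqlConnection(Environment.GetEnvironmentVariable("SqlConnection"));
            await conn.OpenAsync(cts.Token);
            checks["database"] = "healthy";
        }
        catch (Exception ex)
        {
            // The endpoint is anonymous: log the detail, keep exception text out of the response body
            log.LogError(ex, "Database health check failed");
            checks["database"] = "unhealthy";
            healthy = false;
        }

        var statusCode = healthy ? StatusCodes.Status200OK : StatusCodes.Status503ServiceUnavailable;
        return new ObjectResult(new { status = healthy ? "healthy" : "unhealthy", checks })
        {
            StatusCode = statusCode
        };
    }
}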

Keep the function lightweight. Avoid long-running operations that could time out under load. Set an explicit timeout on each dependency check so that a single slow dependency cannot hold up the entire health response.
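One way to keep a slow dependency from delaying the whole response is to run the checks in parallel, each under its own timeout. The helper below is a sketch under assumptions: DependencyChecks and RunAllAsync are hypothetical names, each probe is any delegate that throws on failure, and Task.WaitAsync requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DependencyChecks
{
    // Runs every probe concurrently. A probe that throws or exceeds the
    // timeout is reported as unhealthy; total time is bounded by the timeout.
    public static async Task<Dictionary<string, bool>> RunAllAsync(
        IReadOnlyDictionary<string, Func<CancellationToken, Task>> probes,
        TimeSpan perCheckTimeout)
    {
        var tasks = probes.Select(async p =>
        {
            using var cts = new CancellationTokenSource(perCheckTimeout);
            try
            {
                // WaitAsync stops waiting even if the probe ignores the token.
                await p.Value(cts.Token).WaitAsync(perCheckTimeout);
                return (Name: p.Key, Healthy: true);
            }
            catch (Exception)
            {
                return (Name: p.Key, Healthy: false);
            }
        });

        var results = await Task.WhenAll(tasks);
        return results.ToDictionary(r => r.Name, r => r.Healthy);
    }
}
```

The function body can then return 503 if any value in the resulting dictionary is false.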

Functions on a Consumption plan do not support health check integration with the App Service load balancer. Use a Premium or App Service plan for functions where automated instance replacement is needed.

Alert correctly

Configuring the health check is half the work. The other half is making sure it alerts in a way your team can act on.

The most common mistake: alerting on a single failure. Transient network blips, a momentary database connection reset, a deployment restart: all of these will trigger a single health check failure that is not a real incident. Set your alert threshold to three or more consecutive failures before triggering a notification.
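If you also poll the endpoint from your own tooling, the consecutive-failure rule is small enough to sketch directly. ConsecutiveFailureGate is a hypothetical helper for illustration, not part of any Azure SDK.

```csharp
// Tracks a streak of failed polls and signals once when it reaches the threshold.
public sealed class ConsecutiveFailureGate
{
    private readonly int _threshold;
    private int _streak;

    public ConsecutiveFailureGate(int threshold) => _threshold = threshold;

    // Returns true exactly once per failure streak, at the moment the streak
    // reaches the threshold. Any healthy poll resets the streak.
    public bool Record(bool healthy)
    {
        if (healthy)
        {
            _streak = 0;
            return false;
        }

        _streak++;
        return _streak == _threshold;
    }
}
```

With a threshold of 3, a single blip followed by a healthy poll never notifies, while three failures in a row notify once rather than on every subsequent poll.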

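Connect Azure Monitor to your health check endpoint status. Create a metric alert on HealthCheckStatus that fires when the average over a 5-minute window drops below the fully healthy value. Confirm in the Metrics view how the metric is scaled for your app before fixing the threshold (this guide assumes 1 means healthy, giving the condition average < 1). Route the alert to your on-call channel, not to email.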

Log every health check response: the status code, the response time, and the state of each dependency. In Log Analytics:

AppServiceHTTPLogs
| where CsUriStem == "/health"
| where ScStatus !between (200 .. 299)
| summarize count() by bin(TimeGenerated, 5m)

This query surfaces the pattern of failures, which is more useful than individual alerts for diagnosing intermittent issues.

Staging before production

Never deploy a health check configuration to production that has not been tested in staging for at least 24 hours. A poorly designed health check in production can trigger mass instance replacement, which is exactly the outage you were trying to prevent.

In staging, deliberately break the dependencies the health check monitors: stop the database, block the external API. Confirm the health check detects the failure within the expected window and that the instance is removed from rotation. Restore the dependency and confirm the instance returns to healthy. This is the test that validates your health check actually works, not just that it returns 200 when everything is fine.
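While you break and restore dependencies, it helps to watch the endpoint from outside and record when its status changes. The console sketch below does that; the default URL is a hypothetical placeholder, and the 5-second poll and 10-second timeout are arbitrary choices.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical default; pass your staging health URL as the first argument.
var url = args.Length > 0 ? args[0] : "https://example-staging.azurewebsites.net/health";

using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(10) };
int? last = null;

// Runs until interrupted (Ctrl+C).
while (true)
{
    int status;
    try
    {
        using var response = await http.GetAsync(url);
        status = (int)response.StatusCode;
    }
    catch (Exception)
    {
        // Timeout or connection failure: recorded as 0.
        status = 0;
    }

    // Print only transitions, so the log reads as a timeline of the test.
    if (status != last)
    {
        Console.WriteLine(
            $"{DateTimeOffset.UtcNow:O} status changed: {last?.ToString() ?? "none"} -> {status}");
        last = status;
    }

    await Task.Delay(TimeSpan.FromSeconds(5));
}
```

Requests sent this way pass through the load balancer, so with several instances the output reflects whichever instance answered each poll rather than one specific instance.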

Where Critical Cloud comes in

Health checks are one signal in a complete observability picture. Running them correctly, correlating their output with application traces, infrastructure metrics, and deployment events, and operating 24/7 so failures reach a human at 3am as well as 3pm, is what we do for technology-led and regulated businesses on Azure. As the world's first Powered by Datadog accredited partner, we connect health check status, APM traces, and infrastructure data into a single view, so a failing health check is the start of a diagnosis rather than just an alert.