Datadog Metrics Cost Optimisation: Control Cardinality and Custom Metrics

Custom metric spend in Datadog is driven by cardinality: the number of unique time series your instrumentation generates. Understanding cardinality is the single most important step in controlling metrics cost.

Why metrics can become expensive

Datadog custom metrics are billed by the number of distinct time series you generate, not by the number of metric names. A metric name on its own is only an identifier. A time series is a metric name combined with a specific set of tag values. Every unique combination of tag values creates a separate series, and it is the total count of those series that determines cost.

This means that a well-named, intentional metric with a high-cardinality tag can cost far more than dozens of coarser metrics without one. Most teams do not discover this until they receive a usage report and find that one or two metrics are responsible for the majority of their custom metric series count.

The pattern is consistent: a developer adds a useful-sounding tag (the user's ID, the name of the pod that processed the request) without knowing how many unique values it will take or how fast that cardinality will grow as the service scales.

What cardinality means in practical terms

Example: one metric, different tags

api.request.duration{env:prod}      1 series
+ service tag (20 services)         20 series
+ endpoint tag (50 endpoints)       1,000 series
+ status_code tag (10 values)       10,000 series
+ user_id tag (100k users)          1 billion series

Adding user_id as a tag on a request duration metric is never a sound decision for cost. The service and endpoint combination is probably right; user_id is not. Knowing the difference is the core of tag governance.
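The arithmetic behind the example can be reproduced in a few lines. This is a minimal sketch that multiplies the tag cardinalities quoted above; the function name `worst_case_series` is illustrative, and the result is an upper bound, since real series counts depend on which tag combinations actually occur.

```python
from math import prod

# Distinct values per tag, taken from the example above.
tag_cardinalities = {
    "env": 1,
    "service": 20,
    "endpoint": 50,
    "status_code": 10,
    "user_id": 100_000,
}


def worst_case_series(cardinalities: dict) -> int:
    """Upper bound on series count: the product of distinct values per tag."""
    return prod(cardinalities.values())


# Add one tag at a time and show how the upper bound grows.
running = {}
for tag, count in tag_cardinalities.items():
    running[tag] = count
    print(f"+ {tag:<12} {worst_case_series(running):>15,} series")
```

Dropping `user_id` from the dictionary brings the bound back to 10,000, which is the point of the example: one tag accounts for almost all of the growth.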

High-cardinality tags to watch for

These tag patterns appear frequently in Datadog metric instrumentation and are common causes of unexpectedly high custom metric counts.

User identifiers

user_id, customer_id, account_id

Any tag whose values are unique per user or customer creates as many series as you have users, and grows unboundedly as your product scales. These tags belong in tracing and log attributes, not in metrics. If you need per-user aggregations, a separate reporting pipeline is a more cost-effective approach.

Request identifiers

request_id, trace_id, session_id

Request-scoped identifiers are unique per request. Adding them as metric tags creates one time series per request processed, which at any meaningful traffic volume generates enormous series counts. These identifiers are the correct level of granularity for traces, not for aggregated metrics.

Kubernetes runtime tags

pod_name, container_id, replica_set

Kubernetes pod names and container IDs are ephemeral: they change with every deployment. If these are included as metric tags, every rollout creates a new set of series while the series from the previous pods stop reporting. Over time, this generates a long tail of inactive series that still consume custom metric allocation. Namespace and deployment_name are usually sufficient and are stable across rollouts.
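To see how quickly rollouts inflate series counts, the sketch below counts distinct tag sets over a number of rollouts. The pod naming scheme, the `checkout` deployment and the function name are hypothetical, chosen only to mimic pods receiving fresh names on each rollout.

```python
def distinct_series(rollouts: int, replicas: int, use_pod_name: bool) -> int:
    """Count distinct tag sets emitted for one metric across several rollouts."""
    seen = set()
    for rollout in range(rollouts):
        for replica in range(replicas):
            if use_pod_name:
                # Hypothetical naming: every rollout yields new pod names.
                tags = (
                    "deployment_name:checkout",
                    f"pod_name:checkout-r{rollout}-{replica}",
                )
            else:
                tags = ("deployment_name:checkout",)
            seen.add(tags)
    return len(seen)


print(distinct_series(rollouts=30, replicas=10, use_pod_name=True))
print(distinct_series(rollouts=30, replicas=10, use_pod_name=False))
```

With 30 rollouts of 10 replicas, the pod-level tag produces 300 distinct series for a single metric, while the deployment-level tag produces one.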

Dynamic URL paths

url, path, endpoint with IDs embedded

URL paths that include entity IDs (e.g. /api/users/12345/orders) generate one time series per unique entity when used as metric tags. These should be normalised to a pattern like /api/users/{id}/orders before being used as a tag value, or replaced entirely with a higher-level route name tag that takes a small, fixed number of values.
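One way to normalise paths before tagging is sketched below. It assumes that entity IDs appear as purely numeric segments or UUIDs; the function name `normalise_path` and both patterns are illustrative, and a real service may need additional rules (slugs, hashes, and so on).

```python
import re

# Assumed ID shapes: all-digit segments and canonical UUIDs.
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def normalise_path(path: str) -> str:
    """Replace ID-like path segments with a fixed {id} placeholder."""
    path = path.split("?", 1)[0]  # drop any query string
    segments = path.split("/")
    cleaned = [
        "{id}" if _NUMERIC.match(seg) or _UUID.match(seg) else seg
        for seg in segments
    ]
    return "/".join(cleaned)


print(normalise_path("/api/users/12345/orders"))
print(normalise_path("/api/users/67890/orders?page=2"))
```

Both calls map to the same tag value, /api/users/{id}/orders, so the number of series depends on the number of routes rather than the number of users.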

How to reduce waste without losing useful visibility

Cardinality reduction does not mean losing insight. It means being deliberate about which dimensions belong in metrics and which belong in traces, logs or separate analytical pipelines.

Audit current series counts

Use Datadog's Metrics Summary to identify the metric names with the highest time series counts and the tags contributing most to cardinality. This audit typically reveals a small number of metrics responsible for a large proportion of custom metric spend, a Pareto distribution that makes prioritisation straightforward.
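Once series counts per metric name are in hand, the prioritisation step is a simple cumulative-share calculation. The sketch below uses made-up numbers typed in by hand; the metric names, counts and the `top_contributors` helper are all hypothetical and do not call any Datadog API.

```python
# Hypothetical series counts per metric name, as they might be noted
# down during an audit.
series_by_metric = {
    "api.request.duration": 48_000,
    "checkout.order.count": 30_000,
    "queue.message.lag": 6_000,
    "cache.hit.ratio": 1_500,
    "db.pool.connections": 500,
}


def top_contributors(series: dict, threshold: float = 0.8) -> list:
    """Smallest set of metrics, largest first, covering `threshold` of all series."""
    total = sum(series.values())
    picked, covered = [], 0
    for name, count in sorted(series.items(), key=lambda kv: kv[1], reverse=True):
        picked.append(name)
        covered += count
        if covered / total >= threshold:
            break
    return picked


print(top_contributors(series_by_metric))
```

In this invented data set, two of the five metrics cover more than 80% of all series, which is the kind of concentration the audit is meant to surface.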

Replace high-cardinality tags with bucketed equivalents

A user_id tag can often be replaced with a user_tier tag (free, pro, enterprise) that takes three values instead of three million. A raw latency value in a tag can be replaced with a latency_bucket tag (fast, acceptable, slow). You keep the segmentation that supports operational decisions without the cardinality cost.
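A bucketing helper along these lines keeps the tag value set small and fixed. The latency thresholds (200 ms and 1,000 ms), the `unknown` fallback and the function names are assumptions for illustration; the right boundaries depend on the service's own latency objectives.

```python
KNOWN_TIERS = {"free", "pro", "enterprise"}


def latency_bucket(duration_ms: float) -> str:
    """Map a raw latency to one of three coarse buckets (assumed thresholds)."""
    if duration_ms < 200:
        return "fast"
    if duration_ms < 1000:
        return "acceptable"
    return "slow"


def metric_tags(user_plan: str, duration_ms: float) -> list:
    """Build low-cardinality tags in key:value form instead of tagging by user_id."""
    # Unexpected plan names collapse into one value so the tag stays bounded.
    tier = user_plan if user_plan in KNOWN_TIERS else "unknown"
    return [f"user_tier:{tier}", f"latency_bucket:{latency_bucket(duration_ms)}"]


print(metric_tags("pro", 150.0))
print(metric_tags("trial", 2500.0))
```

With the fallback value included, the two tags together can produce at most 4 x 3 = 12 combinations, regardless of how many users the product has.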

Use traces and logs for per-entity detail

The information a user_id tag on a metric tries to capture (the individual user's experience) belongs in a trace span attribute or a log attribute. Metrics aggregate; traces and logs provide per-request detail. Using the right tool for the right question eliminates the need for high-cardinality metric tags entirely.

Review third-party library defaults

Many instrumentation libraries, especially those for HTTP frameworks, databases and message queues, emit metrics with tags enabled by default. Some of those defaults include high-cardinality fields. Review the default tag sets for every library in your instrumentation stack and disable or replace high-cardinality defaults.

Use Flex Metrics for variable-cardinality data

Datadog's Flex Metrics capability (Datadog's own documentation describes this split between ingested and indexed series under the name Metrics without Limits™) lets you ingest metrics at full cardinality but store only the tag combinations that are actually queried. For metrics where you genuinely need high-cardinality instrumentation but query only a subset of combinations, Flex Metrics can reduce cost significantly without requiring instrumentation changes.
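The effect of storing only queried tag combinations can be illustrated by projecting each ingested series onto the tag keys that are queried and counting what remains. This is only a counting exercise on invented data; it does not model Datadog's API, configuration or billing, and `indexed_series` is a hypothetical helper.

```python
from itertools import product

# Invented ingest: 3 services x 4 endpoints x 50 pods = 600 series.
ingested = [
    {"service": f"svc-{s}", "endpoint": f"route-{e}", "pod_name": f"pod-{p}"}
    for s, e, p in product(range(3), range(4), range(50))
]


def indexed_series(series: list, queried_tags: set) -> int:
    """Count distinct tag-value combinations once only queried tag keys are kept."""
    projected = {
        tuple(sorted((k, v) for k, v in tags.items() if k in queried_tags))
        for tags in series
    }
    return len(projected)


print(len(ingested))
print(indexed_series(ingested, {"service", "endpoint"}))
```

If dashboards and monitors only ever group by service and endpoint, the 600 ingested series collapse to 12 stored combinations in this example.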

Establish a tag governance policy

Define which tags are allowed on custom metrics, what value ranges they should take and who approves new tags with unbounded cardinality. A tag governance policy prevents the problem from returning as services scale and new developers add instrumentation without understanding the cost implications.
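A governance policy of this kind can be encoded as an allowlist and checked in code review or CI. The sketch below is one possible shape, not a Datadog feature: the `ALLOWED_TAGS` table, its contents and the `violations` function are hypothetical.

```python
# Hypothetical policy: tag key -> approved values, or None when any value
# is accepted because the key is known to be bounded (e.g. service names).
ALLOWED_TAGS = {
    "env": {"prod", "staging", "dev"},
    "service": None,
    "user_tier": {"free", "pro", "enterprise"},
}


def violations(tags: list) -> list:
    """Return a description of every tag that breaks the policy."""
    problems = []
    for tag in tags:
        key, _, value = tag.partition(":")
        if key not in ALLOWED_TAGS:
            problems.append(f"{key}: tag key is not on the allowlist")
            continue
        allowed = ALLOWED_TAGS[key]
        if allowed is not None and value not in allowed:
            problems.append(f"{key}: value {value!r} is outside the approved set")
    return problems


print(violations(["env:prod", "service:checkout"]))
print(violations(["env:prod", "user_id:42"]))
```

Running a check like this wherever metrics are emitted or reviewed turns the policy into something a new tag has to pass, rather than a document developers need to remember.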

Reviewing custom metrics before renewal

Custom metrics are often the line item that changes most in the months before a Datadog renewal. New services are shipped, instrumentation libraries are updated and developers add tags that seem useful without knowing what they will cost at scale. By the time renewal arrives, the custom metric count may be significantly higher than the committed allocation.

A structured metrics review 60 to 90 days before renewal gives procurement time to address the growth: either by reducing cardinality to bring usage back within committed limits, or by entering the renewal conversation with accurate data about genuine usage rather than peak overage.

What a pre-renewal metrics review covers

  • Total custom metric series count versus committed allocation
  • Top metric names by series count and the tags driving cardinality
  • Growth trajectory: how fast is custom metric count growing month on month?
  • Quick wins: which single tag removals or replacements would reduce count most?
  • Recommended allocation for the next contract period based on governed growth
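The growth-trajectory item in the list above can be estimated with a compound growth rate. The monthly counts and the committed allocation below are invented for illustration, and the projection assumes growth continues at the same rate, which governance work is meant to change.

```python
def monthly_growth_rate(counts: list) -> float:
    """Compound month-on-month growth rate across the observed window."""
    periods = len(counts) - 1
    return (counts[-1] / counts[0]) ** (1 / periods) - 1


def months_until_exceeded(current: float, rate: float, allocation: float):
    """Months until projected usage passes the allocation, or None if it never does."""
    if current > allocation:
        return 0
    if rate <= 0:
        return None
    months, projected = 0, float(current)
    while projected <= allocation:
        projected *= 1 + rate
        months += 1
    return months


# Hypothetical month-end custom metric counts and committed allocation.
history = [40_000, 44_000, 48_400, 53_240]
allocation = 60_000

rate = monthly_growth_rate(history)
print(f"growth per month: {rate:.1%}")
print("months until allocation is exceeded:",
      months_until_exceeded(history[-1], rate, allocation))
```

A result of only a few months is a signal to start the review earlier than the 60 to 90 day window, since there would be little time left to reduce cardinality before overage begins.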

Full guide

Datadog pricing and cost optimisation

The complete guide: all cost drivers including metrics, logs and APM, the 5-step Observability FinOps framework and the 30-day cost review plan.

Frequently asked questions

What is metric cardinality in Datadog?

Cardinality is the number of unique time series created by a metric. Each unique combination of tag values generates a separate series. A metric with a tag that takes 10,000 distinct values creates 10,000 series, where the same metric without that tag would create one. Custom metric cost in Datadog is based on the number of distinct time series, not the number of metric names, so cardinality is the primary cost driver.

Which tags cause high cardinality in Datadog?

Any tag where the number of distinct values is large or unbounded causes high cardinality. Common culprits include user_id, session_id, request_id, order_id and similar entity identifiers that are unique per request or user. Pod names and container IDs in Kubernetes environments also cause high cardinality when included as metric tags, because they change with every deployment.

How do I reduce custom metric costs in Datadog?

Start by auditing your custom metric series count and identifying the tags with the highest cardinality. Remove or replace tags where the cardinality does not justify the cost. Review libraries and SDKs that emit metrics automatically, as these often include high-cardinality tags by default. Use Datadog Flex Metrics for metrics that must be ingested at high cardinality but are queried on only a subset of tag combinations.

What is Flex Metrics in Datadog?

Flex Metrics lets you ingest metrics at high cardinality but store only the tag combinations you actually query, rather than every distinct series. This means you can instrument liberally without paying for series that are never queried, which addresses one of the most common causes of custom metric cost growth.

Reduce your custom metric spend

We audit your metric series counts, identify the cardinality drivers and implement the governance controls that keep metrics cost predictable as you scale.
