
Why Is Datadog Expensive, and How Do You Fix It?

Datadog is a capable, comprehensive observability platform. When it feels expensive, the cause is almost always unmanaged telemetry growth, not the platform being wrong for you. This is how you diagnose and fix it.

Datadog is valuable. But costs grow when telemetry is unmanaged.

Datadog is usage-based. You pay for what you send, what you index, what you retain and how many distinct time series you generate. That model works well when teams are deliberate about what they instrument and why. It becomes expensive when instrumentation grows without governance: when logs ship in full, metrics accumulate high-cardinality tags and retention stays at default across all sources.

This is not a failure of Datadog's pricing model. It is a signal that the telemetry strategy has not kept pace with the platform's usage. The right response is not to replace Datadog. It is to govern it.

Most teams that feel Datadog is too expensive have not completed a structured cost review. In our experience, a 30-day review consistently surfaces enough waste to bring cost meaningfully under control, without removing the coverage that makes Datadog valuable.

The root cause in most cases

  • Telemetry grows faster than commitments, creating on-demand overage at unplanned rates
  • No one has reviewed what is being ingested, indexed or retained since the original setup
  • Default settings (retention, sampling, all-hosts monitoring) were never tuned for the actual production environment
  • Attribution is absent: no one knows which teams or services drive which proportion of the bill
  • New products were enabled without reviewing whether existing usage was already healthy

The most common reasons Datadog feels expensive

In every case, the cause is identifiable and addressable. None of them require replacing the platform.

Logs

Logs are Datadog's most volume-sensitive product. Kubernetes and container logs, verbose application output and debug logging left enabled in production are the most common causes of unexpected log cost. The default: ship everything, index everything, retain at default. The fix: index intentionally, tune retention per source and use Observability Pipelines to reduce volume before ingest.
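To illustrate what "index intentionally" means in practice, here is a small Python model of an index exclusion decision. It is a sketch of the idea only, not Datadog's implementation: the `ExclusionRule` type, the rule values and the log fields (`id`, `source`, `status`) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExclusionRule:
    source: str              # log source the rule applies to
    status: str              # log status the rule applies to, e.g. "debug"
    exclude_fraction: float  # 1.0 excludes every match, 0.9 keeps roughly 10%


# Hypothetical rules: drop all Kubernetes debug logs from the index,
# keep roughly one in ten nginx info logs.
RULES = [
    ExclusionRule(source="kubernetes", status="debug", exclude_fraction=1.0),
    ExclusionRule(source="nginx", status="info", exclude_fraction=0.9),
]


def should_index(log: dict, rules=RULES) -> bool:
    """Return True if the log should be indexed under the first matching rule."""
    for rule in rules:
        if log.get("source") == rule.source and log.get("status") == rule.status:
            # Map the log id to a stable bucket in [0, 1) so that the same
            # log always gets the same decision.
            digest = hashlib.sha256(log["id"].encode()).digest()
            bucket = int.from_bytes(digest[:8], "big") / 2**64
            return bucket >= rule.exclude_fraction
    return True  # no rule matched: index as before


print(should_index({"id": "a", "source": "kubernetes", "status": "debug"}))  # False
print(should_index({"id": "b", "source": "kubernetes", "status": "error"}))  # True
```

Errors and anything not covered by a rule stay indexed, so the reduction targets volume rather than coverage.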

Custom metrics with high cardinality

A single metric name with a high-cardinality tag (user_id, request_id, pod name) generates one time series for every unique combination of tag values that reports, so the series count scales with the number of unique values of that tag. Custom metric cost is proportional to series count, not metric name count. One developer adding one poorly chosen tag can multiply custom metric spend dramatically. The fix: audit series counts and replace high-cardinality tags with bounded equivalents.
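The multiplication is easy to see with a few numbers. The sketch below uses hypothetical tag names and counts; `max_series` gives an upper bound for one metric name, and `observed_series` counts the combinations that actually appear in a sample of points.

```python
from math import prod


def max_series(tag_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric name:
    the product of the number of unique values per tag key."""
    return prod(tag_cardinalities.values())


def observed_series(points: list[dict[str, str]]) -> int:
    """Number of distinct tag-value combinations seen in a sample of points."""
    return len({tuple(sorted(p.items())) for p in points})


bounded = {"env": 3, "service": 20, "region": 4}
print(max_series(bounded))  # 240

# Adding one unbounded tag multiplies the upper bound by its cardinality.
with_user_id = {**bounded, "user_id": 50_000}
print(max_series(with_user_id))  # 12000000

sample = [
    {"env": "prod", "service": "api"},
    {"env": "prod", "service": "api"},   # same series as the first point
    {"env": "prod", "service": "web"},
]
print(observed_series(sample))  # 2
```

Real series counts usually sit below the upper bound because not every combination reports, but the bound shows how quickly one tag can dominate.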

Long retention defaults

Datadog's default log retention is often longer than any operational investigation ever requires. Applying the same retention to all log indexes, regardless of how frequently they are queried or what compliance obligation applies, means you are paying for retention you will never use. The fix: review retention per index and align it to actual operational and compliance needs.
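A per-index retention review can be reduced to a simple rule: keep logs for as long as they are actually queried or legally required, whichever is longer. The sketch below assumes a hypothetical list of available retention tiers; check your own contract for the periods that apply to you.

```python
# Hypothetical retention tiers in days; the real options depend on your plan.
ALLOWED_DAYS = [3, 7, 15, 30, 45, 60, 90]


def recommend_retention(oldest_query_age_days: int,
                        compliance_min_days: int = 0,
                        allowed=ALLOWED_DAYS) -> int:
    """Smallest allowed retention that covers both how far back the index
    is actually queried and any compliance minimum."""
    needed = max(oldest_query_age_days, compliance_min_days)
    for days in sorted(allowed):
        if days >= needed:
            return days
    return max(allowed)  # nothing is long enough: fall back to the longest tier


# An index that is never queried beyond 5 days back.
print(recommend_retention(oldest_query_age_days=5))  # 7

# Rarely queried, but a 30-day compliance obligation applies.
print(recommend_retention(oldest_query_age_days=2, compliance_min_days=30))  # 30
```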

Kubernetes and container scale

Kubernetes environments generate telemetry at scale: container logs, pod-level metrics, ephemeral hosts that each count against infrastructure allocation. When Kubernetes is deployed without reviewing agent configuration, exclusions and coverage scope, the platform monitors everything at full fidelity. The fix: define what needs monitoring and configure the agent to match, not to discover and monitor everything it finds.
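Scoping coverage amounts to evaluating include and exclude rules per container. The Python model below uses rule strings in a `key:regex` style similar to the Agent's container include and exclude settings, but it is only an illustration: the function names are hypothetical, and the precedence shown (include wins over exclude) is an assumption to verify against the Agent documentation.

```python
import re


def parse_rules(spec: str) -> list[tuple[str, re.Pattern]]:
    """Parse a space-separated list of 'key:regex' rules."""
    rules = []
    for item in spec.split():
        key, _, pattern = item.partition(":")
        rules.append((key, re.compile(pattern)))
    return rules


def is_monitored(container: dict, include: str = "", exclude: str = "") -> bool:
    """Decide whether a container falls inside the monitoring scope.

    Assumption in this sketch: an include match always wins,
    otherwise an exclude match removes the container from scope.
    """
    def matches(rules):
        return any(p.search(container.get(key, "")) for key, p in rules)

    if matches(parse_rules(include)):
        return True
    return not matches(parse_rules(exclude))


# Exclude everything by default, then include only the production namespace.
exclude = "image:.*"
include = "kube_namespace:^prod$"

print(is_monitored({"kube_namespace": "prod", "image": "shop/api:1.4"},
                   include, exclude))  # True
print(is_monitored({"kube_namespace": "dev-sandbox", "image": "shop/api:1.4"},
                   include, exclude))  # False
```

Starting from "exclude all, include what matters" inverts the default described above, so new namespaces do not add cost until someone decides they should.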

APM and tracing volume

APM costs are driven by the volume of traces ingested and retained. Applying the same sampling rate to every service, environment and endpoint regardless of traffic volume or operational value is the most common cause of unexpectedly high APM spend. Health check endpoints, internal service-to-service calls and low-risk background jobs rarely need full trace retention. The fix: environment-appropriate sampling, combined with error-biased and latency-biased retention rules.
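The combination of environment-appropriate sampling with error-biased and latency-biased retention can be expressed as a short decision function. This is a sketch of the policy, not of any Datadog API: the trace fields, the resource names, the rates and the latency threshold are all hypothetical.

```python
import random

# Hypothetical endpoints that rarely justify trace retention.
DROPPED_RESOURCES = {"GET /health", "GET /ready"}

# Hypothetical base sampling rates per environment.
BASE_RATES = {"prod": 0.10, "staging": 0.01}


def keep_trace(trace: dict,
               base_rates: dict = BASE_RATES,
               latency_threshold_ms: float = 1000.0,
               default_rate: float = 0.10,
               rng=random.random) -> bool:
    """Decide whether to retain a trace.

    Order of checks: errors are always kept, slow traces are always kept,
    health checks are dropped, everything else is sampled at the base
    rate of its environment.
    """
    if trace.get("error"):
        return True
    if trace.get("duration_ms", 0) >= latency_threshold_ms:
        return True
    if trace.get("resource") in DROPPED_RESOURCES:
        return False
    rate = base_rates.get(trace.get("env"), default_rate)
    return rng() < rate


print(keep_trace({"env": "prod", "resource": "GET /checkout",
                  "error": True, "duration_ms": 40}))   # True
print(keep_trace({"env": "prod", "resource": "GET /health",
                  "duration_ms": 2}))                   # False
```

The `rng` parameter exists so the sampled branch can be tested deterministically; in normal use the default random source applies.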

Unused or over-deployed products

Products enabled across all environments during an evaluation and never scoped back. Synthetics tests running at frequencies set during initial setup. Session replay enabled for user journeys that are rarely visited. Each adds cost that may have been intentional at the time but has not been reviewed since. The fix: a product-by-product usage review that asks whether each enabled capability justifies its cost at its current scope.

Weak tagging and attribution

Without consistent tagging across hosts, services and logs, it is impossible to answer the question that would most help address the cost: which team or service is responsible for which proportion of the spend? Weak tagging makes cost reviews harder, makes governance policies harder to enforce and makes renewal negotiations harder to prepare for. The fix: a tagging standard applied consistently from infrastructure to application to log pipeline.
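Once a tagging standard exists, attribution is a straightforward aggregation. The sketch below assumes a hypothetical record shape (a `usage` number plus a `tags` dictionary) and a `team` tag; anything missing the tag is grouped under "untagged", which is itself a useful measure of how well the standard is applied.

```python
from collections import defaultdict


def attribute_usage(records: list[dict], tag: str = "team") -> dict[str, float]:
    """Share of total usage per tag value, as a percentage rounded to 0.1."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        owner = record.get("tags", {}).get(tag, "untagged")
        totals[owner] += record["usage"]
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {owner: round(100 * value / grand_total, 1)
            for owner, value in totals.items()}


records = [
    {"usage": 400, "tags": {"team": "payments"}},
    {"usage": 100, "tags": {"team": "search"}},
    {"usage": 500, "tags": {}},  # no team tag: cannot be attributed
]
print(attribute_usage(records))
# {'payments': 40.0, 'search': 10.0, 'untagged': 50.0}
```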

No governance or renewal process

Datadog cost tends to grow steadily and then spike as renewal approaches, when teams suddenly realise how far usage has drifted from the committed level. Without a governance cadence (regular usage reviews, attribution reporting and renewal readiness assessments) the only feedback loop is the renewal conversation itself, which is too late to affect the cost trajectory for the current period.
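A regular usage review can include a simple drift check that estimates how soon usage will reach the committed level. The sketch below assumes linear growth between the first and last data point, which is a deliberate simplification; the function name and the figures are hypothetical.

```python
import math
from typing import Optional


def months_until_commit_reached(monthly_usage: list[float],
                                committed: float) -> Optional[int]:
    """Months from the latest data point until usage reaches the commitment.

    Returns 0 if usage is already at or above the commitment, and None if
    there is too little history or usage is flat or falling.
    """
    latest = monthly_usage[-1]
    if latest >= committed:
        return 0
    if len(monthly_usage) < 2:
        return None
    # Average month-over-month growth across the observed window.
    growth = (monthly_usage[-1] - monthly_usage[0]) / (len(monthly_usage) - 1)
    if growth <= 0:
        return None
    return math.ceil((committed - latest) / growth)


# Usage growing by 6 units a month against a commitment of 100.
print(months_until_commit_reached([70, 76, 82, 88], committed=100))  # 2
```

Running a check like this monthly gives a feedback loop well before the renewal conversation.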

What to do before replacing Datadog

Migration is the most expensive and disruptive response to a cost problem. These are the steps to take first.


Frequently asked questions

Why is Datadog so expensive?

Datadog's pricing is usage-based, which means cost grows with the volume of telemetry you generate. The platform delivers genuine value (it is one of the most capable observability platforms available) but when telemetry is unmanaged, costs grow faster than the business expects. The most common causes are high log volume, high-cardinality custom metrics, long retention defaults and product sprawl across environments that do not need full coverage.

Is Datadog too expensive compared to alternatives?

Datadog's value is in the breadth and depth of its platform: one tool for infrastructure, APM, logs, security, synthetics, cloud cost and more. Alternatives are often cheaper per dimension but require integrating multiple tools, each with its own cost and operational overhead. Before concluding that Datadog is too expensive for your organisation, it is worth establishing whether the cost is the platform or the way the platform is being used.

Should I replace Datadog to reduce costs?

Rarely. Replacing an observability platform is expensive, risky and time-consuming. Migrations disrupt engineering teams, lose historical data and often reveal that the new platform has its own cost growth patterns as usage scales. In most cases, a structured cost review identifies enough waste to bring spend under control without replacement, leaving the team with a more governable, attributable platform than they started with.

How quickly can Datadog costs be reduced?

Some controls take effect immediately: exclusion filters, retention changes and pipeline rules apply as soon as they are configured. Others, like cardinality reduction from tag changes, take effect over the subsequent billing period as old series age out. A 30-day structured review typically surfaces and implements enough changes to show a clear cost reduction within the first billing cycle after it completes.

Fix the cost. Keep the platform.

Tell us where the pressure is. We will show you what a governed, predictable Datadog spend looks like, and we will get you there in 30 days.
