Why Is Datadog Expensive, and How Do You Fix It?

01

Datadog is valuable. But costs grow when telemetry is unmanaged.

Datadog is usage-based. You pay for what you send, what you index, what you retain and how many distinct time series you generate. That model works well when teams are deliberate about what they instrument and why. It becomes expensive when instrumentation grows without governance: when logs ship in full, metrics accumulate high-cardinality tags and retention stays at default across all sources.

This is not a failure of Datadog's pricing model. It is a signal that the telemetry strategy has not kept pace with the platform's usage. The right response is not to replace Datadog. It is to govern it.

Most teams that feel Datadog is too expensive have not completed a structured cost review. In our experience, a 30-day review consistently surfaces enough waste to bring cost meaningfully under control, without removing the coverage that makes Datadog valuable.

02

The root cause in most cases

Telemetry grows faster than commitments, creating on-demand overage at unplanned rates
No one has reviewed what is being ingested, indexed or retained since the original setup
Default settings (retention, sampling, all-hosts monitoring) were never tuned for the actual production environment
Attribution is absent: no one knows which teams or services drive which proportion of the bill
New products were enabled without reviewing whether existing usage was already healthy

03

The most common reasons Datadog feels expensive

In every case, the cause is identifiable and addressable. None of them require replacing the platform.

01 Logs

Kubernetes and container logs shipped without filters are the most common cause of unexpected log cost.

Logs are Datadog's most volume-sensitive product. Kubernetes and container logs, verbose application output and debug logging left enabled in production are the most common causes of unexpected log cost. The default: ship everything, index everything, retain at default. The fix: index intentionally, tune retention per source, use Observability Pipelines to reduce before ingest.
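As a rough illustration of "index intentionally", the sketch below models the decision an exclusion rule makes: drop debug-level lines and health-check noise, keep everything else. It is plain Python, not Datadog's filter syntax or API. The rule names, the `/healthz` pattern and the sample records are all hypothetical.

```python
import re

# Hypothetical rules: which levels and message patterns are not worth indexing.
DROP_LEVELS = {"DEBUG", "TRACE"}
DROP_PATTERNS = [re.compile(r"GET /healthz")]


def should_index(record):
    """Return True if a log record should be indexed under the rules above."""
    if record.get("level", "").upper() in DROP_LEVELS:
        return False
    message = record.get("message", "")
    return not any(pattern.search(message) for pattern in DROP_PATTERNS)


def indexed_share(records):
    """Fraction of records that would still be indexed after filtering."""
    if not records:
        return 0.0
    kept = sum(1 for record in records if should_index(record))
    return kept / len(records)


# Hypothetical sample of four log lines.
sample = [
    {"level": "debug", "message": "cache lookup key=abc"},
    {"level": "info", "message": "GET /healthz 200"},
    {"level": "error", "message": "payment provider timeout"},
    {"level": "info", "message": "order created"},
]

print(indexed_share(sample))  # 0.5: the debug line and the health check are dropped
```

Running the same logic over a day of exported log samples gives a first estimate of how much indexed volume a proposed filter would remove before it is configured in the platform.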

02 Custom metrics with high cardinality

One poorly chosen high-cardinality tag can multiply custom metric spend across the entire platform.

A single metric name with a high-cardinality tag (user_id, request_id, pod name) generates one time series for every unique tag value, and each unique combination of tag values counts as its own series. Custom metric cost is proportional to series count, not metric name count, so one developer adding one poorly chosen tag can multiply custom metric spend dramatically. The fix: audit series counts and replace high-cardinality tags with bounded equivalents.
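The arithmetic is easy to see in a small sketch. The Python below counts distinct series as unique combinations of metric name and tag values, then recounts after an unbounded tag is removed. The metric name, tag names and sample data are hypothetical, and this models only the counting, not how Datadog bills.

```python
from collections import defaultdict


def series_count_by_metric(points):
    """Count distinct time series per metric name.

    A series is identified here by the metric name plus the full set of
    tag key/value pairs attached to a point.
    """
    series = defaultdict(set)
    for metric, tags in points:
        series[metric].add(frozenset(tags.items()))
    return {metric: len(combos) for metric, combos in series.items()}


def drop_tags(points, unbounded):
    """Return the same points with the named tags removed."""
    return [
        (metric, {k: v for k, v in tags.items() if k not in unbounded})
        for metric, tags in points
    ]


# Hypothetical sample: one metric, 3 endpoints, 1000 users per endpoint.
points = [
    ("checkout.latency", {"endpoint": endpoint, "user_id": f"u{n}"})
    for endpoint in ("/cart", "/pay", "/confirm")
    for n in range(1000)
]

before = series_count_by_metric(points)
after = series_count_by_metric(drop_tags(points, {"user_id"}))

print(before)  # {'checkout.latency': 3000}
print(after)   # {'checkout.latency': 3}
```

One metric name produces 3,000 series with the `user_id` tag and 3 without it, which is why the audit should start from series counts rather than from the list of metric names.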

03 Long retention defaults

Applying the same retention to every log index regardless of operational need means paying for retention you will never use.

Log retention left at its default is often longer than an operational investigation requires. Applying the same retention to all log indexes, regardless of how frequently they are queried or what compliance obligation applies, means you are paying for retention you will never use. The fix: review retention per index and align it to actual operational and compliance needs.

04 Kubernetes and container scale

Unreviewed agent configuration monitors every container at full fidelity, including workloads that do not need it.

Kubernetes environments generate telemetry at scale: container logs, pod-level metrics, ephemeral hosts that each count against infrastructure allocation. When Kubernetes is deployed without reviewing agent configuration, exclusions and coverage scope, the platform monitors everything at full fidelity. The fix: define what needs monitoring and configure the agent to match, not to discover and monitor everything it finds.

05 APM and tracing volume

Applying the same sampling rate to every service and endpoint regardless of traffic volume drives unexpected APM spend.

APM costs are driven by the volume of traces ingested and retained. Applying the same sampling rate to every service, environment and endpoint regardless of traffic volume or operational value is the most common cause of unexpectedly high APM spend. Health check endpoints, internal service-to-service calls and low-risk background jobs rarely need full trace retention. The fix: environment-appropriate sampling, error-biased and latency-biased retention rules.
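To make "error-biased and latency-biased" concrete, here is a minimal Python sketch of the decision such rules encode: always keep errors and slow traces, and sample the rest at a per-service rate. This is not Datadog's sampler or its configuration format. The service names, rates and the 1000 ms threshold are assumptions chosen for illustration.

```python
import hashlib

# Hypothetical per-service base rates; names and values are illustrative only.
BASE_RATES = {"checkout": 0.5, "healthcheck": 0.0}
DEFAULT_RATE = 0.1
SLOW_MS = 1000  # assumed latency threshold for "slow" traces


def keep_trace(trace_id, service, is_error, duration_ms):
    """Decide whether to keep a trace.

    Errors and slow traces are always kept. Everything else is sampled at
    the service's base rate, using a hash of the trace id so the decision
    is deterministic for a given trace.
    """
    if is_error or duration_ms >= SLOW_MS:
        return True
    rate = BASE_RATES.get(service, DEFAULT_RATE)
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return bucket < rate


# Health checks are dropped unless they fail or are slow.
print(keep_trace("t-1", "healthcheck", False, 5))     # False
print(keep_trace("t-1", "healthcheck", True, 5))      # True
print(keep_trace("t-2", "batch-job", False, 2500))    # True

# Roughly half of ordinary checkout traces are kept at a 0.5 base rate.
kept = sum(keep_trace(f"trace-{i}", "checkout", False, 10) for i in range(10000))
print(kept / 10000)
```

The design choice worth noting is the ordering: the error and latency checks run before any rate is applied, so lowering a base rate never discards the traces most likely to matter in an investigation.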

06 Unused or over-deployed products

Products and synthetics enabled during evaluation and never scoped back continue accumulating cost silently.

Products enabled across all environments during an evaluation and never scoped back. Synthetics tests running at frequencies set during initial setup. Session replay enabled for user journeys that are rarely visited. Each adds cost that may have been intentional at the time but has not been reviewed since. The fix: a product-by-product usage review that asks whether each enabled capability justifies its cost at its current scope.

07 Weak tagging and attribution

Without consistent tagging, it is impossible to identify which team or service is responsible for which proportion of the bill.

Without consistent tagging across hosts, services and logs, it is impossible to answer the question that would most help address the cost: which team or service is responsible for which proportion of the spend? Weak tagging makes cost reviews harder, makes governance policies harder to enforce and makes renewal negotiations harder to prepare for. The fix: a tagging standard applied consistently from infrastructure to application to log pipeline.

08 No governance or renewal process

Without a governance cadence, costs grow steadily and only surface at renewal, when it is too late to affect the trajectory.

Datadog cost tends to grow steadily and then spike as renewal approaches, when teams suddenly realise how far usage has drifted from the committed level. Without a governance cadence (regular usage reviews, attribution reporting and renewal readiness assessments) the only feedback loop is the renewal conversation itself, which is too late to affect the cost trajectory for the current period.

04

What to do before replacing Datadog

Migration is the most expensive and disruptive response to a cost problem. These are the steps to take first.

Run a usage audit. Export Datadog usage reports across every product and build a baseline. Understand what you are actually using, at what volume and in which environment. Most teams are surprised by where the majority of their spend originates.
Get attribution in place. You cannot govern what you cannot attribute. Establish which teams and services drive which proportion of the bill. Usage attribution is a prerequisite for any meaningful cost conversation: with engineering teams, with finance and with Datadog at renewal.
Identify the top cost drivers. In most environments, a small number of sources (one Kubernetes cluster, one high-cardinality metric, one verbose service) account for a disproportionate share of the total cost. Finding them is the first step to addressing them.
Apply platform-native controls. Retention tuning, exclusion filters, sampling rule changes, tag governance and Observability Pipelines are all available without external tooling. Many cost problems can be substantially addressed using Datadog's own controls, applied deliberately.
Build a governance cadence. A weekly usage check, a monthly cost review and a quarterly renewal readiness assessment prevent the cost from drifting back after a review. Governance is the control that makes cost reduction permanent, not a one-time fix.
Talk to a Datadog partner. A Datadog partner with experience in cost governance can run the above steps faster, with more institutional knowledge of where waste typically hides and how to address it without losing coverage. Engaging a partner 90 days before renewal is usually enough time to make a meaningful difference to the renewal conversation.
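The attribution and top-cost-driver steps above can be sketched in a few lines of Python once usage has been exported. The example assumes each exported row carries a cost figure and a tag dictionary. The row shape, the `team` tag, the 80% threshold and every number are hypothetical, not the format of any Datadog export.

```python
from collections import Counter


def spend_by_owner(rows, tag="team"):
    """Sum cost per value of the given tag; rows missing the tag are 'untagged'."""
    totals = Counter()
    for row in rows:
        totals[row["tags"].get(tag, "untagged")] += row["cost"]
    return totals


def top_drivers(totals, threshold=0.8):
    """Return the largest owners, with their share of spend, until the
    running total reaches the threshold share of all spend."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    picked, running = [], 0
    for owner, cost in totals.most_common():
        if running >= threshold * grand_total:
            break
        picked.append((owner, cost / grand_total))
        running += cost
    return picked


# Hypothetical exported usage rows.
rows = [
    {"cost": 400, "tags": {"team": "payments", "product": "logs"}},
    {"cost": 200, "tags": {"team": "payments", "product": "apm"}},
    {"cost": 250, "tags": {"team": "search", "product": "custom_metrics"}},
    {"cost": 100, "tags": {"product": "logs"}},  # no team tag
    {"cost": 50, "tags": {"team": "platform", "product": "infra"}},
]

totals = spend_by_owner(rows)
drivers = top_drivers(totals)
print(drivers)  # [('payments', 0.6), ('search', 0.25)]
```

The `untagged` bucket is deliberate: its size is a direct measure of how much of the bill cannot yet be attributed, which makes it a useful number to track from one review to the next.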

05

Related Datadog cost guides

Full guide

Datadog Pricing & Cost Optimisation

The complete guide: cost driver table, 5-step FinOps framework, optimisation levers and the 30-day review plan.

Log costs

Datadog Logs Pricing: Control Log Costs

Ingestion, indexing and retention explained with practical controls for the most common log cost drivers.

Metrics and cardinality

Datadog Metrics Cost Optimisation

Why custom metrics get expensive, what cardinality means and how to reduce waste without losing visibility.

06

Frequently asked questions

Why is Datadog so expensive?

Datadog's pricing is usage-based, which means cost grows with the volume of telemetry you generate. The platform delivers genuine value (it is one of the most capable observability platforms available) but when telemetry is unmanaged, costs grow faster than the business expects. The most common causes are high log volume, high-cardinality custom metrics, long retention defaults and product sprawl across environments that do not need full coverage.

Is Datadog too expensive compared to alternatives?

Datadog's value is in the breadth and depth of its platform: one tool for infrastructure, APM, logs, security, synthetics, cloud cost and more. Alternatives are often cheaper per dimension but require integrating multiple tools, each with its own cost and operational overhead. Before concluding that Datadog is too expensive for your organisation, it is worth establishing whether the cost is the platform or the way the platform is being used.

Should I replace Datadog to reduce costs?

Rarely. Replacing an observability platform is expensive, risky and time-consuming. Migrations disrupt engineering teams, lose historical data and often reveal that the new platform has its own cost growth patterns as usage scales. In most cases, a structured cost review identifies enough waste to bring spend under control without replacement, leaving the team with a more governable, attributable platform than they started with.

How quickly can Datadog costs be reduced?

Some controls take effect immediately: exclusion filters, retention changes and pipeline rules apply as soon as they are configured. Others, like cardinality reduction from tag changes, take effect over the subsequent billing period as old series age out. A 30-day structured review typically surfaces and implements enough changes to show a clear cost reduction within the first billing cycle after it completes.
