Databricks Waste Detection & Optimization Insights

Waste detection finds spend that likely produced little value or can be reduced safely. Optimization insights identify configuration changes that may reduce cost, improve reliability, or reduce operational risk.

Waste types

Type	What it means
Idle cluster	Cluster ran with little or no useful activity.
Idle warehouse	SQL warehouse clusters were running without queries.
Long-running cluster	Cluster ran for an unusually large portion of the observation window.
Failing queries	Queries that failed still consumed DBUs.
Long-running queries	Query runtime is high enough to deserve review.
Runaway job	A job run lasted far beyond its baseline.
Retry storm	Repeated failures or retries consumed avoidable compute.
Overprovisioned cluster	Utilization is low for the configured workers.
Long auto-termination	Idle timeout is much longer than recommended.
Unused table	Table incurs storage cost but shows no recent access signal.
Fat driver	Driver sizing appears excessive for observed usage.
Zombie model	Serving endpoint appears unused but still costs money.
Weekend waste	Activity that does not look like production workload ran on weekends.
Poor pruning / Scanzilla	Queries scan much more data than expected.
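
As an illustration of how a baseline-relative detector such as "Runaway job" could work, here is a minimal sketch. It assumes the baseline is the median of prior run durations and that "far beyond" means more than a fixed multiple of it. The function name, the factor of 3, and the minimum history of 5 runs are hypothetical choices, not documented product behavior.

```python
from statistics import median


def is_runaway(run_minutes: float, baseline_minutes: list[float],
               factor: float = 3.0, min_history: int = 5) -> bool:
    """Flag a run that lasted far beyond its baseline.

    Assumptions (illustrative only): the baseline is the median of prior
    run durations, and "far beyond" means more than `factor` times it.
    """
    if len(baseline_minutes) < min_history:
        return False  # too little history to judge
    return run_minutes > factor * median(baseline_minutes)


history = [12.0, 14.0, 11.0, 13.0, 15.0]  # prior run durations in minutes
print(is_runaway(95.0, history))  # True: 95 > 3 * 13
print(is_runaway(20.0, history))  # False: 20 <= 3 * 13
```

A median is used here rather than a mean so that one earlier runaway run does not inflate the baseline.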

Optimization types

Cluster and workload hygiene detectors include spot instance candidates, single-node candidates, excess workers, outdated runtimes, missing job timeouts, Photon candidates, interactive cluster misuse, pool candidates, instance-type mismatch, first-on-demand guardrails, pipeline dev mode, pipeline preview channel, and multi-cluster jobs.
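
A hygiene check such as "missing job timeouts" can be as simple as inspecting job settings. The sketch below assumes settings shaped like a Databricks Jobs API payload, where a `timeout_seconds` field that is absent or 0 means no timeout. The job names are made up, and the real detector may use different inputs.

```python
def missing_timeout(job_settings: dict) -> bool:
    """Return True when a job has no run timeout configured.

    Assumes `timeout_seconds` absent or 0 means "no timeout".
    """
    return not job_settings.get("timeout_seconds")


# Hypothetical job settings for illustration.
jobs = [
    {"name": "nightly_etl", "timeout_seconds": 7200},
    {"name": "adhoc_backfill"},
    {"name": "report_refresh", "timeout_seconds": 0},
]

flagged = [job["name"] for job in jobs if missing_timeout(job)]
print(flagged)  # ['adhoc_backfill', 'report_refresh']
```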

Storage optimization detectors include poor read/write ratio, duplicate datasets, excessive retention, non-partitioned tables, and unoptimized storage.

How savings are estimated

Savings estimates are conservative approximations derived from recent observed cost and the recommended change. Examples:

Idle-runtime reduction estimates savings from idle minutes or hours and the recent cost rate.
Worker reduction estimates savings from the difference between current and recommended workers.
Single-node conversion estimates savings from worker cost that may be removed.
Storage opportunities use observed storage cost or predictive-optimization cost, where available.

Treat estimates as prioritization guidance, not an invoice guarantee.
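
The first three estimate styles listed above can be sketched as simple arithmetic. The formulas, function names, and rates below are illustrative assumptions (a flat hourly rate and a flat per-worker rate). They are not the product's exact calculation.

```python
def idle_savings(idle_hours: float, cost_per_hour: float) -> float:
    """Idle runtime: idle time multiplied by the recent cost rate (assumed)."""
    return idle_hours * cost_per_hour


def worker_reduction_savings(current_workers: int, recommended_workers: int,
                             cost_per_worker_hour: float, hours: float) -> float:
    """Worker reduction: cost of the workers that would be removed (assumed)."""
    removed = max(current_workers - recommended_workers, 0)
    return removed * cost_per_worker_hour * hours


def single_node_savings(current_workers: int, cost_per_worker_hour: float,
                        hours: float) -> float:
    """Single-node conversion: all worker cost is removed, driver cost stays."""
    return worker_reduction_savings(current_workers, 0, cost_per_worker_hour, hours)


# Hypothetical inputs for illustration.
print(idle_savings(40, 2.5))                     # 100.0
print(worker_reduction_savings(8, 5, 0.75, 100)) # 225.0
print(single_node_savings(4, 0.75, 100))         # 300.0
```

Because each figure scales a recent cost rate, it shifts whenever usage patterns or prices change, which is why it works better for ranking findings than for forecasting an invoice.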

Activity filtering

Detectors require enough recent observations to avoid false positives. For example, cluster hygiene uses node timeline utilization and runtime windows; query hygiene uses query history; warehouse hygiene checks warehouse events and query activity.
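
A minimum-observation guard of this kind might look like the sketch below. A cluster is only evaluated for low utilization once it has enough samples. The sample type, the threshold of 12 samples, and the 10% CPU cutoff are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class UtilizationSample:
    cluster_id: str
    cpu_percent: float


def low_utilization_clusters(samples: list[UtilizationSample],
                             min_samples: int = 12,
                             cpu_threshold: float = 10.0) -> list[str]:
    """Return clusters whose average CPU is below the threshold,
    skipping any cluster with fewer than `min_samples` observations."""
    by_cluster: dict[str, list[float]] = {}
    for sample in samples:
        by_cluster.setdefault(sample.cluster_id, []).append(sample.cpu_percent)

    flagged = []
    for cluster_id, values in by_cluster.items():
        if len(values) < min_samples:
            continue  # not enough evidence; avoid a false positive
        if sum(values) / len(values) < cpu_threshold:
            flagged.append(cluster_id)
    return flagged


samples = (
    [UtilizationSample("a", 5.0)] * 12     # enough data, low usage
    + [UtilizationSample("b", 2.0)] * 3    # low usage, but too few samples
    + [UtilizationSample("c", 60.0)] * 12  # enough data, busy
)
flagged = low_utilization_clusters(samples)
print(flagged)  # ['a']
```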

From insight to action

Some waste insights generate action plans. Review the evidence, safety tier, and proposed change before approving execution. If a finding is expected or intentionally accepted, dismiss it so future reviews focus on new problems.
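
One way to picture the effect of dismissal is as a filter applied before each review. The sketch below assumes a finding is identified by its type and resource ID. That key, the field names, and the sample findings are hypothetical and may not match how the product tracks dismissals.

```python
def open_findings(findings: list[dict], dismissed: set[tuple[str, str]]) -> list[dict]:
    """Drop findings whose (type, resource_id) key was previously dismissed."""
    return [
        f for f in findings
        if (f["type"], f["resource_id"]) not in dismissed
    ]


# Hypothetical findings for illustration.
findings = [
    {"type": "idle_cluster", "resource_id": "cluster-1"},
    {"type": "zombie_model", "resource_id": "endpoint-7"},
    {"type": "idle_cluster", "resource_id": "cluster-2"},
]
dismissed = {("idle_cluster", "cluster-1")}  # accepted as intentional

remaining = open_findings(findings, dismissed)
print([f["resource_id"] for f in remaining])  # ['endpoint-7', 'cluster-2']
```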