
Databricks Cost Allocation and Chargeback

Cost Allocation is where admins manage how LakeSentry assigns spend to accountable owners. It combines two dimensions:

  • Accountability: org unit → department → team, used for chargeback and budgets.
  • Context: optional projects and shared buckets, used to explain why the spend happened.

LakeSentry is conservative: if a cost line cannot be attributed with a defensible path, it remains workspace/unattributed instead of being guessed.

The Overview tab shows attribution quality, unattributed spend, and the largest allocation gaps. Use it first after onboarding or after adding new workspaces to find missing mappings, tags, or rules.

Rules override default ownership when you know how a resource should be charged. Lower priority numbers run first; at the same priority, workspace-specific rules are favored over global rules.
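The ordering described above can be sketched as a two-part sort key. This is only an illustration: the `Rule` class, its field names, and the convention that a missing `workspace_id` means a global rule are assumptions made for the example, not LakeSentry's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule shape, for illustration only.
    name: str
    priority: int
    workspace_id: Optional[str] = None  # assumption: None marks a global rule


def evaluation_order(rules):
    # Lower priority numbers sort first. At equal priority, workspace-specific
    # rules (workspace_id set) sort ahead of global ones, because False < True.
    return sorted(rules, key=lambda r: (r.priority, r.workspace_id is None))


rules = [
    Rule("global-etl", priority=10),
    Rule("ws-etl", priority=10, workspace_id="ws-123"),
    Rule("exact-warehouse", priority=5),
]
ordered = [r.name for r in evaluation_order(rules)]
print(ordered)  # ['exact-warehouse', 'ws-etl', 'global-etl']
```

Here the priority-5 rule runs first, and the two priority-10 rules are ordered so that the workspace-specific one is considered before the global one.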

Rule types:

Type | Use for | Match criteria
Exact | A known cluster, warehouse, job, or pipeline | Resource type + exact resource ID
Pattern | Naming conventions, tag conventions, or principal domains | Resource type, regex pattern, principal domain, tags
Proportional | Platform overhead | SKU pattern and/or usage type, split by compute spend
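As a worked example of the proportional rule type, the sketch below spreads an overhead amount across teams in proportion to their compute spend. The function name, the team names, and the choice to return an empty result when there is no compute spend to split against are assumptions for illustration; the product may handle that edge case differently.

```python
def split_proportionally(overhead, compute_spend_by_team):
    """Distribute an overhead amount in proportion to each team's compute spend."""
    total = sum(compute_spend_by_team.values())
    if total <= 0:
        # Assumption: with no compute spend to split against, nothing is allocated
        # and the caller leaves the overhead unattributed.
        return {}
    return {
        team: overhead * spend / total
        for team, spend in compute_spend_by_team.items()
    }


# 100 of platform overhead split across teams spending 600, 300, and 100 on compute.
shares = split_proportionally(100.0, {"data-eng": 600.0, "ml": 300.0, "bi": 100.0})
print(shares)  # {'data-eng': 60.0, 'ml': 30.0, 'bi': 10.0}
```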

Attribution modes:

  • Direct: assign 100% of the matched spend to one team and optional project.
  • Split: distribute spend across multiple teams by percentages that must total 100%.
  • Shared: mark a resource as shared infrastructure and place it in a shared bucket.
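The Split mode's constraint that percentages total 100% can be illustrated with a small validation step. This is a minimal sketch, assuming percentages are entered as whole-number or decimal percent values; `apply_split` and the team names are hypothetical.

```python
from decimal import Decimal


def apply_split(amount: Decimal, percentages: dict[str, Decimal]) -> dict[str, Decimal]:
    """Distribute `amount` across teams by percentage, rejecting invalid splits."""
    if sum(percentages.values()) != Decimal("100"):
        raise ValueError("split percentages must total 100")
    return {team: amount * pct / Decimal("100") for team, pct in percentages.items()}


allocation = apply_split(
    Decimal("250.00"),
    {"platform": Decimal("60"), "analytics": Decimal("40")},
)
print(allocation["platform"], allocation["analytics"])
```

`Decimal` is used here so that the 100% check is exact rather than subject to floating-point rounding.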

Lists resources and their current attribution path. Use it to inspect clusters, warehouses, jobs, pipelines, and other cost subjects before creating an exact rule.

Projects are a horizontal label for spend that cuts across the hierarchy. A team can own spend directly while the project explains the business context.

Shows Databricks tag usage and how tags contribute to attribution. Tag-based pattern rules are useful when teams already tag clusters, jobs, or warehouses consistently.
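To make tag-based pattern rules concrete, the sketch below checks a resource against a rule whose criteria are ANDed together, as the evaluation order describes for pattern rules. The `PatternRule` fields, the resource dictionary shape, and the treatment of a principal domain as the part after `@` in an email-style identity are all assumptions for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PatternRule:
    # Hypothetical rule shape; every criterion is optional.
    resource_type: Optional[str] = None
    name_regex: Optional[str] = None
    principal_domain: Optional[str] = None
    required_tags: dict = field(default_factory=dict)


def matches(rule, resource):
    """Return True only if every criterion set on the rule matches (logical AND)."""
    checks = []
    if rule.resource_type is not None:
        checks.append(resource.get("type") == rule.resource_type)
    if rule.name_regex is not None:
        checks.append(re.search(rule.name_regex, resource.get("name", "")) is not None)
    if rule.principal_domain is not None:
        principal = resource.get("principal", "").lower()
        checks.append(principal.endswith("@" + rule.principal_domain.lower()))
    tags = resource.get("tags", {})
    for key, value in rule.required_tags.items():
        checks.append(tags.get(key) == value)
    # Assumption: a rule with no criteria matches nothing rather than everything.
    return bool(checks) and all(checks)


rule = PatternRule(resource_type="cluster", name_regex=r"^etl-", required_tags={"team": "data-eng"})
tagged = {"type": "cluster", "name": "etl-nightly", "tags": {"team": "data-eng"}}
untagged = {"type": "cluster", "name": "etl-nightly", "tags": {}}
print(matches(rule, tagged), matches(rule, untagged))  # True False
```

The second resource matches the type and name but lacks the required tag, so the rule does not apply to it. This is why consistent tagging matters before relying on tag criteria.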

For each billing record, LakeSentry evaluates attribution in this order:

  1. Session-based allocation for eligible shared compute (SQL Serverless warehouses and all-purpose clusters) when query history or audit activity is available. Warehouse sessions are split by query duration; all-purpose sessions are split by command count. A gap of more than 2 hours starts a new session.
  2. Proportional rules for overhead categories such as networking, database, predictive optimization, and other platform usage.
  3. Exact and pattern rules by priority. Exact rules match a specific resource type and ID; pattern rules AND together name, tag, principal-domain, and resource-type criteria.
  4. Fallback waterfall: shared-resource owner, mapped user, mapped resource owner, known user without team, then workspace/unattributed.
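The session-based step for warehouses can be sketched as follows: group queries into sessions using the 2-hour gap, then split each session by query duration. This is an illustrative sketch, not the product's implementation. In particular, it assumes the gap is measured from the end of the latest activity to the start of the next query, and the query dictionary shape (`team`, `start`, `end`) is hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(hours=2)


def split_into_sessions(queries):
    """Group queries into sessions; a gap of more than SESSION_GAP starts a new one."""
    sessions = []
    last_end = None
    for q in sorted(queries, key=lambda q: q["start"]):
        if last_end is None or q["start"] - last_end > SESSION_GAP:
            sessions.append([])
        sessions[-1].append(q)
        last_end = q["end"] if last_end is None else max(last_end, q["end"])
    return sessions


def shares_by_duration(session):
    """Return each team's fraction of the session's total query time."""
    totals = {}
    for q in session:
        seconds = (q["end"] - q["start"]).total_seconds()
        totals[q["team"]] = totals.get(q["team"], 0.0) + seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {team: secs / grand_total for team, secs in totals.items()}


day = datetime(2024, 1, 1)
queries = [
    {"team": "a", "start": day.replace(hour=9, minute=0), "end": day.replace(hour=9, minute=30)},
    {"team": "b", "start": day.replace(hour=9, minute=40), "end": day.replace(hour=9, minute=50)},
    {"team": "a", "start": day.replace(hour=13, minute=0), "end": day.replace(hour=13, minute=5)},
]
sessions = split_into_sessions(queries)
print(len(sessions))                    # 2
print(shares_by_duration(sessions[0]))  # {'a': 0.75, 'b': 0.25}
```

The first two queries fall in one session and share its cost 75/25 by duration. The third starts more than 2 hours after the previous activity ended, so it opens a new session. An all-purpose cluster variant would count commands per team instead of summing durations.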

See Cost Attribution & Confidence Tiers for the full attribution model and confidence tiers.

  • Start with Mappings so users and teams exist before creating rules.
  • Prefer exact rules for high-cost named resources.
  • Prefer pattern rules for stable naming conventions and required tags.
  • Use proportional rules only for overhead where no single owner exists.
  • Review the Overview tab after every connector backfill to catch newly unattributed spend.