Databricks Account and Connector Setup
Databricks connectors are the bridge between LakeSentry and your Databricks account. Each connector authenticates with OAuth machine-to-machine (M2M) service-principal credentials or a personal access token (PAT) and provides access to billing data, compute metadata, and workload history through Databricks system tables.
This page covers the current self-service setup process from creating credentials to verifying connectivity. Direct Connection is the default self-service mode in Settings → Connector. External Connector/collector deployments are controlled deployments and are marked as coming soon in the setup UI.
Prerequisites
Before creating a Databricks connector, ensure you have:
- Databricks account admin access (to create service principals and grant permissions)
- Unity Catalog enabled on your Databricks account (required for system table access)
- At least one workspace URL per region you want LakeSentry to monitor
- A SQL warehouse the LakeSentry service principal can use
- Your Databricks account ID (found in the account console URL or settings page)
Step 1: Create a service principal
LakeSentry authenticates using OAuth machine-to-machine (M2M) via a Databricks service principal.
- Go to your Databricks account console.
- Navigate to User Management > Service Principals.
- Click Add Service Principal and give it a descriptive name (e.g., lakesentry-reader).
- Under OAuth, generate an OAuth secret. Copy both the Client ID and Secret.
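To sanity-check the new credentials outside LakeSentry, you can request a token yourself. The sketch below builds an OAuth client-credentials request against the workspace token endpoint (`/oidc/v1/token`, with scope `all-apis`), which is how Databricks documents M2M authentication. The function name `build_token_request` is illustrative, and this is not a description of how LakeSentry itself performs the exchange.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth M2M client-credentials token request."""
    token_url = workspace_url.rstrip("/") + "/oidc/v1/token"
    # Form-encoded body for the client-credentials grant.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client ID and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` should yield a JSON document containing an `access_token` field if the Client ID and Secret are valid. An HTTP 401 response usually points to a mistyped or expired secret.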
Step 2: Grant system table permissions
The service principal needs SQL warehouse access plus SELECT access to the system tables LakeSentry ingests. Run these SQL statements in a workspace with Unity Catalog enabled and replace lakesentry-reader with your service-principal name:
-- Allow LakeSentry to run SQL statements through the chosen warehouse.
-- The exact securable name depends on how your Databricks workspace names warehouses.
GRANT CAN USE ON SQL WAREHOUSE `<warehouse-name>` TO `lakesentry-reader`;

-- Grant access to billing tables (account-level)
GRANT USE CATALOG ON CATALOG system TO `lakesentry-reader`;
GRANT USE SCHEMA ON SCHEMA system.billing TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.billing.usage TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.billing.list_prices TO `lakesentry-reader`;

-- Grant access to compute tables (regional)
GRANT USE SCHEMA ON SCHEMA system.compute TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.compute.clusters TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.compute.node_timeline TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.compute.node_types TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.compute.warehouses TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.compute.warehouse_events TO `lakesentry-reader`;

-- Grant access to job/pipeline tables (regional)
GRANT USE SCHEMA ON SCHEMA system.lakeflow TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.jobs TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.job_tasks TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.job_run_timeline TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.job_task_run_timeline TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.pipelines TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.lakeflow.pipeline_update_timeline TO `lakesentry-reader`;

-- Grant access to query history (regional)
GRANT USE SCHEMA ON SCHEMA system.query TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.query.history TO `lakesentry-reader`;

-- Grant access to workspace metadata
GRANT USE SCHEMA ON SCHEMA system.access TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.workspaces_latest TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.table_lineage TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.clean_room_events TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.assistant_events TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.inbound_network TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.access.outbound_network TO `lakesentry-reader`;

-- Grant access to table metadata
GRANT USE SCHEMA ON SCHEMA system.information_schema TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.information_schema.tables TO `lakesentry-reader`;

-- Grant access to model serving and storage metadata
GRANT USE SCHEMA ON SCHEMA system.serving TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.serving.served_entities TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.serving.endpoint_usage TO `lakesentry-reader`;

GRANT USE SCHEMA ON SCHEMA system.storage TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.storage.predictive_optimization_operations_history TO `lakesentry-reader`;
Planned or environment-dependent tables
Some LakeSentry pages depend on tables that may not exist in every Databricks deployment or are not yet part of the default direct-extraction registry.
-- MLflow tracking is planned for direct extraction. Grant only if LakeSentry support
-- has enabled MLflow ingestion for your tenant.
GRANT USE SCHEMA ON SCHEMA system.mlflow TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.mlflow.experiments_latest TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.mlflow.runs_latest TO `lakesentry-reader`;
GRANT SELECT ON TABLE system.mlflow.run_metrics_history TO `lakesentry-reader`;

The Audit Log page shows LakeSentry’s own internal audit trail. It does not require Databricks system.access.audit, which is not part of the current active extraction registry.
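The Step 2 grants follow one pattern: USE CATALOG once, USE SCHEMA per schema, and SELECT per table. If you would rather generate them for your own principal name than edit each statement by hand, a minimal sketch could look like the following. The helper `build_grants` and its input shape are hypothetical, and the table list must still come from this page.

```python
def build_grants(principal: str, tables_by_schema: dict[str, list[str]]) -> list[str]:
    """Render GRANT statements for the given system schemas and tables."""
    # One catalog-level grant covers every schema under `system`.
    statements = [f"GRANT USE CATALOG ON CATALOG system TO `{principal}`;"]
    for schema, tables in tables_by_schema.items():
        statements.append(
            f"GRANT USE SCHEMA ON SCHEMA system.{schema} TO `{principal}`;"
        )
        for table in tables:
            statements.append(
                f"GRANT SELECT ON TABLE system.{schema}.{table} TO `{principal}`;"
            )
    return statements


if __name__ == "__main__":
    # Example: only the billing tables listed in Step 2.
    for line in build_grants("lakesentry-reader", {"billing": ["usage", "list_prices"]}):
        print(line)
```

The warehouse permission is deliberately left out of the helper because it does not follow the catalog, schema, and table pattern.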
Step 3: Create the connector
- In LakeSentry, go to Settings > Connector.
- Click Connect to Databricks or Add Connector.
- Choose Direct Connection.
- Fill in the required fields:
| Field | Description |
|---|---|
| Workspace URL | The URL of any Databricks workspace in your account (e.g., https://adb-1234567890123456.7.azuredatabricks.net). Cloud provider is auto-detected from the URL. |
| OAuth Client ID | The client ID from the service principal you created. |
| OAuth Secret | The secret you saved in Step 1. |
| PAT | Optional alternative if you are using token authentication instead of OAuth M2M. |
- Click Validate Credentials. LakeSentry checks that the credentials work and that it can reach the Databricks resources it needs.
- After validation succeeds, save the connector. Its status changes to active after the first successful extraction.
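The Workspace URL field description says the cloud provider is detected from the URL. LakeSentry's actual detection logic is not documented here, so the following is only a sketch of how such a check could work, assuming the standard Databricks hostname suffixes for Azure, GCP, and AWS. The function name `detect_cloud` is illustrative.

```python
from urllib.parse import urlparse


def detect_cloud(workspace_url: str) -> str:
    """Guess the cloud provider from a Databricks workspace hostname."""
    host = (urlparse(workspace_url).hostname or "").lower()
    if host.endswith(".azuredatabricks.net"):
        return "azure"
    # Check the GCP suffix before the generic AWS one.
    if host.endswith(".gcp.databricks.com"):
        return "gcp"
    if host.endswith(".cloud.databricks.com"):
        return "aws"
    # Custom or vanity domains cannot be classified from the name alone.
    return "unknown"
```

A URL without the `https://` scheme has no parsed hostname and falls through to `unknown`, so paste the full workspace URL including the scheme.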
What the test validates
The connection test checks:
- OAuth credentials are valid and not expired
- The service principal can list SQL warehouses (workspace-level API access)
- At least one SQL warehouse exists in the workspace
- The service principal can SELECT from system tables (probed automatically)
If the test fails, verify that the service principal has workspace-level access, at least one SQL warehouse exists, and the OAuth secret hasn’t expired.
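If validation fails and you want to narrow down the cause, you can reproduce the warehouse checks yourself. The sketch below calls the Databricks REST endpoint `GET /api/2.0/sql/warehouses` with a bearer token and fails when no warehouse is visible. The function names and the injectable `get_json` parameter are illustrative choices, and this is not LakeSentry's own test code.

```python
import json
import urllib.request
from typing import Callable


def http_get_json(url: str, token: str) -> dict:
    """GET a URL with a bearer token and decode the JSON response."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def list_warehouse_names(
    workspace_url: str,
    token: str,
    get_json: Callable[[str, str], dict] = http_get_json,
) -> list[str]:
    """Return the names of SQL warehouses visible to the token's principal."""
    url = workspace_url.rstrip("/") + "/api/2.0/sql/warehouses"
    data = get_json(url, token)
    names = [w.get("name", "") for w in data.get("warehouses", [])]
    if not names:
        # Either none exist or the principal lacks permission to see them.
        raise RuntimeError("no SQL warehouse is visible to this principal")
    return names
```

An HTTP 401 or 403 error from the request suggests a credential or workspace-access problem, while the `RuntimeError` suggests that the workspace has no warehouse or that the service principal has not been granted access to one.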
Step 4: Add additional regions
If you operate Databricks in more than one region, add a separate connector for each additional region. Databricks system tables are regional, so a workspace in one region cannot provide complete metadata for another region.
- On Settings → Connector, click Add Connector.
- Select the region (e.g., eastus, westeurope, us-west-2).
- Enter a workspace URL from that region (e.g., https://adb-1234567890123456.7.azuredatabricks.net).
- Click Save.
For detailed information on multi-region configuration, see Region Connectors.
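Because one connector is needed per region, it can help to reduce a workspace inventory to a single representative URL per region before you start. This is a minimal sketch that assumes you already have a list of (region, workspace URL) pairs. The helper name and the input shape are hypothetical.

```python
def pick_connector_workspaces(workspaces: list[tuple[str, str]]) -> dict[str, str]:
    """Return one workspace URL per region (the first one listed for each)."""
    chosen: dict[str, str] = {}
    for region, url in workspaces:
        # Keep the first URL seen for a region and ignore later duplicates.
        chosen.setdefault(region, url)
    return chosen
```

Each entry in the result corresponds to one connector to create in Settings → Connector.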
Step 5: Configure Data Sync
For Direct Connection, LakeSentry extracts data on the Data Sync schedule shown in Settings → Connector.
| Schedule | Use when |
|---|---|
| Daily at 8 AM UTC | Default baseline for cost reporting. |
| Every 4 hours | You want fresher dashboards without hourly extraction. |
| Every hour | You need the freshest supported direct-sync cadence. |
| Paused | You need to stop extraction temporarily. |
You can also trigger an immediate sync, cancel a running sync, or reset checkpoints from the connector detail page.
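To reason about how stale a dashboard can get under each schedule, it helps to compute when the next sync would start. The sketch below does that under explicit assumptions: the schedule keys (`daily_8_utc`, `every_4_hours`, `hourly`, `paused`) are made-up names, and the hourly and four-hourly runs are assumed to start on the hour in UTC, which this page does not state. Only the daily 8 AM UTC time comes from the table above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_sync(now: datetime, schedule: str) -> Optional[datetime]:
    """Return the next assumed sync start strictly after `now`, in UTC."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    if schedule == "paused":
        return None  # No extraction is scheduled while paused.
    if schedule == "hourly":
        return top_of_hour + timedelta(hours=1)
    if schedule == "every_4_hours":
        # Assumption: runs align to 00:00, 04:00, 08:00, ... UTC.
        return top_of_hour + timedelta(hours=4 - top_of_hour.hour % 4)
    if schedule == "daily_8_utc":
        candidate = top_of_hour.replace(hour=8)
        return candidate if candidate > now else candidate + timedelta(days=1)
    raise ValueError(f"unknown schedule: {schedule}")
```

The gap between two consecutive results is a rough upper bound on the extra data age that a schedule adds on top of the latency of the Databricks system tables themselves.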
Verifying connectivity
After the first sync completes, check Settings → Connector:
| Indicator | Healthy state |
|---|---|
| Connector status | Active or synced |
| Last sync | Shows a recent timestamp |
| Extraction runs | Recent run succeeded |
| Tables extracted | Lists successfully extracted system tables |
If the status stays pending after the first sync, see Data Freshness and Common Issues for common causes.
Security model
- LakeSentry connects via a read-only service principal for reporting and detection. Databricks write permissions are only needed for approved executable actions.
- The service principal accesses system tables only — billing, compute, job, and query metadata. It never touches your business data, notebooks, or query results.
- Direct Connection uses the credentials you provide for Databricks validation and extraction. External Connector deployments use collector tokens that are hashed server-side.
- All data transfer happens over HTTPS.
Removing a connector
To disconnect LakeSentry from your Databricks account:
- Stop extraction — Pause Direct Connection Data Sync. If you use External Connector, disable or delete the Databricks jobs running the collector in each region.
- Delete connectors — Remove each connector from Settings → Connector.
- Revoke the service principal — In the Databricks account console, delete the service principal or rotate its OAuth secret.
Next steps
- Region Connectors: Multi-region setup and management
- Data Freshness: Understanding sync latency
- Collector Deployment: External Connector deployment reference