Databricks Asset Bundles resources

Databricks Asset Bundles allows you to specify information about the Azure Databricks resources used by the bundle in the resources mapping in the bundle configuration. See resources reference.

This page provides configuration reference for all supported resource types for bundles and provides details and an example for each supported type. For additional examples, see Bundle configuration examples.

The JSON schema for bundles that is used to validate YAML configuration is in the Databricks CLI GitHub repository.

Tip

To generate YAML for any existing resource, use the databricks bundle generate command. See databricks bundle generate.

Supported resources

The following table lists supported resource types for bundles (YAML and Python, where applicable). Some resources can be created by defining them in a bundle and deploying the bundle, and some resources can only be created by referencing an existing asset to include in the bundle.

Resource configuration defines a Databricks object that corresponds to a Databricks REST API object. The REST API object's supported create request fields, expressed as YAML, are the resource's supported keys. Links to documentation for each resource's corresponding object are in the table below.

Tip

The databricks bundle validate command returns warnings if unknown resource properties are found in bundle configuration files.

Resource	Python support	Corresponding REST API object
alert		Alert object
app		App object
catalog (Unity Catalog)		Catalog object
cluster		Cluster object
dashboard		Dashboard object
database_catalog		Database catalog object
database_instance		Database instance object
experiment		Experiment object
job	Jobs	Job object
model (legacy)		Model (legacy) object
model_serving_endpoint		Model serving endpoint object
pipeline	Pipelines	Pipeline object
postgres_branch		Postgres branch object
postgres_endpoint		Postgres compute endpoint object
postgres_project		Postgres project object
quality_monitor		Quality monitor object
registered_model (Unity Catalog)		Registered model object
schema (Unity Catalog)	Schemas	Schema object
secret_scope		Secret scope object
sql_warehouse		SQL warehouse object
synced_database_table		Synced database table object
volume (Unity Catalog)	Volumes	Volume object

alert

Type: Map

The alert resource defines a SQL alert (v2).

Added in Databricks CLI version 0.279.0

alerts:
  <alert-name>:
    <alert-field-name>: <alert-field-value>

Key	Type	Description
`custom_description`	String	Optional. Custom description for the alert. Supports mustache template. Added in Databricks CLI version 0.279.0
`custom_summary`	String	Optional. Custom summary for the alert. Supports mustache template. Added in Databricks CLI version 0.279.0
`display_name`	String	Required. The display name of the alert, for example, `Example alert`. Added in Databricks CLI version 0.279.0
`evaluation`	Map	Required. The evaluation configuration for the alert. See alert.evaluation. Added in Databricks CLI version 0.279.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.279.0
`parent_path`	String	Optional. The workspace path of the folder containing the alert. Can only be set on create, and cannot be updated. Example: `/Users/someone@example.com`. Added in Databricks CLI version 0.279.0
`permissions`	Sequence	The alert permissions. See permissions. Added in Databricks CLI version 0.279.0
`query_text`	String	Required. Text of the query to be run, for example, `SELECT 1`. Added in Databricks CLI version 0.279.0
`run_as`	Map	Optional. Specifies the identity that will be used to run the alert. This field allows you to configure alerts to run as a specific user or service principal. See run_as. For user identity: Set `user_name` to the email of an active workspace user. Users can only set this to their own email. For service principal: Set `service_principal_name` to the application ID. Requires the servicePrincipal/user role. If not specified, the alert will run as the request user. Added in Databricks CLI version 0.279.0
`schedule`	Map	Required. The schedule configuration for the alert. See alert.schedule. Added in Databricks CLI version 0.279.0
`warehouse_id`	String	Required. ID of the SQL warehouse attached to the alert, for example, `a7066a8ef796be84`. Added in Databricks CLI version 0.279.0

alert.evaluation

Type: Map

The evaluation configuration for the alert.

Key	Type	Description
`comparison_operator`	String	The operator used for comparison in the alert evaluation.
`empty_result_state`	String	The alert state if the result is empty. Avoid setting this field to `UNKNOWN` because `UNKNOWN` state is planned to be deprecated.
`notification`	Map	The user or other destination to notify when the alert is triggered. See alert.evaluation.notification.
`source`	Map	The source column from the result to use to evaluate the alert. See alert.evaluation.source.
`threshold`	Map	The threshold to use for alert evaluation. This can be a column or a value. See alert.evaluation.threshold.

alert.evaluation.notification

Type: Map

The user or other destination to notify when the alert is triggered.

Key	Type	Description
`notify_on_ok`	Boolean	Optional. Whether to notify alert subscribers when alert returns back to normal.
`retrigger_seconds`	Integer	Optional. Number of seconds an alert waits after being triggered before it is allowed to send another notification. If set to `0` or omitted, the alert will not send any further notifications after the first trigger. Setting this value to `1` allows the alert to send a notification on every evaluation where the condition is met, effectively making it always retrigger for notification purposes.
`subscriptions`	Sequence	Optional. Unordered list of notification subscriptions. See alert.evaluation.notification.subscriptions.

alert.evaluation.notification.subscriptions

Type: Sequence

An unordered list of notification subscriptions.

Each item in the list is an AlertSubscription:

Key	Type	Description
`destination_id`	String	The ID of the notification destination.
`user_email`	String	The email address of the user to notify.

alert.evaluation.source

Type: Map

Source column from result to use to evaluate the alert.

Key	Type	Description
`aggregation`	String	The aggregation method to apply to the source column. Valid values are `SUM`, `COUNT`, `COUNT_DISTINCT`, `AVG`, `MEDIAN`, `MIN`, `MAX`, `STDDEV`
`display`	String	The display name for the source column.
`name`	String	The name of the source column from the query result.

alert.evaluation.threshold

Type: Map

Threshold to use for alert evaluation, can be a column or a value.

Key	Type	Description
`column`	Map	Column reference to use as the threshold. See alert.evaluation.source.
`value`	Map	Literal value to use as the threshold. See alert.evaluation.threshold.value.

alert.evaluation.threshold.value

Type: Map

Literal value to use as the threshold. Specify one of the following value types.

Key	Type	Description
`bool_value`	Boolean	Optional. Boolean value for the threshold, for example, `true`.
`double_value`	Double	Optional. Numeric value for the threshold, for example, `1.25`.
`string_value`	String	Optional. String value for the threshold, for example, `test`.

alert.schedule

Type: Map

The schedule configuration for the alert.

Key	Type	Description
`pause_status`	String	Optional. Whether this schedule is paused or not. Valid values: `UNPAUSED`, `PAUSED`. Default: `UNPAUSED`.
`quartz_cron_schedule`	String	Required. A cron expression using quartz syntax that specifies the schedule for this pipeline. The quartz format is described in quartz scheduler format.
`timezone_id`	String	Required. A Java timezone id. The schedule will be resolved using this timezone. This will be combined with the `quartz_cron_schedule` to determine the schedule. See SET TIME ZONE for details.

Examples

The following example configuration defines an alert with a simple evaluation:

resources:
  alerts:
    my_alert:
      display_name: my_alert
      evaluation:
        comparison_operator: EQUAL
        source:
          name: '1'
        threshold:
          value:
            double_value: 2
      query_text: select 2
      schedule:
        quartz_cron_schedule: '44 19 */1 * * ?'
        timezone_id: Europe/Amsterdam
      warehouse_id: 799f096837fzzzz4

The following example configuration defines an alert with permissions that evaluates using aggregation and sends notifications:

resources:
  alerts:
    my_alert:
      permissions:
        - level: CAN_MANAGE
          user_name: someone@example.com
      custom_summary: 'My alert'
      display_name: 'My alert'
      evaluation:
        comparison_operator: 'EQUAL'
        notification:
          notify_on_ok: false
          retrigger_seconds: 1
        source:
          aggregation: 'MAX'
          display: '1'
          name: '1'
        threshold:
          value:
            double_value: 2
      query_text: 'select 2'
      schedule:
        pause_status: 'UNPAUSED'
        quartz_cron_schedule: '44 19 */1 * * ?'
        timezone_id: 'Europe/Amsterdam'
      warehouse_id: 799f096837fzzzz4

app

Type: Map

The app resource defines a Databricks app. For information about Databricks Apps, see Databricks Apps.

To add an app, specify the settings to define the app, including the required source_code_path.

Tip

You can initialize a bundle with a Streamlit Databricks app using the following command:

databricks bundle init https://github.com/databricks/bundle-examples --template-dir contrib/templates/streamlit-app

Added in Databricks CLI version 0.239.0

apps:
  <app-name>:
    <app-field-name>: <app-field-value>

Key	Type	Description
`budget_policy_id`	String	The budget policy ID for the app. Added in Databricks CLI version 0.243.0
`compute_size`	String	The compute size for the app. Valid values are `MEDIUM` or `LARGE` but depend on workspace configuration. Added in Databricks CLI version 0.273.0
`config`	Map	App configuration commands and environment variables. When specified, this configuration is written to an app.yaml file in the source code path during deployment. This allows you to define app configuration directly in the bundle YAML instead of maintaining a separate app.yaml file. See app.config. Added in Databricks CLI version 0.283.0
`description`	String	The description of the app. Added in Databricks CLI version 0.239.0
`lifecycle`	Map	The behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The name of the app. The name must contain only lowercase alphanumeric characters and hyphens. It must be unique within the workspace. Added in Databricks CLI version 0.239.0
`permissions`	Sequence	The app's permissions. See permissions. Added in Databricks CLI version 0.239.0
`resources`	Sequence	The app compute resources. See app.resources. Added in Databricks CLI version 0.239.0
`source_code_path`	String	The `./app` local path of the Databricks app source code. Added in Databricks CLI version 0.239.0
`user_api_scopes`	Sequence	The user API scopes. Added in Databricks CLI version 0.246.0

app.config

App configuration commands and environment variables. See Configure Databricks app execution with app.yaml.

Key	Type	Description
`command`	Sequence	The commands to run the app, for example `["streamlit", "run", "app.py"]`
`env`	Sequence	A list of `name` and `value` pairs that specify app environment variables.

app.resources

Type: Sequence

A list of compute resources for the app.

Each item in the list is an AppResource:

Key	Type	Description
`description`	String	The description of the app resource.
`database`	Map	The settings that identify the Lakebase database to use. See app.resources.database.
`experiment`	Map	The settings that identify the MLflow experiment to use. See app.resources.experiment.
`genie_space`	Map	The settings that identify the Genie space to use. See app.resources.genie_space.
`job`	Map	The settings that identify the job resource to use. See app.resources.job.
`name`	String	The name of the app resource.
`secret`	Map	The settings that identify the Azure Databricks secret resource to use. See app.resources.secret.
`serving_endpoint`	Map	The settings that identify the model serving endpoint resource to use. See app.resources.serving_endpoint.
`sql_warehouse`	Map	The settings that identify the SQL warehouse resource to use. See app.resources.sql_warehouse.
`uc_securable`	Map	The settings that identify the Unity Catalog volume to use. See app.resources.uc_securable.

app.resources.database

Type: Map

The settings that identify the Lakebase database to use.

Key	Type	Description
`database_name`	String	The name of the database.
`instance_name`	String	The name of the database instance.
`permission`	String	The permission level for the database. Valid values are `CAN_CONNECT_AND_CREATE`.

app.resources.experiment

Type: Map

The settings that identify the MLflow experiment to use.

Key	Type	Description
`experiment_id`	String	The ID of the MLflow experiment.
`permission`	String	The permission level for the experiment. Valid values include `CAN_READ`, `CAN_EDIT`, `CAN_MANAGE`.

app.resources.genie_space

Type: Map

The settings that identify the Genie space to use.

Key	Type	Description
`name`	String	The name of the Genie space.
`permission`	String	The permission level for the space. Valid values include `CAN_VIEW`, `CAN_EDIT`, `CAN_MANAGE`, `CAN_RUN`.
`space_id`	String	The ID of the Genie space, for example `550e8400-e29b-41d4-a716-999955440000`.

app.resources.job

Type: Map

The settings that identify the job resource to use.

Key	Type	Description
`id`	String	The ID of the job.
`permission`	String	The permission level for the job. Valid values include `CAN_VIEW`, `CAN_MANAGE_RUN`, `CAN_MANAGE`, `IS_OWNER`.

app.resources.secret

Type: Map

The settings that identify the Azure Databricks secret resource to use.

Key	Type	Description
`key`	String	The key of the secret to grant permission.
`permission`	String	The permission level for the secret. Valid values include `READ`, `WRITE`, `MANAGE`.
`scope`	String	The name of the secret scope.

app.resources.serving_endpoint

Type: Map

The settings that identify the model serving endpoint resource to use.

Key	Type	Description
`name`	String	The name of the serving endpoint.
`permission`	String	The permission level for the serving endpoint. Valid values include `CAN_QUERY`, `CAN_MANAGE`, `CAN_VIEW`.

app.resources.sql_warehouse

Type: Map

The settings that identify the SQL warehouse to use.

Key	Type	Description
`id`	String	The ID of the SQL warehouse.
`permission`	String	The permission level for the SQL warehouse. Valid values include `CAN_USE`, `CAN_MANAGE`, `IS_OWNER`.

app.resources.uc_securable

Type: Map

The settings that identify the Unity Catalog volume to use.

Key	Type	Description
`permission`	String	The permission level for the Unity Catalog securable. Valid values are `READ_VOLUME` and `WRITE_VOLUME`.
`securable_full_name`	String	The full name of the Unity Catalog securable in the format `catalog.schema.volume`.
`securable_type`	String	The type of the Unity Catalog securable. Valid values are `VOLUME`.

Examples

For a tutorial that walks through creating a bundle that defines an app, see Manage Databricks apps using Databricks Asset Bundles.

The following example defines a basic app:

resources:
  apps:
    hello_world_app:
      name: 'hello-world-app'
      source_code_path: . # This assumes the app source code is at the root of the project.
      description: 'A Databricks app'

The following example creates an app named my_app that manages a job created by the bundle. For the complete example, see the bundle-examples GitHub repository.

resources:
  jobs:
    # Define a job in the bundle
    hello_world:
      name: hello_world
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/main.py
          environment_key: default

      environments:
        - environment_key: default
          spec:
            environment_version: '2'

  # Define an app that manages the job in the bundle
  apps:
    job_manager:
      name: 'job_manager_app'
      description: 'An app which manages a job created by this bundle'

      # The location of the source code for the app
      source_code_path: ../src/app

      # The resources in the bundle which this app has access to. This binds the resource in the app with the bundle resource.
      resources:
        - name: 'app-job'
          job:
            id: ${resources.jobs.hello_world.id}
            permission: 'CAN_MANAGE_RUN'

The corresponding app.yaml defines the configuration for running the app:

command:
  - flask
  - --app
  - app
  - run
  - --debug
env:
  - name: JOB_ID
    valueFrom: 'app-job'

The following example creates an app that has access to an MLflow experiment created by the bundle:

resources:
  experiments:
    # Define an MLflow experiment in the bundle
    my_experiment:
      name: /Users/${workspace.current_user.userName}/my-app-experiment

  apps:
    my_ml_app:
      name: 'my-ml-app'
      description: 'An app with access to an MLflow experiment'
      source_code_path: ./app

      # Grant the app access to the MLflow experiment
      resources:
        - name: 'app-experiment'
          experiment:
            experiment_id: ${resources.experiments.my_experiment.id}
            permission: 'CAN_MANAGE'

Alternatively, the following example defines an app with custom configuration defined in the bundle configuration:

resources:
  apps:
    my_app:
      name: my_app
      description: my_app_description
      source_code_path: ./app
      config:
        command: ['flask', '--app', 'app', 'run']
        env:
          - name: MY_ENV_VAR
            value: test_value
          - name: ANOTHER_VAR
            value: another_value

catalogs

Type: Map

The catalog resource allows you to define catalogs (Unity Catalog) in a bundle.

Note

Using Databricks Asset Bundles to define catalogs is only supported if you are using the direct deployment engine.

Added in Databricks CLI version 0.287.0

catalogs:
  <catalog-name>:
    <catalog-field-name>: <catalog-field-value>

Key	Type	Description
`comment`	String	A user-provided free-form text description of the catalog. Added in Databricks CLI version 0.287.0
`connection_name`	String	The name of the connection to an external data source. Added in Databricks CLI version 0.287.0
`grants`	Sequence	The grants associated with the catalog. See grant. Added in Databricks CLI version 0.287.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.287.0
`name`	String	Required. The name of the catalog. Added in Databricks CLI version 0.287.0
`options`	Object	A map of key-value properties attached to the securable. Added in Databricks CLI version 0.287.0
`properties`	Object	A map of key-value properties attached to the securable. Added in Databricks CLI version 0.287.0
`provider_name`	String	The name of the delta sharing provider. A Delta Sharing catalog is a catalog that is based on a Delta share on a remote sharing server. See What is Delta Sharing?. Added in Databricks CLI version 0.287.0
`share_name`	String	The name of the share under the share provider. Added in Databricks CLI version 0.287.0
`storage_root`	String	The storage root URL for managed tables within the catalog. Added in Databricks CLI version 0.287.0

Example

resources:
  catalogs:
    my_catalog:
      name: my_catalog
      comment: 'Catalog created by Databricks Asset Bundles'
      properties:
        purpose: 'Testing'
      grants:
        - principal: someone@example.com
          privileges:
            - USE_CATALOG
            - CREATE_SCHEMA

  schemas:
    my_schema:
      name: my_schema
      catalog_name: ${resources.catalogs.my_catalog.name}
      comment: 'Schema in custom catalog'

cluster

Type: Map

The cluster resource defines a cluster.

Added in Databricks CLI version 0.229.0 or below

clusters:
  <cluster-name>:
    <cluster-field-name>: <cluster-field-value>

Key	Type	Description
`apply_policy_default_values`	Boolean	When set to true, fixed and default values from the policy will be used for fields that are omitted. When set to false, only fixed values from the policy will be applied. Added in Databricks CLI version 0.229.0 or below
`autoscale`	Map	Parameters needed in order to automatically scale clusters up and down based on load. See autoscale. Added in Databricks CLI version 0.229.0 or below
`autotermination_minutes`	Integer	Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination. Added in Databricks CLI version 0.229.0 or below
`aws_attributes`	Map	Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used. See aws_attributes. Added in Databricks CLI version 0.229.0 or below
`azure_attributes`	Map	Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used. See azure_attributes. Added in Databricks CLI version 0.229.0 or below
`cluster_log_conf`	Map	The configuration for delivering spark logs to a long-term storage destination. See cluster_log_conf. Added in Databricks CLI version 0.229.0 or below
`cluster_name`	String	Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name will be an empty string. Added in Databricks CLI version 0.229.0 or below
`custom_tags`	Map	Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to `default_tags`. Added in Databricks CLI version 0.229.0 or below
`data_security_mode`	String	The data governance model to use when accessing data from a cluster. Valid values include `NONE`, `SINGLE_USER`, `USER_ISOLATION`, `LEGACY_SINGLE_USER`, `LEGACY_TABLE_ACL`, `LEGACY_PASSTHROUGH`. Added in Databricks CLI version 0.229.0 or below
`docker_image`	Map	The custom docker image. See docker_image. Added in Databricks CLI version 0.229.0 or below
`driver_instance_pool_id`	String	The optional ID of the instance pool for the driver of the cluster belongs. The pool cluster uses the instance pool with id (instancepool_id) if the driver pool is not assigned. Added in Databricks CLI version 0.229.0 or below*
`driver_node_type_id`	String	The node type of the Spark driver. This field is optional. If unset, the driver node type is set to the value of `node_type_id`. This field, along with `node_type_id`, should not be set if `virtual_cluster_size` is set. If both `driver_node_type_id`, `node_type_id`, and `virtual_cluster_size` are specified, `driver_node_type_id` and `node_type_id` take precedence. Added in Databricks CLI version 0.229.0 or below
`enable_elastic_disk`	Boolean	Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details. Added in Databricks CLI version 0.229.0 or below
`enable_local_disk_encryption`	Boolean	Whether to enable LUKS on cluster VMs' local disks. Added in Databricks CLI version 0.229.0 or below
`gcp_attributes`	Map	Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used. See gcp_attributes. Added in Databricks CLI version 0.229.0 or below
`init_scripts`	Sequence	The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. See init_scripts. Added in Databricks CLI version 0.229.0 or below
`instance_pool_id`	String	The optional ID of the instance pool to which the cluster belongs. Added in Databricks CLI version 0.229.0 or below
`is_single_node`	Boolean	This field can only be used when `kind = CLASSIC_PREVIEW`. When set to true, Databricks will automatically set single node related `custom_tags`, `spark_conf`, and `num_workers`. Added in Databricks CLI version 0.237.0
`kind`	String	The kind of compute described by this compute specification. Added in Databricks CLI version 0.237.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`node_type_id`	String	This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the List node types API. Added in Databricks CLI version 0.229.0 or below
`num_workers`	Integer	Number of worker nodes that this cluster should have. A cluster has one Spark Driver and `num_workers` Executors for a total of `num_workers` + 1 Spark nodes. Added in Databricks CLI version 0.229.0 or below
`permissions`	Sequence	The cluster permissions. See permissions. Added in Databricks CLI version 0.229.0 or below
`policy_id`	String	The ID of the cluster policy used to create the cluster if applicable. Added in Databricks CLI version 0.229.0 or below
`remote_disk_throughput`	Integer	Remote disk throughput in bytes per second. Added in Databricks CLI version 0.257.0
`runtime_engine`	String	Determines the cluster's runtime engine, either `STANDARD` or `PHOTON`. Added in Databricks CLI version 0.229.0 or below
`single_user_name`	String	Single user name if datasecurity_mode is `SINGLE_USER`. Added in Databricks CLI version 0.229.0 or below*
`spark_conf`	Map	An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` respectively. Added in Databricks CLI version 0.229.0 or below
`spark_env_vars`	Map	An object containing a set of optional, user-specified environment variable key-value pairs. Added in Databricks CLI version 0.229.0 or below
`spark_version`	String	The Spark version of the cluster, e.g. `3.3.x-scala2.11`. A list of available Spark versions can be retrieved by using the List available Spark versions API. Added in Databricks CLI version 0.229.0 or below
`ssh_public_keys`	Sequence	SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name `ubuntu` on port `2200`. Up to 10 keys can be specified. Added in Databricks CLI version 0.229.0 or below
`total_initial_remote_disk_size`	Integer	Total initial remote disk size in bytes. Added in Databricks CLI version 0.257.0
`use_ml_runtime`	Boolean	This field can only be used when `kind = CLASSIC_PREVIEW`. `effective_spark_version` is determined by `spark_version` (Databricks Runtime release), this field `use_ml_runtime`, and whether `node_type_id` is gpu node or not. Added in Databricks CLI version 0.237.0
`workload_type`	Map	Cluster Attributes showing for clusters workload types. See workload_type. Added in Databricks CLI version 0.229.0 or below

cluster.autoscale

Type: Map

Parameters for automatically scaling clusters up and down based on load.

Key	Type	Description
`min_workers`	Integer	The minimum number of workers to which the cluster can scale down when underutilized. It is also the initial number of workers the cluster will have after creation.
`max_workers`	Integer	The maximum number of workers to which the cluster can scale up when overloaded. `max_workers` must be strictly greater than `min_workers`.

cluster.aws_attributes

Type: Map

Attributes related to clusters running on Amazon Web Services.

Key	Type	Description
`zone_id`	String	Identifier for the availability zone/datacenter in which the cluster resides. This string will be of a form like `us-west-2a`.
`availability`	String	Availability type used for all subsequent nodes past the `first_on_demand` ones. Valid values are `SPOT`, `ON_DEMAND`, `SPOT_WITH_FALLBACK`.
`spot_bid_price_percent`	Integer	The max price for AWS spot instances, as a percentage of the corresponding instance type's on-demand price.
`instance_profile_arn`	String	Nodes for this cluster will only be placed on AWS instances with this instance profile.
`first_on_demand`	Integer	The first `first_on_demand` nodes of the cluster will be placed on on-demand instances. This value should be greater than `0`, to make sure the cluster driver node is placed on an on-demand instance.
`ebs_volume_type`	String	The type of EBS volumes that will be launched with this cluster. Valid values are `GENERAL_PURPOSE_SSD` or `THROUGHPUT_OPTIMIZED_HDD`.
`ebs_volume_count`	Integer	The number of volumes launched for each instance.
`ebs_volume_size`	Integer	The size of each EBS volume (in GiB) launched for each instance.
`ebs_volume_iops`	Integer	The number of IOPS per EBS gp3 volume.
`ebs_volume_throughput`	Integer	The throughput per EBS gp3 volume, in MiB per second.

cluster.azure_attributes

Type: Map

Attributes related to clusters running on Microsoft Azure.

Key	Type	Description
`first_on_demand`	Integer	The first `first_on_demand` nodes of the cluster will be placed on on-demand instances.
`availability`	String	Availability type used for all subsequent nodes past the `first_on_demand` ones. Valid values are `SPOT_AZURE`, `ON_DEMAND_AZURE`, `SPOT_WITH_FALLBACK_AZURE`.
`spot_bid_max_price`	Number	The max price for Azure spot instances. Use `-1` to specify lowest price.
`log_analytics_info`	Map	The configuration for Azure Log Analytics agent. See log_analytics_info.

cluster.azure_attributes.log_analytics_info

Type: Map

The configuration for Azure Log Analytics agent.

Key	Type	Description
`log_analytics_workspace_id`	String	The ID of the Azure Log Analytics workspace.
`log_analytics_primary_key`	String	The primary key for the Azure Log Analytics workspace.

cluster.gcp_attributes

Type: Map

Attributes related to clusters running on Google Cloud Platform.

Key	Type	Description
`use_preemptible_executors`	Boolean	Whether to use preemptible executors. Preemptible executors are preemptible GCE instances that may be reclaimed by GCE at any time.
`google_service_account`	String	The Google service account to be used by the Databricks cluster VM instances.
`local_ssd_count`	Integer	The number of local SSDs to attach to each node in the cluster. The default value is `0`.
`zone_id`	String	Identifier for the availability zone/datacenter in which the cluster resides.
`availability`	String	Availability type used for all nodes. Valid values are `PREEMPTIBLE_GCP`, `ON_DEMAND_GCP`, `PREEMPTIBLE_WITH_FALLBACK_GCP`.
`boot_disk_size`	Integer	The size of the boot disk in GB. Values typically range from 100 to 1000.

cluster.cluster_log_conf

The configuration for delivering Spark logs to a long-term storage destination.

Key	Type	Description
`dbfs`	Map	DBFS location for cluster log delivery. See dbfs.
`s3`	Map	S3 location for cluster log delivery. See s3.
`volumes`	Map	Volumes location for cluster log delivery. See volumes.

cluster.cluster_log_conf.dbfs

Type: Map

DBFS location for cluster log delivery.

Key	Type	Description
`destination`	String	The DBFS path for cluster log delivery (for example, `dbfs:/cluster-logs`).

cluster.cluster_log_conf.s3

Type: Map

S3 location for cluster log delivery.

Key	Type	Description
`destination`	String	The S3 URI for cluster log delivery (for example, `s3://my-bucket/cluster-logs`).
`region`	String	The AWS region of the S3 bucket.
`endpoint`	String	The S3 endpoint URL (optional).
`enable_encryption`	Boolean	Whether to enable encryption for cluster logs.
`encryption_type`	String	The encryption type. Valid values include `SSE_S3`, `SSE_KMS`.
`kms_key`	String	The KMS key ARN for encryption (when using `SSE_KMS`).
`canned_acl`	String	The canned ACL to apply to cluster logs.

cluster.cluster_log_conf.volumes

Type: Map

Volumes location for cluster log delivery.

Key	Type	Description
`destination`	String	The volume path for cluster log delivery (for example, `/Volumes/catalog/schema/volume/cluster_log`).

cluster.docker_image

Type: Map

The custom Docker image configuration.

Key	Type	Description
`url`	String	URL of the Docker image.
`basic_auth`	Map	Basic authentication for Docker repository. See basic_auth.

cluster.docker_image.basic_auth

Type: Map

Basic authentication for Docker repository.

Key	Type	Description
`username`	String	The username for Docker registry authentication.
`password`	String	The password for Docker registry authentication.

cluster.init_scripts

Type: Map

The configuration for storing init scripts. At least one location type must be specified.

Key	Type	Description
`dbfs`	Map	DBFS location of init script. See dbfs.
`workspace`	Map	Workspace location of init script. See workspace.
`s3`	Map	S3 location of init script. See s3.
`abfss`	Map	ABFSS location of init script. See abfss.
`gcs`	Map	GCS location of init script. See gcs.
`volumes`	Map	UC Volumes location of init script. See volumes.

cluster.init_scripts.dbfs

Type: Map

DBFS location of init script.

Key	Type	Description
`destination`	String	The DBFS path of the init script.

cluster.init_scripts.workspace

Type: Map

Workspace location of init script.

Key	Type	Description
`destination`	String	The workspace path of the init script.

cluster.init_scripts.s3

Type: Map

S3 location of init script.

Key	Type	Description
`destination`	String	The S3 URI of the init script.
`region`	String	The AWS region of the S3 bucket.
`endpoint`	String	The S3 endpoint URL (optional).

cluster.init_scripts.abfss

Type: Map

ABFSS location of init script.

Key	Type	Description
`destination`	String	The ABFSS path of the init script.

cluster.init_scripts.gcs

Type: Map

GCS location of init script.

Key	Type	Description
`destination`	String	The GCS path of the init script.

cluster.init_scripts.volumes

Type: Map

Volumes location of init script.

Key	Type	Description
`destination`	String	The UC Volumes path of the init script.

cluster.workload_type

Type: Map

Cluster attributes showing cluster workload types.

Key	Type	Description
`clients`	Map	Defines what type of clients can use the cluster. See clients.

cluster.workload_type.clients

Type: Map

The type of clients for this compute workload.

Key	Type	Description
`jobs`	Boolean	Whether the cluster can run jobs.
`notebooks`	Boolean	Whether the cluster can run notebooks.

Examples

The following example creates a dedicated (single-user) cluster for the current user with Databricks Runtime 15.4 LTS and a cluster policy:

resources:
  clusters:
    my_cluster:
      num_workers: 0
      node_type_id: 'i3.xlarge'
      driver_node_type_id: 'i3.xlarge'
      spark_version: '15.4.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'
      autotermination_minutes: 60
      enable_elastic_disk: true
      single_user_name: ${workspace.current_user.userName}
      policy_id: '000128DB309672CA'
      enable_local_disk_encryption: false
      data_security_mode: SINGLE_USER
      runtime_engine: STANDARD

This example creates a simple cluster my_cluster and sets that as the cluster to use to run the notebook in my_job:

bundle:
  name: clusters

resources:
  clusters:
    my_cluster:
      num_workers: 2
      node_type_id: 'i3.xlarge'
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: '13.3.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'

  jobs:
    my_job:
      tasks:
        - task_key: test_task
          notebook_task:
            notebook_path: './src/my_notebook.py'
          existing_cluster_id: ${resources.clusters.my_cluster.id}

dashboard

Type: Map

The dashboard resource allows you to manage AI/BI dashboards in a bundle. For information about AI/BI dashboards, see Dashboards.

If you deployed a bundle that contains a dashboard from your local environment and then use the UI to modify that dashboard, modifications made through the UI are not applied to the dashboard JSON file in the local bundle unless you explicitly update it using bundle generate. You can use the --watch option to continuously poll and retrieve changes to the dashboard. See databricks bundle generate.

In addition, if you attempt to deploy a bundle from your local environment that contains a dashboard JSON file that is different than the one in the remote workspace, an error will occur. To force the deploy and overwrite the dashboard in the remote workspace with the local one, use the --force option. See databricks bundle deploy.

Added in Databricks CLI version 0.232.0

Note

When using Databricks Asset Bundles with dashboard Git support, prevent duplicate dashboards from being generated by adding the sync mapping to exclude the dashboards from synchronizing as files:

sync:
  exclude:
    - src/*.lvdash.json

dashboards:
  <dashboard-name>:
    <dashboard-field-name>: <dashboard-field-value>

Key	Type	Description
`dataset_catalog`	String	The default catalog value used by all datasets in the dashboard if not otherwise specified in the query. For example configuration that sets this field, see Dashboard catalog and schema parameterization. Added in Databricks CLI version 0.283.0
`dataset_schema`	String	The default schema value used by all datasets in the dashboard if not otherwise specified in the query. For example configuration that sets this field, see Dashboard catalog and schema parameterization. Added in Databricks CLI version 0.283.0
`display_name`	String	The display name of the dashboard. Added in Databricks CLI version 0.232.0
`embed_credentials`	Boolean	Whether the bundle deployment identity credentials are used to execute queries for all dashboard viewers. If it is set to `false`, a viewer's credentials are used. The default value is `false`. Added in Databricks CLI version 0.232.0
`etag`	String	The etag for the dashboard. Can be optionally provided on updates to ensure that the dashboard has not been modified since the last read. Added in Databricks CLI version 0.234.0
`file_path`	String	The local path of the dashboard asset, including the file name. Exported dashboards always have the file extension `.lvdash.json`. Added in Databricks CLI version 0.232.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle.
`parent_path`	String	The workspace path of the folder containing the dashboard. Includes leading slash and no trailing slash. Added in Databricks CLI version 0.232.0
`path`	String	The workspace path of the dashboard asset, including the asset name. Added in Databricks CLI version 0.234.0
`permissions`	Sequence	The dashboard permissions. See permissions. Added in Databricks CLI version 0.232.0
`serialized_dashboard`	Any	The contents of the dashboard in serialized string form. Added in Databricks CLI version 0.232.0
`warehouse_id`	String	The warehouse ID used to run the dashboard. Added in Databricks CLI version 0.232.0

Example

The following example includes and deploys the sample NYC Taxi Trip Analysis dashboard to the Databricks workspace.

resources:
  dashboards:
    nyc_taxi_trip_analysis:
      display_name: 'NYC Taxi Trip Analysis'
      file_path: ../src/nyc_taxi_trip_analysis.lvdash.json
      warehouse_id: ${var.warehouse_id}

database_catalog

Type: Map

The database catalog resource allows you to define database catalogs that correspond to database instances in a bundle. A database catalog is a Lakebase database that is registered as a Unity Catalog catalog.

For information about database catalogs, see Create a catalog.

Added in Databricks CLI version 0.265.0

database_catalogs:
  <database_catalog-name>:
    <database_catalog-field-name>: <database_catalog-field-value>

Key	Type	Description
`create_database_if_not_exists`	Boolean	Whether to create the database if it does not exist. Added in Databricks CLI version 0.265.0
`database_instance_name`	String	The name of the instance housing the database. Added in Databricks CLI version 0.265.0
`database_name`	String	The name of the database (in a instance) associated with the catalog. Added in Databricks CLI version 0.265.0
`lifecycle`	Map	Contains the lifecycle settings for a resource, including the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.265.0
`name`	String	The name of the catalog in Unity Catalog. Added in Databricks CLI version 0.265.0

Example

The following example defines a database instance with a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: ${resources.database_instances.my_instance.name}
      name: example_catalog
      database_name: my_database
      create_database_if_not_exists: true

database_instance

Type: Map

The database instance resource allows you to define database instances in a bundle. A Lakebase database instance manages storage and compute resources and provides the endpoints that users connect to.

Important

When you deploy a bundle with a database instance, the instance immediately starts running and is subject to pricing. See Lakebase pricing.

For information about database instances, see What is a database instance?.

Added in Databricks CLI version 0.265.0

database_instances:
  <database_instance-name>:
    <database_instance-field-name>: <database_instance-field-value>

Key	Type	Description
`capacity`	String	The sku of the instance. Valid values are `CU_1`, `CU_2`, `CU_4`, `CU_8`. Added in Databricks CLI version 0.265.0
`custom_tags`	Sequence	A list of key-value pairs that specify custom tags associated with the instance. Added in Databricks CLI version 0.273.0
`enable_pg_native_login`	Boolean	Whether the instance has PG native password login enabled. Defaults to `true`. Added in Databricks CLI version 0.267.0
`enable_readable_secondaries`	Boolean	Whether to enable secondaries to serve read-only traffic. Defaults to `false`. Added in Databricks CLI version 0.265.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The name of the instance. This is the unique identifier for the instance. Added in Databricks CLI version 0.265.0
`node_count`	Integer	The number of nodes in the instance, composed of 1 primary and 0 or more secondaries. Defaults to 1 primary and 0 secondaries. Added in Databricks CLI version 0.265.0
`parent_instance_ref`	Map	The reference of the parent instance. This is only available if the instance is child instance. See parent instance. Added in Databricks CLI version 0.265.0
`permissions`	Sequence	The database instance's permissions. See permissions. Added in Databricks CLI version 0.265.0
`retention_window_in_days`	Integer	The retention window for the instance. This is the time window in days for which the historical data is retained. The default value is 7 days. Valid values are 2 to 35 days. Added in Databricks CLI version 0.265.0
`stopped`	Boolean	Whether the instance is stopped. Added in Databricks CLI version 0.265.0
`usage_policy_id`	String	The desired usage policy to associate with the instance. Added in Databricks CLI version 0.273.0

database_instance.parent_instance_ref

Type: Map

The reference of the parent instance. This is only available if the instance is child instance.

Key	Type	Description
`branch_time`	String	Branch time of the ref database instance. For a parent ref instance, this is the point in time on the parent instance from which the instance was created. For a child ref instance, this is the point in time on the instance from which the child instance was created.
`lsn`	String	User-specified WAL LSN of the ref database instance.
`name`	String	Name of the ref database instance.

Example

The following example defines a database instance with a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: ${resources.database_instances.my_instance.name}
      name: example_catalog
      database_name: my_database
      create_database_if_not_exists: true

For an example bundle that demonstrates how to define a database instance and corresponding database catalog, see the bundle-examples GitHub repository.

experiment

Type: Map

The experiment resource allows you to define MLflow experiments in a bundle. For information about MLflow experiments, see Organize training runs with MLflow experiments.

Added in Databricks CLI version 0.229.0 or below

experiments:
  <experiment-name>:
    <experiment-field-name>: <experiment-field-value>

Key	Type	Description
`artifact_location`	String	The location where artifacts for the experiment are stored. Added in Databricks CLI version 0.229.0 or below
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The friendly name that identifies the experiment. An experiment name must be an absolute path in the Databricks workspace, for example `/Workspace/Users/someone@example.com/my_experiment`. Added in Databricks CLI version 0.229.0 or below
`permissions`	Sequence	The experiment's permissions. See permissions. Added in Databricks CLI version 0.229.0 or below
`tags`	Sequence	Additional metadata key-value pairs. See tags. Added in Databricks CLI version 0.229.0 or below

Example

The following example defines an experiment that all users can view:

resources:
  experiments:
    experiment:
      name: /Workspace/Users/someone@example.com/my_experiment
      permissions:
        - level: CAN_READ
          group_name: users
      description: MLflow experiment used to track runs

job

Type: Map

Jobs are supported in Python for Databricks Asset Bundles. See databricks.bundles.jobs.

The job resource allows you to define jobs and their corresponding tasks in your bundle.

For information about jobs, see Lakeflow Jobs. For a tutorial that uses a Databricks Asset Bundles template to create a job, see Develop a job with Databricks Asset Bundles.

Added in Databricks CLI version 0.229.0 or below

jobs:
  <job-name>:
    <job-field-name>: <job-field-value>

Key	Type	Description
`budget_policy_id`	String	The id of the user-specified budget policy to use for this job. If not specified, a default budget policy may be applied when creating or modifying the job. See `effective_budget_policy_id` for the budget policy used by this workload. Added in Databricks CLI version 0.231.0
`continuous`	Map	An optional continuous property for this job. The continuous property will ensure that there is always one run executing. Only one of `schedule` and `continuous` can be used. See continuous. Added in Databricks CLI version 0.229.0 or below
`deployment`	Map	Deployment information for jobs managed by external sources. See deployment. Added in Databricks CLI version 0.229.0 or below
`description`	String	An optional description for the job. The maximum length is 27700 characters in UTF-8 encoding. Added in Databricks CLI version 0.229.0 or below
`email_notifications`	Map	An optional set of email addresses that is notified when runs of this job begin or complete as well as when this job is deleted. See email_notifications. Added in Databricks CLI version 0.229.0 or below
`environments`	Sequence	A list of task execution environment specifications that can be referenced by serverless tasks of this job. An environment is required to be present for serverless tasks. For serverless notebook tasks, the environment is accessible in the notebook environment panel. For other serverless tasks, the task environment is required to be specified using environment_key in the task settings. See environments. Added in Databricks CLI version 0.229.0 or below
`format`	String	Deprecated. The format of the job.
`git_source`	Map	An optional specification for a remote Git repository containing the source code used by tasks. See job.git_source. Added in Databricks CLI version 0.229.0 or below Important: The `git_source` field and task `source` field set to `GIT` are not recommended for bundles, because local relative paths may not point to the same content in the Git repository, and bundles expect that a deployed job has the same content as the local copy from where it was deployed. Instead, clone the repository locally and set up your bundle project within this repository, so that the source for tasks are the workspace.
`health`	Map	An optional set of health rules that can be defined for this job. See health. Added in Databricks CLI version 0.229.0 or below
`job_clusters`	Sequence	A list of job cluster specifications that can be shared and reused by tasks of this job. See job_clusters. Added in Databricks CLI version 0.229.0 or below
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`max_concurrent_runs`	Integer	An optional maximum allowed number of concurrent runs of the job. Set this value if you want to be able to execute multiple runs of the same job concurrently.
`name`	String	An optional name for the job. The maximum length is 4096 bytes in UTF-8 encoding. Added in Databricks CLI version 0.229.0 or below
`notification_settings`	Map	Optional notification settings that are used when sending notifications to each of the `email_notifications` and `webhook_notifications` for this job. See notification_settings. Added in Databricks CLI version 0.229.0 or below
`parameters`	Sequence	Job-level parameter definitions. Added in Databricks CLI version 0.229.0 or below
`performance_target`	String	Defines how performant or cost efficient the execution of the run on serverless should be. Added in Databricks CLI version 0.241.0
`permissions`	Sequence	The job's permissions. See permissions. Added in Databricks CLI version 0.229.0 or below
`queue`	Map	The queue settings of the job. See queue. Added in Databricks CLI version 0.229.0 or below
`run_as`	Map	Write-only setting. Specifies the user or service principal that the job runs as. If not specified, the job runs as the user who created the job. Either `user_name` or `service_principal_name` should be specified. If not, an error is thrown. See run_as. Added in Databricks CLI version 0.229.0 or below
`schedule`	Map	An optional periodic schedule for this job. The default behavior is that the job only runs when triggered by clicking “Run Now” in the Jobs UI or sending an API request to `runNow`. See schedule. Added in Databricks CLI version 0.229.0 or below
`tags`	Map	A map of tags associated with the job. These are forwarded to the cluster as cluster tags for jobs clusters, and are subject to the same limitations as cluster tags. A maximum of 25 tags can be added to the job. Added in Databricks CLI version 0.229.0 or below
`tasks`	Sequence	A list of task specifications to be executed by this job. See Add tasks to jobs in Databricks Asset Bundles. Added in Databricks CLI version 0.237.0
`timeout_seconds`	Integer	An optional timeout applied to each run of this job. A value of `0` means no timeout. Added in Databricks CLI version 0.229.0 or below
`trigger`	Map	A configuration to trigger a run when certain conditions are met. See trigger. Added in Databricks CLI version 0.229.0 or below
`usage_policy_id`	String	The ID of the usage policy to use for this job. Added in Databricks CLI version 0.273.0
`webhook_notifications`	Map	A collection of system notification IDs to notify when runs of this job begin or complete. See webhook_notifications. Added in Databricks CLI version 0.229.0 or below

job.continuous

Type: Map

Configuration for continuous job execution.

Key	Type	Description
`pause_status`	String	Whether the continuous job is paused or not. Valid values: `PAUSED`, `UNPAUSED`.
`task_retry_mode`	String	Indicate how the continuous job is applying task level retries. Valid values are `NEVER` and `ON_FAILURE`. Defaults to `NEVER`.

job.deployment

Type: Map

Deployment information for jobs managed by external sources.

Key	Type	Description
`kind`	String	The kind of deployment. For example, `BUNDLE`.
`metadata_file_path`	String	The path to the metadata file for the deployment.

job.email_notifications

Type: Map

Email notification settings for job runs.

Key	Type	Description
`on_start`	Sequence	A list of email addresses to notify when a run starts.
`on_success`	Sequence	A list of email addresses to notify when a run succeeds.
`on_failure`	Sequence	A list of email addresses to notify when a run fails.
`on_duration_warning_threshold_exceeded`	Sequence	A list of email addresses to notify when a run duration exceeds the warning threshold.
`no_alert_for_skipped_runs`	Boolean	Whether to skip sending alerts for skipped runs.
`on_streaming_backlog_exceeded`	Sequence	A list of email addresses to notify when any streaming backlog thresholds are exceeded for any stream. Streaming backlog thresholds can be set in the `health` field using the following metrics: `STREAMING_BACKLOG_BYTES`, `STREAMING_BACKLOG_RECORDS`, `STREAMING_BACKLOG_SECONDS`, or `STREAMING_BACKLOG_FILES`. Alerting is based on the 10-minute average of these metrics. If the issue persists, notifications are resent every 30 minutes.

job.environments

Type: Sequence

A list of task execution environment specifications that can be referenced by serverless tasks of a job.

Each item in the list is a JobEnvironment:

Key	Type	Description
`environment_key`	String	The key of an environment. It has to be unique within a job.
`spec`	Map	The entity that represents a serverless environment. See job.environments.spec.

job.environments.spec

Type: Map

The entity that represents a serverless environment.

Key	Type	Description
`client`	String	Deprecated. The client version.
`dependencies`	Sequence	List of pip dependencies, as supported by the version of pip in this environment.
`environment_version`	String	Required. Environment version used by the environment. Each version comes with a specific Python version and a set of Python packages. The version is a string, consisting of an integer.

job.git_source

Type: Map

Git repository configuration for job source code.

Key	Type	Description
`git_branch`	String	The name of the branch to be checked out and used by this job. This field cannot be specified in conjunction with `git_tag` or `git_commit`.
`git_commit`	String	Commit to be checked out and used by this job. This field cannot be specified in conjunction with `git_branch` or `git_tag`.
`git_provider`	String	Unique identifier of the service used to host the Git repository. The value is case insensitive. Valid values are `gitHub`, `bitbucketCloud`, `gitLab`, `azureDevOpsServices`, `gitHubEnterprise`, `bitbucketServer`, `gitLabEnterpriseEdition`.
`git_snapshot`	Map	Read-only state of the remote repository at the time the job was run. This field is only included on job runs. See git_snapshot.
`git_tag`	String	Name of the tag to be checked out and used by this job. This field cannot be specified in conjunction with `git_branch` or `git_commit`.
`git_url`	String	URL of the repository to be cloned by this job.

job.git_source.git_snapshot

Type: Map

Read-only commit information snapshot.

Key	Type	Description
`used_commit`	String	Commit that was used to execute the run. If `git_branch` was specified, this points to the HEAD of the branch at the time of the run; if `git_tag` was specified, this points to the commit the tag points to.

job.health

Type: Map

Health monitoring configuration for the job.

Key	Type	Description
`rules`	Sequence	A list of job health rules. Each rule contains a `metric` and `op` (operator) and `value`. See job.health.rules.

job.health.rules

Type: Sequence

A list of job health rules.

Each item in the list is a JobHealthRule:

Key	Type	Description
`metric`	String	Specifies the health metric that is being evaluated for a particular health rule. `RUN_DURATION_SECONDS`: Expected total time for a run in seconds. `STREAMING_BACKLOG_BYTES`: An estimate of the maximum bytes of data waiting to be consumed across all streams. This metric is in Public Preview. `STREAMING_BACKLOG_RECORDS`: An estimate of the maximum offset lag across all streams. This metric is in Public Preview. `STREAMING_BACKLOG_SECONDS`: An estimate of the maximum consumer delay across all streams. This metric is in Public Preview. `STREAMING_BACKLOG_FILES`: An estimate of the maximum number of outstanding files across all streams. This metric is in Public Preview.
`op`	String	Specifies the operator used to compare the health metric value with the specified threshold.
`value`	Integer	Specifies the threshold value that the health metric should obey to satisfy the health rule.

job.job_clusters

Type: Sequence

A list of job cluster specifications that can be shared and reused by tasks of this job. Libraries cannot be declared in a shared job cluster. You must declare dependent libraries in task settings.

Each item in the list is a JobCluster:

Key	Type	Description
`job_cluster_key`	String	A unique name for the job cluster. This field is required and must be unique within the job. `JobTaskSettings` may refer to this field to determine which cluster to launch for the task execution.
`new_cluster`	Map	If new_cluster, a description of a cluster that is created for each task. See cluster.

job.notification_settings

Type: Map

Notification settings that apply to all notifications for the job.

Key	Type	Description
`no_alert_for_skipped_runs`	Boolean	Whether to skip sending alerts for skipped runs.
`no_alert_for_canceled_runs`	Boolean	Whether to skip sending alerts for canceled runs.

job.queue

Type: Map

Queue settings for the job.

Key	Type	Description
`enabled`	Boolean	Whether to enable queueing for the job.

job.schedule

Type: Map

Schedule configuration for periodic job execution.

Key	Type	Description
`quartz_cron_expression`	String	A Cron expression using Quartz syntax that specifies when the job runs. For example, `0 0 9 * * ?` runs the job every day at 9:00 AM UTC.
`timezone_id`	String	The timezone for the schedule. For example, `America/Los_Angeles` or `UTC`.
`pause_status`	String	Whether the schedule is paused or not. Valid values: `PAUSED`, `UNPAUSED`.

job.trigger

Type: Map

Trigger configuration for event-driven job execution.

Key	Type	Description
`file_arrival`	Map	Trigger based on file arrival. See file_arrival.
`table`	Map	Trigger based on a table. See table.
`table_update`	Map	Trigger based on table updates. See table_update.
`periodic`	Map	Periodic trigger. See periodic.

job.trigger.file_arrival

Type: Map

Trigger configuration based on file arrival.

Key	Type	Description
`url`	String	The file path to monitor for new files.
`min_time_between_triggers_seconds`	Integer	Minimum time in seconds between trigger events.
`wait_after_last_change_seconds`	Integer	Wait time in seconds after the last file change before triggering.

job.trigger.table

Type: Map

Trigger configuration based on a table.

Key	Type	Description
`table_names`	Sequence	A list of table names to monitor.
`condition`	String	The SQL condition that must be met to trigger the job.

job.trigger.table_update

Type: Map

Trigger configuration based on table updates.

Key	Type	Description
`table_names`	Sequence	A list of table names to monitor for updates.
`condition`	String	The SQL condition that must be met to trigger the job.
`wait_after_last_change_seconds`	Integer	Wait time in seconds after the last table update before triggering.

job.trigger.periodic

Type: Map

Periodic trigger configuration.

Key	Type	Description
`interval`	Integer	The interval value for the periodic trigger.
`unit`	String	The unit of time for the interval. Valid values: `SECONDS`, `MINUTES`, `HOURS`, `DAYS`, `WEEKS`.

job.webhook_notifications

Type: Map

Webhook notification settings for job runs.

Key	Type	Description
`on_start`	Sequence	A list of webhook notification IDs to notify when a run starts.
`on_success`	Sequence	A list of webhook notification IDs to notify when a run succeeds.
`on_failure`	Sequence	A list of webhook notification IDs to notify when a run fails.
`on_duration_warning_threshold_exceeded`	Sequence	A list of webhook notification IDs to notify when a run duration exceeds the warning threshold.
`on_streaming_backlog_exceeded`	Sequence	A list of system notification IDs to call when any streaming backlog thresholds are exceeded for any stream. Streaming backlog thresholds can be set in the `health` field using the following metrics: `STREAMING_BACKLOG_BYTES`, `STREAMING_BACKLOG_RECORDS`, `STREAMING_BACKLOG_SECONDS`, or `STREAMING_BACKLOG_FILES`. Alerting is based on the 10-minute average of these metrics. If the issue persists, notifications are resent every 30 minutes. A maximum of 3 destinations can be specified.

Examples

The following example defines a job with the resource key hello-job with one notebook task:

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py

The following example defines a job with a SQL notebook:

resources:
  jobs:
    job_with_sql_notebook:
      name: 'Job to demonstrate using a SQL notebook with a SQL warehouse'
      tasks:
        - task_key: notebook
          notebook_task:
            notebook_path: ./select.sql
            warehouse_id: 799f096837fzzzz4

For additional job configuration examples, see Job configuration.

For information about defining job tasks and overriding job settings, see:

model (legacy)

Type: Map

The model resource allows you to define legacy models in bundles. Databricks recommends you use Unity Catalog registered models instead.

Added in Databricks CLI version 0.229.0 or below

model_serving_endpoint

Type: Map

The model_serving_endpoint resource allows you to define model serving endpoints. See Manage model serving endpoints.

Added in Databricks CLI version 0.229.0 or below

model_serving_endpoints:
  <model_serving_endpoint-name>:
    <model_serving_endpoint-field-name>: <model_serving_endpoint-field-value>

Key	Type	Description
`ai_gateway`	Map	The AI Gateway configuration for the serving endpoint. NOTE: Only external model and provisioned throughput endpoints are currently supported. See ai_gateway. Added in Databricks CLI version 0.230.0
`budget_policy_id`	String	The ID of the budget policy to use for this endpoint. Added in Databricks CLI version 0.244.0
`config`	Map	The core config of the serving endpoint. See config. Added in Databricks CLI version 0.229.0 or below
`description`	String	A description for the serving endpoint. Added in Databricks CLI version 0.260.0
`email_notifications`	Map	Email notifications configuration for the serving endpoint. See email_notifications. Added in Databricks CLI version 0.264.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The name of the serving endpoint. This field is required and must be unique across a Databricks workspace. An endpoint name can consist of alphanumeric characters, dashes, and underscores. Added in Databricks CLI version 0.229.0 or below
`permissions`	Sequence	The model serving endpoint's permissions. See permissions. Added in Databricks CLI version 0.229.0 or below
`rate_limits`	Sequence	Deprecated. Rate limits to be applied to the serving endpoint. Use AI Gateway to manage rate limits. Added in Databricks CLI version 0.229.0 or below
`route_optimized`	Boolean	Enable route optimization for the serving endpoint. Added in Databricks CLI version 0.229.0 or below
`tags`	Sequence	Tags to be attached to the serving endpoint and automatically propagated to billing logs. Added in Databricks CLI version 0.229.0 or below

model_serving_endpoint.email_notifications

Type: Map

Email notifications configuration for the serving endpoint.

Key	Type	Description
`on_update_failure`	Sequence	A list of email addresses to be notified when an endpoint fails to update its configuration or state.
`on_update_success`	Sequence	A list of email addresses to be notified when an endpoint successfully updates its configuration or state.

model_serving_endpoint.ai_gateway

Type: Map

AI Gateway configuration for the serving endpoint.

Key	Type	Description
`fallback_config`	Map	Configuration for traffic fallback which auto fallbacks to other served entities if the request to a served entity fails with certain error codes, to increase availability. See fallback_config.
`guardrails`	Map	Guardrail configuration. See guardrails.
`inference_table_config`	Map	Configuration for inference logging to Unity Catalog tables. See inference_table_config.
`rate_limits`	Sequence	Rate limit configurations.
`usage_tracking_config`	Map	Configuration for tracking usage. See usage_tracking_config.

model_serving_endpoint.ai_gateway.fallback_config

Type: Map

Configuration for traffic fallback which auto fallbacks to other served entities if a request fails with certain error codes.

Key	Type	Description
`enabled`	Boolean	Whether fallback is enabled for this endpoint.

model_serving_endpoint.ai_gateway.guardrails

Type: Map

The AI gateway guardrails configuration.

Key	Type	Description
`input`	Map	Input guardrails configuration with fields like `safety`, `pii`.
`output`	Map	Output guardrails configuration with fields like `safety`, `pii`.
`invalid_keywords`	Sequence	A list of keywords to block.

model_serving_endpoint.ai_gateway.inference_table_config

Type: Map

Configuration for inference logging to Unity Catalog tables.

Key	Type	Description
`catalog_name`	String	The name of the catalog in Unity Catalog.
`schema_name`	String	The name of the schema in Unity Catalog.
`table_name_prefix`	String	The prefix for inference table names.
`enabled`	Boolean	Whether inference table logging is enabled.

model_serving_endpoint.ai_gateway.usage_tracking_config

Type: Map

The AI gateway configuration for tracking usage.

Key	Type	Description
`enabled`	Boolean	Whether usage tracking is enabled.

model_serving_endpoint.config

Type: Map

The core configuration of the serving endpoint.

Key	Type	Description
`served_entities`	Sequence	A list of served entities for the endpoint to serve. Each served entity contains fields like `entity_name`, `entity_version`, `workload_size`, `scale_to_zero_enabled`, `workload_type`, `environment_vars`.
`served_models`	Sequence	(Deprecated: use `served_entities` instead) A list of served models for the endpoint to serve.
`traffic_config`	Map	The traffic config defining how invocations to the serving endpoint should be routed. See traffic_config.
`auto_capture_config`	Map	Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog. See auto_capture_config.

model_serving_endpoint.config.traffic_config

Type: Map

The traffic config defining how invocations to the serving endpoint should be routed.

Key	Type	Description
`routes`	Sequence	A list of routes for traffic distribution. Each route contains `served_model_name` and `traffic_percentage`.

model_serving_endpoint.config.auto_capture_config

Type: Map

Configuration for Inference Tables which automatically logs requests and responses to Unity Catalog.

Key	Type	Description
`catalog_name`	String	The name of the catalog in Unity Catalog.
`schema_name`	String	The name of the schema in Unity Catalog.
`table_name_prefix`	String	The prefix for inference table names.
`enabled`	Boolean	Whether inference table logging is enabled.

Example

The following example defines a Unity Catalog model serving endpoint:

resources:
  model_serving_endpoints:
    uc_model_serving_endpoint:
      name: 'uc-model-endpoint'
      config:
        served_entities:
          - entity_name: 'myCatalog.mySchema.my-ads-model'
            entity_version: '10'
            workload_size: 'Small'
            scale_to_zero_enabled: 'true'
        traffic_config:
          routes:
            - served_model_name: 'my-ads-model-10'
              traffic_percentage: '100'
      tags:
        - key: 'team'
          value: 'data science'

pipeline

Type: Map

Pipelines are supported in Python for Databricks Asset Bundles. See databricks.bundles.pipelines.

The pipeline resource allows you to create pipelines. For information about pipelines, see Lakeflow Spark Declarative Pipelines. For a tutorial that uses the Databricks Asset Bundles template to create a pipeline, see Develop Lakeflow Spark Declarative Pipelines with Databricks Asset Bundles.

Added in Databricks CLI version 0.229.0 or below

pipelines:
  <pipeline-name>:
    <pipeline-field-name>: <pipeline-field-value>

Key	Type	Description
`allow_duplicate_names`	Boolean	If false, deployment will fail if name conflicts with that of another pipeline. Added in Databricks CLI version 0.261.0
`budget_policy_id`	String	Budget policy of this pipeline. Added in Databricks CLI version 0.230.0
`catalog`	String	A catalog in Unity Catalog to publish data from this pipeline to. If `target` is specified, tables in this pipeline are published to a `target` schema inside `catalog` (for example, `catalog`.`target`.`table`). If `target` is not specified, no data is published to Unity Catalog. Added in Databricks CLI version 0.229.0 or below
`channel`	String	The Lakeflow Spark Declarative Pipelines Release Channel that specifies which version of Lakeflow Spark Declarative Pipelines to use. Added in Databricks CLI version 0.229.0 or below
`clusters`	Sequence	The cluster settings for this pipeline deployment. See cluster. Added in Databricks CLI version 0.229.0 or below
`configuration`	Map	The configuration for this pipeline execution. Added in Databricks CLI version 0.229.0 or below
`continuous`	Boolean	Whether the pipeline is continuous or triggered. This replaces `trigger`. Added in Databricks CLI version 0.229.0 or below
`deployment`	Map	Deployment type of this pipeline. See deployment. Added in Databricks CLI version 0.229.0 or below
`development`	Boolean	Whether the pipeline is in development mode. Defaults to false. Added in Databricks CLI version 0.229.0 or below
`dry_run`	Boolean	Whether the pipeline is a dry run pipeline.
`edition`	String	The pipeline product edition. Added in Databricks CLI version 0.229.0 or below
`environment`	Map	The environment specification for this pipeline used to install dependencies on serverless compute. See environment. This key is only supported in Databricks CLI version 0.258 and above. Added in Databricks CLI version 0.257.0
`event_log`	Map	The event log configuration for this pipeline. See event_log. Added in Databricks CLI version 0.246.0
`filters`	Map	The filters that determine which pipeline packages to include in the deployed graph. See filters. Added in Databricks CLI version 0.229.0 or below
`gateway_definition`	Map	The configuration for a gateway pipeline. These settings cannot be used with the `ingestion_definition` settings. Added in Databricks CLI version 0.229.0 or below
`id`	String	Unique identifier for this pipeline. Added in Databricks CLI version 0.229.0 or below
`ingestion_definition`	Map	The configuration for a managed ingestion pipeline. These settings cannot be used with the `libraries`, `schema`, `target`, or `catalog` settings. See ingestion_definition. Added in Databricks CLI version 0.229.0 or below
`libraries`	Sequence	A list of libraries or code needed by this deployment. See pipeline.libraries. Added in Databricks CLI version 0.229.0 or below
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	A friendly name for this pipeline. Added in Databricks CLI version 0.229.0 or below
`notifications`	Sequence	The notification settings for this pipeline. See notifications. Added in Databricks CLI version 0.229.0 or below
`permissions`	Sequence	The pipeline's permissions. See permissions. Added in Databricks CLI version 0.229.0 or below
`photon`	Boolean	Whether Photon is enabled for this pipeline. This key is ignored if `serverless` is set to `true`. Added in Databricks CLI version 0.229.0 or below
`restart_window`	Map	Defines a restart window for this pipeline. Pipelines can be restarted within this window without falling behind.
`root_path`	String	The root path for this pipeline. This is used as the root directory when editing the pipeline in the Databricks user interface and it is added to sys.path when executing Python sources during pipeline execution. Added in Databricks CLI version 0.253.0
`run_as`	Map	The identity that the pipeline runs as. If not specified, the pipeline runs as the user who created the pipeline. Only `user_name` or `service_principal_name` can be specified. If both are specified, an error is thrown. See run_as. Added in Databricks CLI version 0.241.0
`schema`	String	The default schema (database) where tables are read from or published to. Added in Databricks CLI version 0.230.0
`serverless`	Boolean	Whether serverless compute is enabled for this pipeline. Added in Databricks CLI version 0.229.0 or below
`storage`	String	The DBFS root directory for storing checkpoints and tables. Added in Databricks CLI version 0.229.0 or below
`tags`	Map	A map of tags associated with the pipeline. These are forwarded to the cluster as cluster tags, and are therefore subject to the same limitations. A maximum of 25 tags can be added to the pipeline. Added in Databricks CLI version 0.256.0
`target`	String	Target schema (database) to add tables in this pipeline to. Exactly one of `schema` or `target` must be specified. To publish to Unity Catalog, also specify `catalog`. This legacy field is deprecated for pipeline creation in favor of the `schema` field. Added in Databricks CLI version 0.229.0 or below
`usage_policy_id`	String	The ID of the usage policy to use for this pipeline. Added in Databricks CLI version 0.273.0

pipeline.deployment

Type: Map

Deployment type configuration for the pipeline.

Key	Type	Description
`kind`	String	The kind of deployment. For example, `BUNDLE`.
`metadata_file_path`	String	The path to the metadata file for the deployment.

pipeline.environment

Type: Map

Environment specification for installing dependencies on serverless compute.

Key	Type	Description
`dependencies`	Sequence	A list of pip dependencies, as supported by the version of pip in this environment. Each dependency is a pip requirement file line.

pipeline.event_log

Type: Map

Event log configuration for the pipeline.

Key	Type	Description
`catalog`	String	The Unity Catalog catalog the event log is published under.
`name`	String	The name the event log is published to in Unity Catalog.
`schema`	String	The Unity Catalog schema the event log is published under.

pipeline.filters

Type: Map

Filters that determine which pipeline packages to include in the deployed graph.

Key	Type	Description
`include`	Sequence	A list of package names to include.
`exclude`	Sequence	A list of package names to exclude.

pipeline.ingestion_definition

Type: Map

Configuration for a managed ingestion pipeline. These settings cannot be used with the libraries, schema, target, or catalog settings.

Key	Type	Description
`connection_name`	String	The name of the connection to use for ingestion.
`ingestion_gateway_id`	String	The ID of the ingestion gateway.
`objects`	Sequence	Required. Settings specifying tables to replicate and the destination for the replicated tables. Each object can be a SchemaSpec, TableSpec, or ReportSpec.
`source_configurations`	Sequence	Top-level source configurations.
`table_configuration`	Map	Configuration for the ingestion tables. See table_configuration.

SchemaSpec

Type: Map

Schema object specification for ingesting all tables from a schema.

Key	Type	Description
`source_schema`	String	The name of the source schema to ingest.
`destination_catalog`	String	The name of the destination catalog in Unity Catalog.
`destination_schema`	String	The name of the destination schema in Unity Catalog.
`table_configuration`	Map	Configuration to apply to all tables in this schema. See pipeline.ingestion_definition.table_configuration.

TableSpec

Type: Map

Table object specification for ingesting a specific table.

Key	Type	Description
`source_schema`	String	The name of the source schema containing the table.
`source_table`	String	The name of the source table to ingest.
`destination_catalog`	String	The name of the destination catalog in Unity Catalog.
`destination_schema`	String	The name of the destination schema in Unity Catalog.
`destination_table`	String	The name of the destination table in Unity Catalog.
`table_configuration`	Map	Configuration for this specific table. See pipeline.ingestion_definition.table_configuration.

ReportSpec

Type: Map

Report object specification for ingesting analytics reports.

Key	Type	Description
`source_url`	String	The URL of the source report.
`source_report`	String	The name or identifier of the source report.
`destination_catalog`	String	The name of the destination catalog in Unity Catalog.
`destination_schema`	String	The name of the destination schema in Unity Catalog.
`destination_table`	String	The name of the destination table for the report data.
`table_configuration`	Map	Configuration for the report table. See pipeline.ingestion_definition.table_configuration.

pipeline.ingestion_definition.source_configurations

Type: Map

Configuration for source.

Key	Type	Description
`catalog`	Map	Catalog-level source configuration parameters. See catalog.

pipeline.ingestion_definition.source_configuration.catalog

Type: Map

Catalog-level source configuration parameters

Key	Type	Description
`postgres`	Map	Postgres-specific catalog-level configuration parameters. Contains one `slot_config` key that is a `Map` representing the Postgres slot configuration to use for logical replication.
`source_catalog`	String	The source catalog name.

pipeline.ingestion_definition.table_configuration

Type: Map

Configuration options for ingestion tables.

Key	Type	Description
`exclude_columns`	Sequence	A list of column names to be excluded for the ingestion. When not specified, `include_columns` fully controls what columns to be ingested. When specified, all other columns including future ones will be automatically included for ingestion. This field in mutually exclusive with `include_columns`.
`include_columns`	Sequence	A list of column names to be included for the ingestion. When not specified, all columns except ones in `exclude_columns` will be included. Future columns will be automatically included. When specified, all other future columns will be automatically excluded from ingestion. This field in mutually exclusive with `exclude_columns`.
`primary_keys`	Sequence	A list of column names to use as primary keys for the table.
`sequence_by`	Sequence	The column names specifying the logical order of events in the source data. Spark Declarative Pipelines uses this sequencing to handle change events that arrive out of order.

pipeline.libraries

Type: Sequence

Defines the list of libraries or code needed by this pipeline.

Each item in the list is a definition:

Key	Type	Description
`file`	Map	The path to a file that defines a pipeline and is stored in Databricks Repos. See pipeline.libraries.file.
`glob`	Map	The unified field to include source code. Each entry can be a notebook path, a file path, or a folder path that ends `/**`. This field cannot be used together with `notebook` or `file`. See pipeline.libraries.glob.
`notebook`	Map	The path to a notebook that defines a pipeline and is stored in the Databricks workspace. See pipeline.libraries.notebook.
`whl`	String	This field is deprecated

pipeline.libraries.file

Type: Map

The path to a file that defines a pipeline and is stored in the Databricks Repos.

Key	Type	Description
`path`	String	The absolute path of the source code.

pipeline.libraries.glob

Type: Map

The unified field to include source code. Each entry can be a notebook path, a file path, or a folder path that ends /**. This field cannot be used together with notebook or file.

Key	Type	Description
`include`	String	The source code to include for pipelines

pipeline.libraries.notebook

Type: Map

The path to a notebook that defines a pipeline and is stored in the Databricks workspace.

Key	Type	Description
`path`	String	The absolute path of the source code.

pipeline.notifications

Type: Sequence

The notification settings for this pipeline. Each item in the sequence is a notification configuration.

Key	Type	Description
`alerts`	Sequence	A list of alerts that trigger notifications. Valid values include `on-update-success`, `on-update-failure`, `on-update-fatal-failure`, `on-flow-failure`.
`email_recipients`	Sequence	A list of email addresses to notify when a configured alert is triggered.

Example

The following example defines a pipeline with the resource key hello-pipeline:

resources:
  pipelines:
    hello-pipeline:
      name: hello-pipeline
      clusters:
        - label: default
          num_workers: 1
      development: true
      continuous: false
      channel: CURRENT
      edition: CORE
      photon: false
      libraries:
        - notebook:
            path: ./pipeline.py

For additional pipeline configuration examples, see Pipeline configuration.

postgres_branch

Type:Map

The Postgres branch resource allows you to define Lakebase branches in a bundle. You must also define corresponding Postgres projects and compute endpoints.

Added in Databricks CLI version 0.287.0

postgres_branches:
  <postgres_branch-name>:
    <postgres_branch-field-name>: <postgres_branches-field-value>

Key	Type	Description
`branch_id`	String	The ID to use for the Branch. This becomes the final component of the branch's resource name. The ID is required and must be 1-63 characters long, start with a lowercase letter, and contain only lowercase letters, numbers, and hyphens. For example, `development` becomes `projects/my-app/branches/development`. Added in Databricks CLI version 0.287.0
`expire_time`	String	Absolute expiration timestamp. When set, the branch will expire at this time. Added in Databricks CLI version 0.287.0
`is_protected`	Boolean	When set to true, protects the branch from deletion and reset. Associated compute endpoints and the project cannot be deleted while the branch is protected. Added in Databricks CLI version 0.287.0
`no_expiry`	Boolean	Explicitly disable expiration. When set to true, the branch will not expire. If set to false, the request is invalid; provide either ttl or expire_time instead. Added in Databricks CLI version 0.287.0
`parent`	String	The project where this branch will be created. Format: `projects/{project_id}` Added in Databricks CLI version 0.287.0
`source_branch`	String	The name of the source branch from which this branch was created (data lineage for point-in-time recovery). If not specified, defaults to the project's default branch. Format: `projects/{project_id}/branches/{branch_id}` Added in Databricks CLI version 0.287.0
`source_branch_lsn`	String	The Log Sequence Number (LSN) on the source branch from which this branch was created. Added in Databricks CLI version 0.287.0
`source_branch_time`	String	The point in time on the source branch from which this branch was created. Added in Databricks CLI version 0.287.0
`ttl`	String	Relative time-to-live duration. When set, the branch will expire at creation_time + ttl. Added in Databricks CLI version 0.287.0

Example

See postgres_projects example.

postgres_endpoint

Type: Map

The postgres_endpoints resource allows you to define Lakebase compute endpoints in a bundle. You must also define corresponding Lakebase projects and Lakebase branches.

Added in Databricks CLI version 0.287.0

postgres_endpoints:
  <postgres_endpoint-name>:
    <postgres_endpoint-field-name>: <postgres_endpoint-field-value>

Key	Type	Description
`autoscaling_limit_max_cu`	Number	The maximum number of Compute Units. Minimum value is 0.5. Added in Databricks CLI version 0.287.0
`autoscaling_limit_min_cu`	Number	The minimum number of Compute Units. Minimum value is 0.5. Added in Databricks CLI version 0.287.0
`disabled`	Boolean	Whether to restrict connections to the compute endpoint. Enabling this option schedules a suspend compute operation. A disabled compute endpoint cannot be enabled by a connection or console action. Added in Databricks CLI version 0.287.0
`endpoint_id`	String	The ID to use for the Endpoint. This becomes the final component of the endpoint's resource name. The ID is required and must be 1-63 characters long, start with a lowercase letter, and contain only lowercase letters, numbers, and hyphens. For example, `primary` becomes `projects/my-app/branches/development/endpoints/primary`. Added in Databricks CLI version 0.287.0
`endpoint_type`	String	The endpoint type. A branch can only have one READ_WRITE endpoint. Possible values: `ENDPOINT_TYPE_READ_WRITE`, `ENDPOINT_TYPE_READ_ONLY`. Added in Databricks CLI version 0.287.0
`no_suspension`	Boolean	When set to true, explicitly disables automatic suspension (never suspend). Should be set to true when provided. Added in Databricks CLI version 0.287.0
`parent`	String	The branch where this Endpoint will be created. Format: `projects/{project_id}/branches/{branch_id}` Added in Databricks CLI version 0.287.0
`settings`	Map	A collection of settings for a compute endpoint. Added in Databricks CLI version 0.287.0
`suspend_timeout_duration`	String	Duration of inactivity after which the compute endpoint is automatically suspended. If specified should be between 60s and 604800s (1 minute to 1 week). Added in Databricks CLI version 0.287.0

Example

See postgres_projects example.

postgres_project

Type: Map

The Postgres project resource allows you to define Lakebase Autoscaling Postgres database projects in a bundle. You must also define corresponding Postgres branches and compute endpoints.

Added in Databricks CLI version 0.287.0

postgres_projects:
  <postgres_project-name>:
    <postgres_project-field-name>: <postgres_project-field-value>

Key	Type	Description
`default_endpoint_settings`	Map	A collection of settings for a compute endpoint. See postgres_project.default_endpoint_settings. Added in Databricks CLI version 0.287.0
`display_name`	String	Human-readable project name. Length should be between 1 and 256 characters. Added in Databricks CLI version 0.287.0
`history_retention_duration`	String	The number of seconds to retain the shared history for point in time recovery for all branches in this project. Value should be between 0s and 2592000s (up to 30 days). Added in Databricks CLI version 0.287.0
`pg_version`	Integer	The major Postgres version number. Supported versions are 16 and 17. Added in Databricks CLI version 0.287.0
`project_id`	String	The ID to use for the Project. This becomes the final component of the project's resource name. The ID is required and must be 1-63 characters long, start with a lowercase letter, and contain only lowercase letters, numbers, and hyphens. For example, `my-app` becomes `projects/my-app`. Added in Databricks CLI version 0.287.0

Example

resources:
  postgres_projects:
    my_db:
      project_id: test-prod-app
      display_name: 'Production Database'
      pg_version: 17

  postgres_branches:
    main:
      parent: ${resources.postgres_projects.my_db.id}
      branch_id: main
      is_protected: false
      no_expiry: true

  postgres_endpoints:
    primary:
      parent: ${resources.postgres_branches.main.id}
      endpoint_id: primary
      endpoint_type: ENDPOINT_TYPE_READ_WRITE
      autoscaling_limit_min_cu: 0.5
      autoscaling_limit_max_cu: 4

postgres_project.default_endpoint_settings

Type: Map

Key	Type	Description
`autoscaling_limit_max_cu`	Number	The maximum number of Compute Units. Minimum value is 0.5.
`autoscaling_limit_min_cu`	Number	The minimum number of Compute Units. Minimum value is 0.5.
`no_suspension`	Boolean	When set to true, explicitly disables automatic suspension (never suspend). Should be set to true when provided.
`pg_settings`	Map	A raw representation of Postgres settings.
`suspend_timeout_duration`	String	Duration of inactivity after which the compute endpoint is automatically suspended. If specified should be between 60s and 604800s (1 minute to 1 week).

quality_monitor (Unity Catalog)

Type: Map

The quality_monitor resource allows you to define a Unity Catalog table monitor. For information about monitors, see Data profiling.

Added in Databricks CLI version 0.229.0 or below

quality_monitors:
  <quality_monitor-name>:
    <quality_monitor-field-name>: <quality_monitor-field-value>

Key	Type	Description
`assets_dir`	String	The directory to store monitoring assets (e.g. dashboard, metric tables). Added in Databricks CLI version 0.229.0 or below
`baseline_table_name`	String	Name of the baseline table from which drift metrics are computed from. Columns in the monitored table should also be present in the baseline table. Added in Databricks CLI version 0.229.0 or below
`custom_metrics`	Sequence	Custom metrics to compute on the monitored table. These can be aggregate metrics, derived metrics (from already computed aggregate metrics), or drift metrics (comparing metrics across time windows). See custom_metrics. Added in Databricks CLI version 0.229.0 or below
`inference_log`	Map	Configuration for monitoring inference logs. See inference_log. Added in Databricks CLI version 0.229.0 or below
`latest_monitor_failure_msg`	String	The latest error message for a monitor failure. This is a read-only field that is populated when a monitor fails. Added in Databricks CLI version 0.264.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`notifications`	Map	The notification settings for the monitor. See notifications. Added in Databricks CLI version 0.229.0 or below
`output_schema_name`	String	Schema where output metric tables are created. Added in Databricks CLI version 0.229.0 or below
`schedule`	Map	The schedule for automatically updating and refreshing metric tables. See schedule. Added in Databricks CLI version 0.229.0 or below
`skip_builtin_dashboard`	Boolean	Whether to skip creating a default dashboard summarizing data quality metrics. Added in Databricks CLI version 0.229.0 or below
`slicing_exprs`	Sequence	List of column expressions to slice data with for targeted analysis. The data is grouped by each expression independently, resulting in a separate slice for each predicate and its complements. For high-cardinality columns, only the top 100 unique values by frequency will generate slices. Added in Databricks CLI version 0.229.0 or below
`snapshot`	Map	Configuration for monitoring snapshot tables. See snapshot. Added in Databricks CLI version 0.229.0 or below
`table_name`	String	The full name of the table. Added in Databricks CLI version 0.235.0
`time_series`	Map	Configuration for monitoring time series tables. See time_series. Added in Databricks CLI version 0.229.0 or below
`warehouse_id`	String	Optional argument to specify the warehouse for dashboard creation. If not specified, the first running warehouse will be used. Added in Databricks CLI version 0.229.0 or below

quality_monitor.custom_metrics

Type: Sequence

A list of custom metric definitions.

Each item in the list is a CustomMetric:

Key	Type	Description
`definition`	String	Jinja template for a SQL expression that specifies how to compute the metric. See create metric definition.
`input_columns`	Sequence	A list of column names in the input table the metric should be computed for. Can use `:table` to indicate that the metric needs information from multiple columns.
`name`	String	Name of the metric in the output tables.
`output_data_type`	String	The output type of the custom metric.
`type`	String	Can only be one of `CUSTOM_METRIC_TYPE_AGGREGATE`, `CUSTOM_METRIC_TYPE_DERIVED`, or `CUSTOM_METRIC_TYPE_DRIFT`. The `CUSTOM_METRIC_TYPE_AGGREGATE` and `CUSTOM_METRIC_TYPE_DERIVED` metrics are computed on a single table, whereas the `CUSTOM_METRIC_TYPE_DRIFT` compare metrics across baseline and input table, or across the two consecutive time windows. CUSTOM_METRIC_TYPE_AGGREGATE: only depend on the existing columns in your table CUSTOM_METRIC_TYPE_DERIVED: depend on previously computed aggregate metrics CUSTOM_METRIC_TYPE_DRIFT: depend on previously computed aggregate or derived metrics

quality_monitor.inference_log

Type: Map

Configuration for monitoring inference logs.

Key	Type	Description
`granularities`	Sequence	The time granularities for aggregating inference logs (for example, `["1 day"]`).
`model_id_col`	String	The name of the column containing the model ID.
`prediction_col`	String	The name of the column containing the prediction.
`timestamp_col`	String	The name of the column containing the timestamp.
`problem_type`	String	The type of ML problem. Valid values include `PROBLEM_TYPE_CLASSIFICATION`, `PROBLEM_TYPE_REGRESSION`.
`label_col`	String	The name of the column containing the label (ground truth).
`prediction_proba_col`	String	The name of the column containing the prediction probabilities.

quality_monitor.notifications

Type: Map

Notification settings for the monitor.

Key	Type	Description
`on_failure`	Map	Notification settings when the monitor fails. See on_failure.
`on_new_classification_tag_detected`	Map	Notification settings when new classification tags are detected. See on_new_classification_tag_detected.

quality_monitor.notifications.on_failure

Type: Map

Notification settings when the monitor fails.

Key	Type	Description
`email_addresses`	Sequence	A list of email addresses to notify on monitor failure.

quality_monitor.notifications.on_new_classification_tag_detected

Type: Map

Notification settings when new classification tags are detected.

Key	Type	Description
`email_addresses`	Sequence	A list of email addresses to notify when new classification tags are detected.

quality_monitor.schedule

Type: Map

Schedule for automatically updating and refreshing metric tables.

Key	Type	Description
`quartz_cron_expression`	String	A Cron expression using Quartz syntax. For example, `0 0 8 * * ?` runs every day at 8:00 AM.
`timezone_id`	String	The timezone for the schedule (for example, `UTC`, `America/Los_Angeles`).
`pause_status`	String	Whether the schedule is paused. Valid values: `PAUSED`, `UNPAUSED`.

quality_monitor.snapshot

Type: Map

Configuration for monitoring snapshot tables.

quality_monitor.time_series

Configuration for monitoring time series tables.

Key	Type	Description
`granularities`	Sequence	The time granularities for aggregating time series data (for example, `["30 minutes"]`).
`timestamp_col`	String	The name of the column containing the timestamp.

Examples

The following examples define quality monitors for InferenceLog, TimeSeries, and Snapshot profile types.

# InferenceLog profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 day]
        model_id_col: model_id
        prediction_col: prediction
        label_col: price
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC

# TimeSeries profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      time_series:
        granularities: [30 minutes]
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC

# Snapshot profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      snapshot: {}
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC

The following example configures a quality monitor and a corresponding model retraining job based on the monitoring:

# Quality monitoring workflow
resources:
  quality_monitors:
    mlops_quality_monitor:
      table_name: ${bundle.target}.mlops_demo.predictions
      output_schema_name: ${bundle.target}.mlops_demo
      assets_dir: /Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 hour]
        model_id_col: model_version
        prediction_col: prediction
        label_col: fare_amount
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: inference_timestamp
      schedule:
        quartz_cron_expression: 57 0 14 * * ? # refresh monitoring metrics every day at 7 am PT
        timezone_id: UTC
  jobs:
    retraining_job:
      name: ${bundle.target}-mlops_demo-monitoring-retraining-job
      tasks:
        - task_key: monitored_metric_violation_check
          notebook_task:
            notebook_path: ../monitoring/notebooks/MonitoredMetricViolationCheck.py
            base_parameters:
              env: ${bundle.target}
              table_name_under_monitor: ${bundle.target}.mlops_demo.predictions
              metric_to_monitor: r2_score
              metric_violation_threshold: 0.7
              num_evaluation_windows: 24
              num_violation_windows: 5 # 5 out of the past 24 windows have metrics lower than threshold

        - task_key: is_metric_violated
          depends_on:
            - task_key: monitored_metric_violation_check
          condition_task:
            op: EQUAL_TO
            left: '{{tasks.monitored_metric_violation_check.values.is_metric_violated}}'
            right: 'true'

        - task_key: trigger_retraining
          depends_on:
            - task_key: is_metric_violated
              outcome: 'true'
          run_job_task:
            job_id: ${resources.jobs.model_training_job.id}

      schedule:
        quartz_cron_expression: '0 0 15 * * ?' # daily at 8 am PDT
        timezone_id: UTC

      # To get notifications, provide a list of emails to the on_failure argument.
      #
      #  email_notifications:
      #    on_failure:
      #      - someone@example.com

registered_model (Unity Catalog)

Type: Map

The registered model resource allows you to define models in Unity Catalog. For information about Unity Catalog registered models, see Manage model lifecycle in Unity Catalog.

Added in Databricks CLI version 0.229.0 or below

registered_models:
  <registered_model-name>:
    <registered_model-field-name>: <registered_model-field-value>

Key	Type	Description
`aliases`	Sequence	List of aliases associated with the registered model. See registered_model.aliases. Added in Databricks CLI version 0.273.0
`browse_only`	Boolean	Indicates whether the principal is limited to retrieving metadata for the associated object through the BROWSE privilege when include_browse is enabled in the request. Added in Databricks CLI version 0.273.0
`catalog_name`	String	The name of the catalog where the schema and the registered model reside. Added in Databricks CLI version 0.229.0 or below
`comment`	String	The comment attached to the registered model. Added in Databricks CLI version 0.229.0 or below
`created_at`	Integer	Creation timestamp of the registered model in milliseconds since the Unix epoch. Added in Databricks CLI version 0.273.0
`created_by`	String	The identifier of the user who created the registered model. Added in Databricks CLI version 0.273.0
`full_name`	String	The three-level (fully qualified) name of the registered model. Added in Databricks CLI version 0.273.0
`grants`	Sequence	The grants associated with the registered model. See grant. Added in Databricks CLI version 0.229.0 or below
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`metastore_id`	String	The unique identifier of the metastore. Added in Databricks CLI version 0.273.0
`name`	String	The name of the registered model. Added in Databricks CLI version 0.229.0 or below
`owner`	String	The identifier of the user who owns the registered model. Added in Databricks CLI version 0.273.0
`schema_name`	String	The name of the schema where the registered model resides. Added in Databricks CLI version 0.229.0 or below
`storage_location`	String	The storage location on the cloud under which model version data files are stored. Added in Databricks CLI version 0.229.0 or below
`updated_at`	String	Last-update timestamp of the registered model in milliseconds since the Unix epoch. Added in Databricks CLI version 0.273.0
`updated_by`	String	The identifier of the user who updated the registered model last time. Added in Databricks CLI version 0.273.0

registered_model.aliases

Type: Sequence

A list of aliases associated with the registered model.

Each item in the list is an Alias:

Key	Type	Description
`alias_name`	String	Name of the alias, e.g. 'champion' or 'latest_stable'
`catalog_name`	String	The name of the catalog containing the model version
`id`	String	The unique identifier of the alias
`model_name`	String	The name of the parent registered model of the model version, relative to parent schema
`schema_name`	String	The name of the schema containing the model version, relative to parent catalog
`version_num`	Integer	Integer version number of the model version to which this alias points.

Example

The following example defines a registered model in Unity Catalog:

resources:
  registered_models:
    model:
      name: my_model
      catalog_name: ${bundle.target}
      schema_name: mlops_schema
      comment: Registered model in Unity Catalog for ${bundle.target} deployment target
      grants:
        - privileges:
            - EXECUTE
          principal: account users

schema (Unity Catalog)

Type: Map

Schemas are supported in Python for Databricks Asset Bundles. See databricks.bundles.schemas.

The schema resource type allows you to define Unity Catalog schemas for tables and other assets in your workflows and pipelines created as part of a bundle. A schema, different from other resource types, has the following limitations:

The owner of a schema resource is always the deployment user, and cannot be changed. If run_as is specified in the bundle, it will be ignored by operations on the schema.
Only fields supported by the corresponding Schemas object create API are available for the schema resource. For example, enable_predictive_optimization is not supported as it is only available on the update API.

Added in Databricks CLI version 0.229.0 or below

schemas:
  <schema-name>:
    <schema-field-name>: <schema-field-value>

Key	Type	Description
`catalog_name`	String	The name of the parent catalog. Added in Databricks CLI version 0.229.0 or below
`comment`	String	A user-provided free-form text description. Added in Databricks CLI version 0.229.0 or below
`grants`	Sequence	The grants associated with the schema. See grant. Added in Databricks CLI version 0.229.0 or below
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The name of schema, relative to the parent catalog. Added in Databricks CLI version 0.229.0 or below
`properties`	Map	A map of key-value properties attached to the schema. Added in Databricks CLI version 0.229.0 or below
`storage_root`	String	The storage root URL for managed tables within the schema. Added in Databricks CLI version 0.229.0 or below

Examples

The following example defines a pipeline with the resource key my_pipeline that creates a Unity Catalog schema with the key my_schema as the target:

resources:
  pipelines:
    my_pipeline:
      name: test-pipeline-{{.unique_id}}
      libraries:
        - notebook:
            path: ../src/nb.ipynb
        - file:
            path: ../src/range.sql
      development: true
      catalog: ${resources.schemas.my_schema.catalog_name}
      target: ${resources.schemas.my_schema.id}

  schemas:
    my_schema:
      name: test-schema-{{.unique_id}}
      catalog_name: main
      comment: This schema was created by Databricks Asset Bundles.

A top-level grants mapping is not supported by Databricks Asset Bundles, so if you want to set grants for a schema, define the grants for the schema within the schemas mapping. For more information about grants, see Show, grant, and revoke privileges.

The following example defines a Unity Catalog schema with grants:

resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - SELECT
        - principal: my_team
          privileges:
            - CAN_MANAGE
      catalog_name: main

secret_scope

Type: Map

The secret_scope resource allows you to define secret scopes in a bundle. For information about secret scopes, see Secret management.

Added in Databricks CLI version 0.252.0

secret_scopes:
  <secret_scope-name>:
    <secret_scope-field-name>: <secret_scope-field-value>

Key	Type	Description
`backend_type`	String	The backend type the scope will be created with. If not specified, this defaults to `DATABRICKS`. Added in Databricks CLI version 0.252.0
`keyvault_metadata`	Map	The metadata for the secret scope if the `backend_type` is `AZURE_KEYVAULT`. See keyvault_metadata. Added in Databricks CLI version 0.252.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	Scope name requested by the user. Scope names are unique. Added in Databricks CLI version 0.252.0
`permissions`	Sequence	The permissions to apply to the secret scope. Permissions are managed via secret scope ACLs. See permissions. Added in Databricks CLI version 0.252.0

secret_scope.keyvault_metadata

Type: Map

The metadata for Azure Key Vault-backed secret scopes.

Key	Type	Description
`resource_id`	String	The Azure resource ID of the Key Vault.
`dns_name`	String	The DNS name of the Azure Key Vault.

Examples

The following example defines a secret scope that uses a key vault backend:

resources:
  secret_scopes:
    secret_scope_azure:
      name: test-secrets-azure-backend
      backend_type: 'AZURE_KEYVAULT'
      keyvault_metadata:
        resource_id: my_azure_keyvault_id
        dns_name: my_azure_keyvault_dns_name

The following example sets a custom ACL using secret scopes and permissions:

resources:
  secret_scopes:
    my_secret_scope:
      name: my_secret_scope
      permissions:
        - user_name: admins
          level: WRITE
        - user_name: users
          level: READ

For an example bundle that demonstrates how to define a secret scope and a job with a task that reads from it in a bundle, see the bundle-examples GitHub repository.

sql_warehouse

Type: Map

The SQL warehouse resource allows you to define a SQL warehouse in a bundle. For information about SQL warehouses, see Data warehousing on Azure Databricks.

Added in Databricks CLI version 0.260.0

sql_warehouses:
  <sql-warehouse-name>:
    <sql-warehouse-field-name>: <sql-warehouse-field-value>

Key	Type	Description
`auto_stop_mins`	Integer	The amount of time in minutes that a SQL warehouse must be idle (for example, no RUNNING queries), before it is automatically stopped. Valid values are 0, which indicates no autostop, or greater than or equal to 10. The default is 120. Added in Databricks CLI version 0.260.0
`channel`	Map	The channel details. See channel. Added in Databricks CLI version 0.260.0
`cluster_size`	String	The size of the clusters allocated for this warehouse. Increasing the size of a Spark cluster allows you to run larger queries on it. If you want to increase the number of concurrent queries, tune max_num_clusters. For supported values, see cluster_size. Added in Databricks CLI version 0.260.0
`creator_name`	String	The name of the user that created the warehouse. Added in Databricks CLI version 0.260.0
`enable_photon`	Boolean	Whether the warehouse should use Photon optimized clusters. Defaults to false. Added in Databricks CLI version 0.260.0
`enable_serverless_compute`	Boolean	Whether the warehouse should use serverless compute. Added in Databricks CLI version 0.260.0
`instance_profile_arn`	String	Deprecated. Instance profile used to pass IAM role to the cluster. Added in Databricks CLI version 0.260.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`max_num_clusters`	Integer	The maximum number of clusters that the autoscaler will create to handle concurrent queries. Values must be less than or equal to 30 and greater than or equal to `min_num_clusters`. Defaults to min_clusters if unset. Added in Databricks CLI version 0.260.0
`min_num_clusters`	Integer	The minimum number of available clusters that will be maintained for this SQL warehouse. Increasing this will ensure that a larger number of clusters are always running and therefore may reduce the cold start time for new queries. This is similar to reserved vs. revocable cores in a resource manager. Values must be greater than 0 and less than or equal to min(max_num_clusters, 30). Defaults to 1. Added in Databricks CLI version 0.260.0
`name`	String	The logical name for the cluster. The name must be unique within an org and less than 100 characters. Added in Databricks CLI version 0.260.0
`permissions`	Sequence	The permissions to apply to the warehouse. See permissions. Added in Databricks CLI version 0.260.0
`spot_instance_policy`	String	Whether to use spot instances. Valid values are `POLICY_UNSPECIFIED`, `COST_OPTIMIZED`, `RELIABILITY_OPTIMIZED`. The default is `COST_OPTIMIZED`. Added in Databricks CLI version 0.260.0
`tags`	Map	A set of key-value pairs that will be tagged on all resources (e.g., AWS instances and EBS volumes) associated with this SQL warehouse. The number of tags must be less than 45. Added in Databricks CLI version 0.260.0
`warehouse_type`	String	The warehouse type, `PRO` or `CLASSIC`. If you want to use serverless compute, set this field to `PRO` and also set the field `enable_serverless_compute` to `true`. Added in Databricks CLI version 0.260.0

sql_warehouse.channel

Type: Map

The channel configuration for the SQL warehouse.

Key	Type	Description
`name`	String	The name of the channel. Valid values include `CHANNEL_NAME_CURRENT`, `CHANNEL_NAME_PREVIEW`, `CHANNEL_NAME_CUSTOM`.
`dbsql_version`	String	The DBSQL version for custom channels.

Example

The following example defines a SQL warehouse:

resources:
  sql_warehouses:
    my_sql_warehouse:
      name: my_sql_warehouse
      cluster_size: X-Large
      enable_serverless_compute: true
      max_num_clusters: 3
      min_num_clusters: 1
      auto_stop_mins: 60
      warehouse_type: PRO

synced_database_table

Type: Map

The synced database table resource allows you to define Lakebase database tables in a bundle.

For information about synced database tables, see What is a database instance?.

Added in Databricks CLI version 0.266.0

synced_database_tables:
  <synced_database_table-name>:
    <synced_database_table-field-name>: <synced_database_table-field-value>

Key	Type	Description
`database_instance_name`	String	The name of the target database instance. This is required when creating synced database tables in standard catalogs. This is optional when creating synced database tables in registered catalogs. Added in Databricks CLI version 0.266.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`logical_database_name`	String	The name of the target Postgres database object (logical database) for this table. Added in Databricks CLI version 0.266.0
`name`	String	The full name of the table, in the form `catalog.schema.table`. Added in Databricks CLI version 0.266.0
`spec`	Map	The database table specification. See synced database table specification. Added in Databricks CLI version 0.266.0

synced_database_table.spec

Type: Map

The database table specification.

Added in Databricks CLI version 0.266.0

Key	Type	Description
`create_database_objects_if_missing`	Boolean	Whether to create the synced table's logical database and schema resources if they do not already exist.
`existing_pipeline_id`	String	The ID for an existing pipeline. If this is set, the synced table will be bin packed into the existing pipeline referenced. This avoids creating a new pipeline and allows sharing existing compute. In this case, the `scheduling_policy` of this synced table must match the scheduling policy of the existing pipeline. At most one of `existing_pipeline_id` and `new_pipeline_spec` should be defined.
`new_pipeline_spec`	Map	The specification for a new pipeline. See new_pipeline_spec. At most one of `existing_pipeline_id` and `new_pipeline_spec` should be defined.
`primary_key_columns`	Sequence	The list of column names that form the primary key.
`scheduling_policy`	String	The scheduling policy for syncing. Valid values include `SNAPSHOT`, `CONTINUOUS`.
`source_table_full_name`	String	The full name of the source table in the format `catalog.schema.table`.
`timeseries_key`	String	Time series key to de-duplicate rows with the same primary key.

synced_database_table.spec.new_pipeline_spec

Type: Map

The specification for a new pipeline used by the synced database table.

Key	Type	Description
`budget_policy_id`	String	The ID of the budget policy to set on the newly created pipeline.
`storage_catalog`	String	The catalog for the pipeline to store intermediate files, such as checkpoints and event logs. This needs to be a standard catalog where the user has permissions to create Delta tables.
`storage_schema`	String	The schema for the pipeline to store intermediate files, such as checkpoints and event logs. This needs to be in the standard catalog where the user has permissions to create Delta tables.

Examples

The following example defines a synced database table within a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: my-instance
      database_name: 'my_database'
      name: my_catalog
      create_database_if_not_exists: true
  synced_database_tables:
    my_synced_table:
      name: ${resources.database_catalogs.my_catalog.name}.${resources.database_catalogs.my_catalog.database_name}.my_destination_table
      database_instance_name: ${resources.database_catalogs.my_catalog.database_instance_name}
      logical_database_name: ${resources.database_catalogs.my_catalog.database_name}
      spec:
        source_table_full_name: 'my_source_table'
        scheduling_policy: SNAPSHOT
        primary_key_columns:
          - my_pk_column
        new_pipeline_spec:
          storage_catalog: 'my_delta_catalog'
          storage_schema: 'my_delta_schema'

The following example defines a synced database table inside a standard catalog:

resources:
  synced_database_tables:
    my_synced_table:
      name: 'my_standard_catalog.public.synced_table'
      # database_instance_name is required for synced tables created in standard catalogs.
      database_instance_name: 'my-database-instance'
      # logical_database_name is required for synced tables created in standard catalogs:
      logical_database_name: ${resources.database_catalogs.my_catalog.database_name}
      spec:
        source_table_full_name: 'source_catalog.schema.table'
        scheduling_policy: SNAPSHOT
        primary_key_columns:
          - my_pk_column
        create_database_objects_if_missing: true
        new_pipeline_spec:
          storage_catalog: 'my_delta_catalog'
          storage_schema: 'my_delta_schema'

This example creates a synced database table and customizes the pipeline schedule for it. It assumes you already have:

A database instance named my-database-instance
A standard catalog named my_standard_catalog
A schema in the standard catalog named default
A source delta table named source_delta.schema.customer with the primary key c_custkey

resources:
  synced_database_tables:
    my_synced_table:
      name: 'my_standard_catalog.default.my_synced_table'
      database_instance_name: 'my-database-instance'
      logical_database_name: 'test_db'
      spec:
        source_table_full_name: 'source_delta.schema.customer'
        scheduling_policy: SNAPSHOT
        primary_key_columns:
          - c_custkey
        create_database_objects_if_missing: true
        new_pipeline_spec:
          storage_catalog: 'source_delta'
          storage_schema: 'schema'

  jobs:
    sync_pipeline_schedule_job:
      name: sync_pipeline_schedule_job
      description: 'Job to schedule synced database table pipeline.'
      tasks:
        - task_key: synced-table-pipeline
          pipeline_task:
            pipeline_id: ${resources.synced_database_tables.my_synced_table.data_synchronization_status.pipeline_id}
      schedule:
        quartz_cron_expression: '0 0 0 * * ?'

volume (Unity Catalog)

Type: Map

Volumes are supported in Python for Databricks Asset Bundles. See databricks.bundles.volumes.

The volume resource type allows you to define and create Unity Catalog volumes as part of a bundle. When deploying a bundle with a volume defined, note that:

A volume cannot be referenced in the artifact_path for the bundle until it exists in the workspace. Hence, if you want to use Databricks Asset Bundles to create the volume, you must first define the volume in the bundle, deploy it to create the volume, then reference it in the artifact_path in subsequent deployments.
Volumes in the bundle are not prepended with the dev_${workspace.current_user.short_name} prefix when the deployment target has mode: development configured. However, you can manually configure this prefix. See Custom presets.

Added in Databricks CLI version 0.236.0

volumes:
  <volume-name>:
    <volume-field-name>: <volume-field-value>

Key	Type	Description
`catalog_name`	String	The name of the catalog of the schema and volume. Added in Databricks CLI version 0.236.0
`comment`	String	The comment attached to the volume. Added in Databricks CLI version 0.236.0
`grants`	Sequence	The grants associated with the volume. See grant. Added in Databricks CLI version 0.236.0
`lifecycle`	Map	Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed. See lifecycle. Added in Databricks CLI version 0.268.0
`name`	String	The name of the volume. Added in Databricks CLI version 0.236.0
`schema_name`	String	The name of the schema where the volume is. Added in Databricks CLI version 0.236.0
`storage_location`	String	The storage location on the cloud. Added in Databricks CLI version 0.236.0
`volume_type`	String	The volume type, either `EXTERNAL` or `MANAGED`. An external volume is located in the specified external location. A managed volume is located in the default location which is specified by the parent schema, or the parent catalog, or the metastore. See Managed versus external volumes.

Example

The following example creates a Unity Catalog volume with the key my_volume_id:

resources:
  volumes:
    my_volume_id:
      catalog_name: main
      name: my_volume
      schema_name: my_schema

For an example bundle that runs a job that writes to a file in Unity Catalog volume, see the bundle-examples GitHub repository.

Common objects

grant

Type: Map

Defines the principal and privileges to grant to that principal. For more information about grants, see Show, grant, and revoke privileges.

Added in Databricks CLI version 0.229.0 or below

Key	Type	Description
`principal`	String	The name of the principal that will be granted privileges. This can be a user, group, or service principal.
`privileges`	Sequence	The privileges to grant to the specified entity. Valid values depend on the resource type (for example, `SELECT`, `MODIFY`, `CREATE`, `USAGE`, `READ_FILES`, `WRITE_FILES`, `EXECUTE`, `ALL_PRIVILEGES`).

Example

The following example defines a Unity Catalog schema with grants:

resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - SELECT
        - principal: my_team
          privileges:
            - CAN_MANAGE
      catalog_name: main

lifecycle

Type: Map

Contains the lifecycle settings for a resource. It controls the behavior of the resource when it is deployed or destroyed.

Added in Databricks CLI version 0.268.0

Key	Type	Description
`prevent_destroy`	Boolean	Lifecycle setting to prevent the resource from being destroyed. Added in Databricks CLI version 0.268.0

Feedback

Var denne side nyttig?

Last updated on 2026-02-17

Del via

Databricks Asset Bundles resources

Supported resources

alert

alert.evaluation

alert.evaluation.notification

alert.evaluation.notification.subscriptions

alert.evaluation.source

alert.evaluation.threshold

alert.evaluation.threshold.value

alert.schedule

Examples

app

app.config

app.resources

app.resources.database

app.resources.experiment

app.resources.genie_space

app.resources.job

app.resources.secret

app.resources.serving_endpoint

app.resources.sql_warehouse

app.resources.uc_securable

Examples

catalogs

Example

cluster

cluster.autoscale

cluster.aws_attributes

cluster.azure_attributes

cluster.azure_attributes.log_analytics_info

cluster.gcp_attributes

cluster.cluster_log_conf

cluster.cluster_log_conf.dbfs

cluster.cluster_log_conf.s3

cluster.cluster_log_conf.volumes

cluster.docker_image

cluster.docker_image.basic_auth

cluster.init_scripts

cluster.init_scripts.dbfs

cluster.init_scripts.workspace

cluster.init_scripts.s3

cluster.init_scripts.abfss

cluster.init_scripts.gcs

cluster.init_scripts.volumes

cluster.workload_type

cluster.workload_type.clients

Examples

dashboard

Example

database_catalog

Example

database_instance

database_instance.parent_instance_ref

Example

experiment

Example

job

job.continuous

job.deployment

job.email_notifications

job.environments

job.environments.spec

job.git_source

job.git_source.git_snapshot

job.health

job.health.rules

job.job_clusters

job.notification_settings

job.queue

job.schedule

job.trigger

job.trigger.file_arrival

job.trigger.table

job.trigger.table_update

job.trigger.periodic

job.webhook_notifications

Examples

model (legacy)

model_serving_endpoint