| subcategory |
| --- |
| Security |
-> **Note** Please switch to databricks_grants with Unity Catalog to manage data access, which provides a better and faster way of managing data security. The `databricks_grants` resource doesn't require a technical cluster to perform operations. On workspaces with Unity Catalog enabled, you may run into errors such as ``Error: cannot create sql permissions: cannot read current grants: For unity catalog, please specify the catalog name explicitly. E.g. SHOW GRANT `your.address@email.com` ON CATALOG main``. This happens if your `default_catalog_name` was set to a UC catalog instead of `hive_metastore`. The workaround is to re-assign the metastore with the default catalog set to `hive_metastore`. See databricks_metastore_assignment.
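The re-assignment workaround could be sketched as follows (the workspace ID and metastore ID are placeholders you would replace with your own values):

```hcl
resource "databricks_metastore_assignment" "this" {
  workspace_id         = 1234567890                             # placeholder workspace ID
  metastore_id         = "11111111-2222-3333-4444-555555555555" # placeholder metastore ID
  default_catalog_name = "hive_metastore"                       # keep the default catalog on the Hive metastore
}
```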
This resource manages data object access control lists in Databricks workspaces for things like tables, views, databases, and more. To enable Table Access Control, log in to the workspace as an administrator, go to the **Admin Console**, pick the **Access Control** tab, click the **Enable** button in the **Table Access Control** section, and click **Confirm**. The security guarantees of table access control are only effective if cluster access control is also turned on. Please make sure that no users can create clusters in your workspace and that all `databricks_cluster` resources have approximately the following configuration:
```hcl
resource "databricks_cluster" "cluster_with_table_access_control" {
  // ...

  spark_conf = {
    "spark.databricks.acl.dfAclsEnabled" : "true",
    "spark.databricks.repl.allowedLanguages" : "python,sql",
  }
}
```
This can be combined with the creation of High-Concurrency and Single-Node clusters; in that case, the cluster should have the corresponding `custom_tags` and `spark.databricks.cluster.profile` entries in its Spark configuration, as described in the documentation for the `databricks_cluster` resource.
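For example, a Single Node cluster with table access control enabled could be sketched as follows (the cluster name, Spark version, and node type are illustrative placeholders):

```hcl
resource "databricks_cluster" "single_node_with_table_acl" {
  cluster_name            = "single-node-tacl" # placeholder name
  spark_version           = "14.3.x-scala2.12" # placeholder Spark version
  node_type_id            = "i3.xlarge"        # placeholder node type
  autotermination_minutes = 20
  num_workers             = 0

  spark_conf = {
    # table access control
    "spark.databricks.acl.dfAclsEnabled" : "true",
    "spark.databricks.repl.allowedLanguages" : "python,sql",
    # Single Node profile
    "spark.databricks.cluster.profile" : "singleNode",
    "spark.master" : "local[*]",
  }

  custom_tags = {
    "ResourceClass" = "SingleNode"
  }
}
```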
The created cluster can be referred to by providing its ID as the `cluster_id` property:
```hcl
resource "databricks_sql_permissions" "foo_table" {
  cluster_id = databricks_cluster.cluster_name.id
  # ...
}
```
It is required to define all permissions for a securable in a single resource, otherwise Terraform cannot guarantee config drift prevention.
The following resource definition will enforce access control on a table by executing the following SQL queries on a special auto-terminating cluster it would create for this operation:
```sql
SHOW GRANT ON TABLE `default`.`foo`
REVOKE ALL PRIVILEGES ON TABLE `default`.`foo` FROM ... every group and user that has access to it ...
GRANT MODIFY, SELECT ON TABLE `default`.`foo` TO `serge@example.com`
GRANT SELECT ON TABLE `default`.`foo` TO `special group`
```
```hcl
resource "databricks_sql_permissions" "foo_table" {
  table = "foo"

  privilege_assignments {
    principal  = "serge@example.com"
    privileges = ["SELECT", "MODIFY"]
  }

  privilege_assignments {
    principal  = "special group"
    privileges = ["SELECT"]
  }
}
```
The following arguments are available to specify the data object on which you need to enforce access controls. You must specify only one of these arguments (except for `table` and `view`), otherwise resource creation will fail.
- `database` - Name of the database. Has a default value of `default`.
- `table` - Name of the table. Can be combined with `database`.
- `view` - Name of the view. Can be combined with `database`.
- `catalog` - (Boolean) If this access control is for the entire catalog. Defaults to `false`.
- `any_file` - (Boolean) If this access control is for reading any file. Defaults to `false`.
- `anonymous_function` - (Boolean) If this access control is for using anonymous functions. Defaults to `false`.
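For instance, database-wide permissions could be declared as follows (the database name and group display name are illustrative):

```hcl
resource "databricks_sql_permissions" "bar_database" {
  database = "bar" # illustrative database name

  privilege_assignments {
    principal  = "data engineers" # illustrative group display name
    privileges = ["USAGE", "SELECT", "READ_METADATA"]
  }
}
```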
You must specify one or more `privilege_assignments` configuration blocks to declare `privileges` for a `principal`, which corresponds to the `display_name` of a databricks_group or databricks_user. Terraform ensures that only the principals and privileges defined in the resource are applied to the data object and removes anything else. It does not remove any transitive privileges. `DENY` statements are intentionally not supported. Every `privilege_assignments` block has the following required arguments:

- `principal` - `display_name` for a databricks_group or databricks_user, or `application_id` for a databricks_service_principal.
- `privileges` - set of available privilege names in upper case.
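Granting to a service principal could be sketched as follows (the application ID is a placeholder, and would normally be referenced from a `databricks_service_principal` resource):

```hcl
resource "databricks_sql_permissions" "foo_table_sp" {
  table = "foo"

  privilege_assignments {
    principal  = "00000000-0000-0000-0000-000000000000" # placeholder application_id of a databricks_service_principal
    privileges = ["SELECT", "READ_METADATA"]
  }
}
```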
Available privilege names are:

- `SELECT` - gives read access to an object.
- `CREATE` - gives the ability to create an object (for example, a table in a database).
- `MODIFY` - gives the ability to add, delete, and modify data to or from an object.
- `USAGE` - does not give any abilities, but is an additional requirement to perform any action on a database object.
- `READ_METADATA` - gives the ability to view an object and its metadata.
- `CREATE_NAMED_FUNCTION` - gives the ability to create a named UDF in an existing catalog or database.
- `MODIFY_CLASSPATH` - gives the ability to add files to the Spark class path.
-> Even though the value `ALL PRIVILEGES` is mentioned in the Table ACL documentation, it is not recommended to use it from Terraform, as it may result in unnecessary state updates.
The resource can be imported using a synthetic identifier. Examples of valid synthetic identifiers are:
- `table/default.foo` - table `foo` in the `default` database. Database is always mandatory.
- `view/bar.foo` - view `foo` in the `bar` database.
- `database/bar` - `bar` database.
- `catalog/` - entire catalog. The `/` suffix is mandatory.
- `any file/` - direct access to any file. The `/` suffix is mandatory.
- `anonymous function/` - anonymous function. The `/` suffix is mandatory.
```sh
$ terraform import databricks_sql_permissions.foo /<object-type>/<object-name>
```
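For the `foo_table` example above, the import invocation could look like this (assuming the `table/default.foo` identifier format shown in the list of valid synthetic identifiers):

```sh
terraform import databricks_sql_permissions.foo_table table/default.foo
```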
The following resources are often used in the same context:
- End to end workspace management guide.
- databricks_group to manage groups in Databricks Workspace or Account Console (for AWS deployments).
- databricks_grants to manage data access in Unity Catalog.
- databricks_permissions to manage access control in Databricks workspace.
- databricks_user to manage users that can be added to a databricks_group within the workspace.