---
name: ucx
description: Unity Catalog Migration Toolkit (UCX)
install:
  script: src/databricks/labs/ucx/install.py
uninstall:
  script: src/databricks/labs/ucx/uninstall.py
entrypoint: src/databricks/labs/ucx/cli.py
min_python: 3.10
commands:
  - name: open-remote-config
    description: Opens remote configuration in the browser
  - name: workflows
    description: Show deployed workflows and their state
    table_template: |-
      Step\tState\tStarted
      {{range .}}{{.step}}\t{{.state}}\t{{.started}}
      {{end}}
  - name: logs
    description: Show logs from the latest job run
    flags:
      - name: workflow
        description: Name of the workflow to show logs for - assessment, table-migration, etc
  - name: installations
    description: Show installations by different users on the same workspace
    table_template: |-
      Path\tDatabase\tWarehouse
      {{range .}}{{.path}}\t{{.database}}\t{{.warehouse_id}}
      {{end}}
  - name: skip
    description: Add a skip comment on a schema, table or view.
    flags:
      - name: schema
        description: Schema name to skip.
      - name: table
        description: (Optional) Table name to skip. Exclusive with `--view`.
      - name: view
        description: (Optional) View name to skip. Exclusive with `--table`.
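  # Illustrative usage of `skip` (a sketch, not taken from the UCX documentation). It assumes
  # flags are passed as `--<name> <value>` after the `databricks labs ucx` prefix that appears
  # elsewhere in this file; `my_schema` and `my_table` are hypothetical placeholder names.
  #
  #   databricks labs ucx skip --schema my_schema
  #   databricks labs ucx skip --schema my_schema --table my_table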
  - name: unskip
    description: Remove the skip comment from a schema, table or view.
    flags:
      - name: schema
        description: Schema name to unskip.
      - name: table
        description: (Optional) Table name to unskip. Exclusive with `--view`.
      - name: view
        description: (Optional) View name to unskip. Exclusive with `--table`.
  - name: sync-workspace-info
    is_account_level: true
    description: Upload workspace config to all workspaces in the account where ucx is installed
  - name: report-account-compatibility
    is_account_level: true
    description: Aggregates the UCX output of multiple workspaces in the account.
      If --workspace-ids is not provided, it will use all workspaces present in the account.
    flags:
      - name: workspace-ids
        description: List of workspace IDs to aggregate UCX output from.
  - name: validate-table-locations
    is_account_level: true
    description: Validate whether table locations overlap within a workspace and across workspaces.
    flags:
      - name: workspace-ids
        description: |
          List of workspace IDs to include.
          If --workspace-ids is not provided, it will use all workspaces present in the account.
  - name: manual-workspace-info
    description: Run this only if you cannot get admins to run `databricks labs ucx sync-workspace-info`
  - name: create-table-mapping
    description: Create initial table mapping for review
    flags:
      - name: run-as-collection
        description: (Optional) Boolean flag indicating whether to run the command as a collection. Default is False.
  - name: ensure-assessment-run
    description: Ensure the assessment job was run on a workspace
    flags:
      - name: run-as-collection
        description: (Optional) Whether to check (and run if necessary) the assessment for the collection of workspaces
          with ucx installed. Default is False.
  - name: update-migration-progress
    description: Trigger the `migration-progress-experimental` job to refresh the inventory that tracks the workspace
      resources and their migration status.
    flags:
      - name: run-as-collection
        description: (Optional) Whether to update the migration progress for the collection of workspaces with ucx
          installed. Default is False.
  - name: validate-external-locations
    description: |
      Validates external locations and provides a Terraform script that maps external locations to external tables.
    flags:
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: repair-run
    description: Repair the run of the failed job
    flags:
      - name: step
        description: Name of the step
  - name: revert-migrated-tables
    description: Remove the notation on a migrated table to allow re-migration
    flags:
      - name: schema
        description: Schema to revert (if left blank all schemas in the workspace will be reverted)
      - name: table
        description: Table to revert (if left blank all tables in the schema will be reverted). Requires the schema parameter to be specified.
      - name: delete_managed
        description: Revert and delete managed tables
  - name: move
    description: Move tables across schema/catalog within a UC metastore
    flags:
      - name: from-catalog
        description: Source catalog name
      - name: from-schema
        description: Schema name to migrate.
      - name: from-table
        description: Table names to migrate. Enter * to migrate all tables
      - name: to-catalog
        description: Target catalog to migrate schema to
      - name: to-schema
        description: Target schema to migrate tables to
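  # Illustrative usage of `move` (a sketch, not taken from the UCX documentation). It assumes
  # flags are passed as `--<name> <value>` after the `databricks labs ucx` prefix that appears
  # elsewhere in this file; every catalog and schema name below is a hypothetical placeholder.
  # The `*` is quoted so the shell does not expand it.
  #
  #   databricks labs ucx move --from-catalog old_catalog --from-schema old_schema \
  #     --from-table "*" --to-catalog new_catalog --to-schema new_schema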
  - name: alias
    description: |
      Alias tables across schema/catalog within a UC metastore.
      Creates a view pointing to the "from" table.
      If a view is aliased, recreates the same view in the target schema/catalog.
    flags:
      - name: from-catalog
        description: Source catalog name
      - name: from-schema
        description: Source schema name
      - name: from-table
        description: Table names to alias. Enter * to alias all tables
      - name: to-catalog
        description: Target catalog to migrate schema to
      - name: to-schema
        description: Target schema to migrate tables to
  - name: principal-prefix-access
    description: For Azure, identifies all storage accounts used by tables in the workspace, and identifies each SPN
      and its permissions on each storage account. For AWS, identifies all the Instance Profiles configured in the
      workspace and their access to all the S3 buckets, along with AWS roles that are set with UC access and their
      access to S3 buckets. The output is stored in the workspace install folder.
    flags:
      - name: subscription-ids
        description: Comma-separated list of subscriptions to scan for storage accounts.
      - name: aws-profile
        description: AWS Profile to use for authentication
      - name: run-as-collection
        description: (Optional) Boolean flag indicating whether to run the command as a collection. Default is False.
  - name: create-missing-principals
    description: For AWS, this command identifies all the S3 locations that are missing a UC-compatible role and
      creates the missing roles. It accepts a number of optional parameters, i.e. KMS Key, Role Name, Policy Name,
      and whether to create a single role for all the S3 locations.
    flags:
      - name: aws-profile
        description: AWS Profile to use for authentication
      - name: kms-key
        description: (Optional) KMS Key to be specified for the UC roles.
      - name: role-name
        description: (Optional) IAM Role name to be specified for the UC roles. (default:UC_ROLE)
      - name: policy-name
        description: (Optional) IAM policy Name to be specified for the UC roles. (default:UC_POLICY)
      - name: single-role
        description: (Optional) Create a single role for all S3 locations. (default:False)
      - name: run-as-collection
        description: (Optional) Boolean flag indicating whether to run the command as a collection. Default is False.
  - name: delete-missing-principals
    description: For AWS, this command identifies all the UC roles that were created through the
      create-missing-principals command. It lists all the UC roles in AWS and lets users select the roles to delete.
      It also validates whether the selected roles are used by any storage credential and prompts to confirm
      whether the roles should still be deleted.
    flags:
      - name: aws-profile
        description: AWS Profile to use for authentication
  - name: create-uber-principal
    description: |
      For Azure, creates a service principal, gives it `STORAGE_BLOB_READER` access on all the storage accounts
      used by tables in the workspace, and stores the service principal information in the UCX cluster policy.
      For AWS, identifies all S3 buckets used by the Instance Profiles configured in the workspace.
    flags:
      - name: subscription-ids
        description: Comma-separated list of subscriptions to scan for storage accounts.
      - name: aws-profile
        description: AWS Profile to use for authentication
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: validate-groups-membership
    description: Validate groups to check if the groups at account level and workspace level have different memberships
    table_template: |-
      Workspace Group Name\tMembers Count\tAccount Group Name\tMembers Count\tDifference
      {{range .}}{{.wf_group_name}}\t{{.wf_group_members_count}}\t{{.acc_group_name}}\t{{.acc_group_members_count}}\t{{.group_members_difference}}
      {{end}}
    flags:
      - name: run-as-collection
        description: (Optional) Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: migrate-credentials
    description: Migrate credentials for storage access to UC storage credentials
    flags:
      - name: subscription-ids
        description: Comma-separated list of subscriptions to scan for storage accounts.
      - name: aws-profile
        description: AWS Profile to use for authentication
      - name: run-as-collection
        description: (Optional) Boolean flag indicating whether to run the command as a collection. Default is False.
  - name: create-account-groups
    is_account_level: true
    description: |
      Creates account level groups for all groups in workspaces provided in --workspace-ids.
      If --workspace-ids is not provided, it will use all workspaces present in the account.
    flags:
      - name: workspace-ids
        description: List of workspace IDs to create account groups from.
  - name: migrate-locations
    description: Create UC external locations based on the output of the guess_external_locations assessment task.
    flags:
      - name: subscription-ids
        description: Comma-separated list of subscriptions to scan for storage accounts.
      - name: aws-profile
        description: AWS Profile to use for authentication
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: create-catalogs-schemas
    description: |
      Create UC external catalogs and schemas based on the destinations created from the `create_table_mapping` command.
      This command should be executed before migrating tables to Unity Catalog.
    flags:
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: cluster-remap
    description: Re-map the cluster to UC
  - name: revert-cluster-remap
    description: Revert the re-mapping of the cluster to UC
  - name: migrate-local-code
    description: (Experimental) Migrate files in the current directory to be more compatible with Unity Catalog.
  - name: lint-local-code
    description: (Experimental) Lint files in the current directory to highlight incompatibilities with Unity Catalog.
    flags:
      - name: path
        description: Path to the file or directory to lint
  - name: show-all-metastores
    is_account_level: true
    description: Show all metastores available in the same region as the specified workspace
    flags:
      - name: workspace-id
        description: (Optional) Workspace ID to show metastores for
  - name: assign-metastore
    is_account_level: true
    description: Enable Unity Catalog features on a workspace by assigning a metastore to it.
    flags:
      - name: workspace-id
        description: Workspace ID to assign a metastore to
      - name: metastore-id
        description: (Optional) If there are multiple metastores in the region, specify the metastore ID to assign
      - name: default-catalog
        description: (Optional) Default catalog to assign to the workspace. If not provided, it will be hive_metastore
  - name: create-ucx-catalog
    description: Create UCX artifact catalog
  - name: migrate-tables
    description: |
      Trigger the `migrate-tables` workflow and, optionally, the `migrate-external-hiveserde-tables-in-place-experimental`
      and `migrate-external-tables-ctas` workflows.
    flags:
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: migrate-acls
    description: |
      Migrate access control lists from legacy metastore to UC metastore.
      Use the --dry-run flag to populate the infered_grants table and skip the migration.
      Use the --hms-fed flag to migrate HMS-FED ACLs. If not provided, HMS ACLs will be migrated for migrated tables.
    flags:
      - name: target-catalog
        description: (Optional) Target catalog to migrate ACLs to. Used for HMS-FED ACLs migration.
      - name: hms-fed
        description: (Optional) Migrate HMS-FED ACLs. If not provided, HMS ACLs will be migrated for migrated tables.
      - name: dry-run
        description: (Optional) Dry run the migration. If set to true, the ACL table will be populated and the ACL migration
          will be skipped. If not provided, the migration will be executed.
      - name: run-as-collection
        description: (Optional) Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: migrate-dbsql-dashboards
    description: Migrate DBSQL dashboards by replacing legacy HMS tables in DBSQL queries with the corresponding new UC tables.
    flags:
      - name: dashboard-id
        description: (Optional) DBSQL dashboard ID to migrate. If no dashboard ID is provided, all DBSQL dashboards in the workspace will be migrated.
      - name: run-as-collection
        description: (Optional) Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: revert-dbsql-dashboards
    description: Revert DBSQL dashboards that have been migrated to their original state before the migration.
    flags:
      - name: dashboard-id
        description: (Optional) DBSQL dashboard ID to revert. If no dashboard ID is provided, all migrated DBSQL dashboards in the workspace will be reverted.
  - name: join-collection
    is_account_level: true
    description: Join workspaces to a collection.
    flags:
      - name: workspace-ids
        description: IDs of the workspaces that should join a collection. Provide a comma-separated list of workspace IDs.
      - name: target-workspace-id
        description: (Optional) ID of a workspace in the target collection. If not specified, ucx will prompt to select from a list.
  - name: upload
    description: Upload a file to all workspaces in the account where ucx is installed
    flags:
      - name: file
        description: The file to upload
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: download
    description: Download a file from all workspaces in the account where ucx is installed
    flags:
      - name: file
        description: The file to download
      - name: run-as-collection
        description: Run the command for the collection of workspaces with ucx installed. Default is False.
  - name: export-assessment
    description: Export UCX results to a specified location
  - name: create-federated-catalog
    description: (EXPERIMENTAL) Create a federated catalog in the workspace
  - name: enable-hms-federation
    description: (EXPERIMENTAL) Enable HMS federation based migration flow. When this is enabled, UCX will create a federated HMS catalog which syncs from the workspace HMS.
  - name: assign-owner-group
    description: Assign owner group to the workspace. This group will be assigned as an owner to all migrated tables and views.