Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip workspace listing for objects that dont have object_type defined #481

Merged
merged 2 commits into from
Oct 20, 2023

Conversation

HariGS-DB
Copy link
Contributor

@HariGS-DB HariGS-DB commented Oct 20, 2023

When running the assessment job, the tasks workspace listing, fails with the error "KeyError: 'object_type'"
the object_type property is not set for experiment objects and while listing it throws this error. Since scope of workspace listing is limited to Notebooks, Directories, Repos, Files, Libraries, suggesting a code change will will filter out cases where object_type is None.

@HariGS-DB HariGS-DB requested a review from a team October 20, 2023 11:11
@HariGS-DB HariGS-DB linked an issue Oct 20, 2023 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Oct 20, 2023

Codecov Report

Merging #481 (801f4fd) into main (2559812) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #481      +/-   ##
==========================================
+ Coverage   80.19%   80.25%   +0.06%     
==========================================
  Files          31       31              
  Lines        3206     3206              
  Branches      620      620              
==========================================
+ Hits         2571     2573       +2     
+ Misses        487      486       -1     
+ Partials      148      147       -1     
Files Coverage Δ
...rc/databricks/labs/ucx/workspace_access/generic.py 90.82% <100.00%> (+0.96%) ⬆️

@HariGS-DB HariGS-DB temporarily deployed to account-admin October 20, 2023 11:32 — with GitHub Actions Inactive
@@ -246,7 +246,7 @@ def _crawl(self) -> list[WorkspaceObjectInfo]:

ws_listing = WorkspaceListing(self._ws, num_threads=self._num_threads, with_directories=False)
for obj in ws_listing.walk(self._start_path):
if obj is None:
if obj is None or obj.object_type is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When object type is none?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for experiment object_type is None and we need to filter this out as there is seperate listing for experiment

@nfx
Copy link
Collaborator

nfx commented Oct 20, 2023

Make PR title changelog ready

@HariGS-DB HariGS-DB added bug Something isn't working feat/crawler labels Oct 20, 2023
@mwojtyczka
Copy link
Contributor

mwojtyczka commented Oct 20, 2023

This PR also solves another issue. Currently the tool trips on itself by listing its own files. You get object_type is None in that case as well and the assessment job fails.

@HariGS-DB HariGS-DB changed the title Bug/workspace listing fix Bug: fix issue with workspace listing process on None type object_type. Oct 20, 2023
@HariGS-DB HariGS-DB changed the title Bug: fix issue with workspace listing process on None type object_type. fix issue with workspace listing process on None type object_type. Oct 20, 2023
fannijako added a commit that referenced this pull request Oct 20, 2023
@nfx nfx changed the title fix issue with workspace listing process on None type object_type. fix issue with workspace listing process on None type object_type Oct 20, 2023
@nfx nfx changed the title fix issue with workspace listing process on None type object_type Fixed issue with workspace listing process on None type object_type Oct 20, 2023
@nfx nfx merged commit 334c9b8 into main Oct 20, 2023
6 checks passed
@HariGS-DB HariGS-DB changed the title Fixed issue with workspace listing process on None type object_type skip workspace listing for objects that dont have object_type defined Oct 20, 2023
@HariGS-DB HariGS-DB deleted the bug/workspace_listing_fix branch October 20, 2023 15:13
nfx added a commit that referenced this pull request Nov 17, 2023
**Breaking changes** (existing installations need to reinstall UCX and re-run assessment jobs)

 * Switched local group migration component to rename groups instead of creating backup groups ([#450](#450)).
 * Mitigate permissions loss in Table ACLs by folding grants belonging to the same principal, object id and object type together ([#512](#512)).

**New features**

 * Added support for the experimental Databricks CLI launcher ([#517](#517)).
 * Added support for external Hive Metastores including AWS Glue ([#400](#400)).
 * Added more views to assessment dashboard ([#474](#474)).
 * Added rate limit for creating backup group to increase stability ([#500](#500)).
 * Added deduplication for mount point list ([#569](#569)).
 * Added documentation to describe interaction with external Hive Metastores ([#473](#473)).
 * Added failure injection for job failure message propagation  ([#591](#591)).
 * Added uniqueness in the new warehouse name to avoid conflicts on installation ([#542](#542)).
 * Added a global init script to collect Hive Metastore lineage ([#513](#513)).
 * Added retry set/update permissions when possible and assess the changes in the workspace ([#519](#519)).
 * Use `~/.ucx/state.json` to store the state of both dashboards and jobs ([#561](#561)).

**Bug fixes**

 * Fixed handling for `OWN` table permissions ([#571](#571)).
 * Fixed handling of keys with and without values. ([#514](#514)).
 * Fixed integration test failures related to concurrent group delete ([#584](#584)).
 * Fixed issue with workspace listing process on None type `object_type` ([#481](#481)).
 * Fixed missing group entitlement migration bug ([#583](#583)).
 * Fixed entitlement application for account-level groups ([#529](#529)).
 * Fixed assessment throwing an error when the owner of an object is empty ([#485](#485)).
 * Fixed installer to migrate between different configuration file versions ([#596](#596)).
 * Fixed cluster policy crawler to be aware of deleted policies ([#486](#486)).
 * Improved error message for not null constraints violated ([#532](#532)).
 * Improved integration test resiliency ([#597](#597), [#594](#594), [#586](#586)).
 * Introduced Safer access to workspace objects' properties. ([#530](#530)).
 * Mitigated permissions loss in Table ACLs by running appliers with single thread ([#518](#518)).
 * Running apply permission task before assessment should display message ([#487](#487)).
 * Split integration tests from blocking the merge queue ([#496](#496)).
 * Support more than one dashboard per step ([#472](#472)).
 * Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0 ([#505](#505)).
 * Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0 ([#575](#575)).
@nfx nfx mentioned this pull request Nov 17, 2023
nfx added a commit that referenced this pull request Nov 17, 2023
**Breaking changes** (existing installations need to reinstall UCX and
re-run assessment jobs)

* Switched local group migration component to rename groups instead of
creating backup groups
([#450](#450)).
* Mitigate permissions loss in Table ACLs by folding grants belonging to
the same principal, object id and object type together
([#512](#512)).

**New features**

* Added support for the experimental Databricks CLI launcher
([#517](#517)).
* Added support for external Hive Metastores including AWS Glue
([#400](#400)).
* Added more views to assessment dashboard
([#474](#474)).
* Added rate limit for creating backup group to increase stability
([#500](#500)).
* Added deduplication for mount point list
([#569](#569)).
* Added documentation to describe interaction with external Hive
Metastores ([#473](#473)).
* Added failure injection for job failure message propagation
([#591](#591)).
* Added uniqueness in the new warehouse name to avoid conflicts on
installation ([#542](#542)).
* Added a global init script to collect Hive Metastore lineage
([#513](#513)).
* Added retry set/update permissions when possible and assess the
changes in the workspace
([#519](#519)).
* Use `~/.ucx/state.json` to store the state of both dashboards and jobs
([#561](#561)).

**Bug fixes**

* Fixed handling for `OWN` table permissions
([#571](#571)).
* Fixed handling of keys with and without values.
([#514](#514)).
* Fixed integration test failures related to concurrent group delete
([#584](#584)).
* Fixed issue with workspace listing process on None type `object_type`
([#481](#481)).
* Fixed missing group entitlement migration bug
([#583](#583)).
* Fixed entitlement application for account-level groups
([#529](#529)).
* Fixed assessment throwing an error when the owner of an object is
empty ([#485](#485)).
* Fixed installer to migrate between different configuration file
versions ([#596](#596)).
* Fixed cluster policy crawler to be aware of deleted policies
([#486](#486)).
* Improved error message for not null constraints violated
([#532](#532)).
* Improved integration test resiliency
([#597](#597),
[#594](#594),
[#586](#586)).
* Introduced Safer access to workspace objects' properties.
([#530](#530)).
* Mitigated permissions loss in Table ACLs by running appliers with
single thread ([#518](#518)).
* Running apply permission task before assessment should display message
([#487](#487)).
* Split integration tests from blocking the merge queue
([#496](#496)).
* Support more than one dashboard per step
([#472](#472)).
* Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0
([#505](#505)).
* Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0
([#575](#575)).
pritishpai pushed a commit that referenced this pull request Nov 21, 2023
**Breaking changes** (existing installations need to reinstall UCX and
re-run assessment jobs)

* Switched local group migration component to rename groups instead of
creating backup groups
([#450](#450)).
* Mitigate permissions loss in Table ACLs by folding grants belonging to
the same principal, object id and object type together
([#512](#512)).

**New features**

* Added support for the experimental Databricks CLI launcher
([#517](#517)).
* Added support for external Hive Metastores including AWS Glue
([#400](#400)).
* Added more views to assessment dashboard
([#474](#474)).
* Added rate limit for creating backup group to increase stability
([#500](#500)).
* Added deduplication for mount point list
([#569](#569)).
* Added documentation to describe interaction with external Hive
Metastores ([#473](#473)).
* Added failure injection for job failure message propagation
([#591](#591)).
* Added uniqueness in the new warehouse name to avoid conflicts on
installation ([#542](#542)).
* Added a global init script to collect Hive Metastore lineage
([#513](#513)).
* Added retry set/update permissions when possible and assess the
changes in the workspace
([#519](#519)).
* Use `~/.ucx/state.json` to store the state of both dashboards and jobs
([#561](#561)).

**Bug fixes**

* Fixed handling for `OWN` table permissions
([#571](#571)).
* Fixed handling of keys with and without values.
([#514](#514)).
* Fixed integration test failures related to concurrent group delete
([#584](#584)).
* Fixed issue with workspace listing process on None type `object_type`
([#481](#481)).
* Fixed missing group entitlement migration bug
([#583](#583)).
* Fixed entitlement application for account-level groups
([#529](#529)).
* Fixed assessment throwing an error when the owner of an object is
empty ([#485](#485)).
* Fixed installer to migrate between different configuration file
versions ([#596](#596)).
* Fixed cluster policy crawler to be aware of deleted policies
([#486](#486)).
* Improved error message for not null constraints violated
([#532](#532)).
* Improved integration test resiliency
([#597](#597),
[#594](#594),
[#586](#586)).
* Introduced Safer access to workspace objects' properties.
([#530](#530)).
* Mitigated permissions loss in Table ACLs by running appliers with
single thread ([#518](#518)).
* Running apply permission task before assessment should display message
([#487](#487)).
* Split integration tests from blocking the merge queue
([#496](#496)).
* Support more than one dashboard per step
([#472](#472)).
* Update databricks-sdk requirement from ~=0.11.0 to ~=0.12.0
([#505](#505)).
* Update databricks-sdk requirement from ~=0.12.0 to ~=0.13.0
([#575](#575)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working feat/crawler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workspace listing raising ObjectType key error for experiments
3 participants