Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crawl_permissions takes too long to run on a large workspace #380

Closed
dmoore247 opened this issue Oct 4, 2023 · 5 comments · Fixed by #1080
Closed

crawl_permissions takes too long to run on a large workspace #380

dmoore247 opened this issue Oct 4, 2023 · 5 comments · Fixed by #1080
Labels
enhancement New feature or request migrate/groups Corresponds to Migrate Groups Step of go/uc/upgrade step/assessment go/uc/upgrade - Assessment Step wontfix This will not be worked on

Comments

@dmoore247
Copy link
Contributor

In testing the assessment job on a large (3000+ user workspace) crawl_permissions takes over 8 hours to run. This job has not run to completion for this workspace, starting with ucx 0.2.0, also on ucx 0.3.0

Suggest...
a. make it run faster (more threads)
b. do less work
c. make it do save checkpoints and re-start from where it last failed.

@nfx nfx added the enhancement New feature or request label Oct 5, 2023
@nfx nfx changed the title bug: crawl_permissions takes too long to run on a large workspace crawl_permissions takes too long to run on a large workspace Oct 5, 2023
@nfx nfx added wontfix This will not be worked on and removed enhancement New feature or request labels Oct 5, 2023
@nfx
Copy link
Collaborator

nfx commented Oct 5, 2023

we cannot defy the laws of physics with networking...

@FastLee
Copy link
Contributor

FastLee commented Oct 5, 2023

Make it optional!

@nfx
Copy link
Collaborator

nfx commented Oct 5, 2023

This would make migration of local groups impossible

@pohlposition pohlposition added step/assessment go/uc/upgrade - Assessment Step migrate/groups Corresponds to Migrate Groups Step of go/uc/upgrade labels Oct 13, 2023
@dmoore247
Copy link
Contributor Author

@nfx The api runs at 4-50 requests per second. The API is pretty slow on a good day (>1 second each call)

I just noticed, we default parallelism to 8 but number of vCPUs per driver/worker is 2 so we're over subscribed.

@nfx
Copy link
Collaborator

nfx commented Oct 24, 2023

@dmoore247 Python Global Interpreter Lock doesn't have any effect on IO, so CPU oversubscription concerns are not applicable here

@nfx nfx added the enhancement New feature or request label Nov 7, 2023
@nfx nfx closed this as completed in #1080 Mar 27, 2024
nfx pushed a commit that referenced this issue Mar 27, 2024
## Changes
- Add experimental support for group permission migrations using new
API. The new workflow is called `migrate-groups-experimental` which uses
the new API for all permissions migration, except for Legacy Table ACL
which still leverage the current approach.
- This API does not require specifying resources to migrate, which
simplify the codebase
- Extend integration tests for both existing & new code path.
- Add initial performance testing code. Initial results suggests that
the new API has much better scaling behaviour

### Linked issues
<!-- DOC: Link issue with a keyword: close, closes, closed, fix, fixes,
fixed, resolve, resolves, resolved. See
https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
-->

Resolves #380 

### Functionality 

- [x] added new workflow: `migrate-groups-experimental`

### Tests
<!-- How is this tested? Please see the checklist below and also
describe any other relevant tests -->

- [x] manually tested
- [x] added unit tests
- [x] added integration tests
- [x] verified on staging environment (screenshot attached)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request migrate/groups Corresponds to Migrate Groups Step of go/uc/upgrade step/assessment go/uc/upgrade - Assessment Step wontfix This will not be worked on
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants