-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce more specific exceptions, like NotFound
, AlreadyExists
, BadRequest
, PermissionDenied
, InternalError
, and others
#376
Conversation
…ists`, `BadRequest`, `PermissionDenied`, `InternalError`, and others Improve the ergonomics of SDK, where instead of `except DatabricksError as err: if err.error_code != 'NOT_FOUND': raise err else: do_stuff()` we could do `except NotFound: do_stuff()`. Additionally, it'll make it easier to read stack traces, as they will contain specific exception class name. # First principles - do not override `builtins.NotImplemented` for `NOT_IMPLEMENTED` error code - assume that platform error_code/HTTP status code mapping is not perfect and in the state of transition - we do represent reasonable subset of error codes as specific exceptions ## Open questions ### HTTP Status Codes vs Error Codes 1. Mix between status codes and error codes (preferred) 2. Rely only on HTTP status codes 3. Rely only on `error_code`'s One example is `BAD_REQUEST` error code that maps onto HTTP 400 and to `except BadRequest as err` catch clause. But `MALFORMED_REQUEST`, `INVALID_STATE`, and `UNPARSEABLE_HTTP_ERROR` do also map to HTTP 400. So the proposal is to remap the MALFORMED_REQUEST to `BadRequest` exception. Another corner-case is UC: - 'METASTORE_DOES_NOT_EXIST': NotFound, - 'DAC_DOES_NOT_EXIST': NotFound, - 'CATALOG_DOES_NOT_EXIST': NotFound, - 'SCHEMA_DOES_NOT_EXIST': NotFound, - 'TABLE_DOES_NOT_EXIST': NotFound, - 'SHARE_DOES_NOT_EXIST': NotFound, - 'RECIPIENT_DOES_NOT_EXIST': NotFound, - 'STORAGE_CREDENTIAL_DOES_NOT_EXIST': NotFound, - 'EXTERNAL_LOCATION_DOES_NOT_EXIST': NotFound, - 'PRINCIPAL_DOES_NOT_EXIST': NotFound, - 'PROVIDER_DOES_NOT_EXIST': NotFound, ### Naming conflict resolution We have three sources of naming: - `error_code` - HTTP Status - Python builtin exceptions We still have to define which name takes the precedence.
Agree with this statement:
IMO, error codes shall be the main body of the error, since they provide logical explanation to what has happened, while HTTP status codes provide in-depth technical details. Comparing two options:
And:
The first one sounds a bit more DDD which is proven to be a good practice. |
@renardeinside technically, we can do |
some more input: SCIM doesn't have a concept of
|
and first time seeing this:
|
Yes, it makes sense to have very specific errors and handle them explicitly |
We're in this situation in the UCX project and it's starting to be urgent because of the usage customer has of this tool. Please prioritize this when possible |
…rrNotFound)`, `errors.Is(err, databricks.ErrAlreadyExists)`, `errors.Is(err, databricks.ErrBadRequest)`, `errors.Is(err, databricks.ErrPermissionDenied)`, `errors.Is(err, databricks.ErrInternal)`, and others This PR enables ergonomic error handling for all Databricks API Errors, following the same principles as databricks/databricks-sdk-py#376
NotFound
, AlreadyExists
, BadRequest
, PermissionDenied
, InternalError
, and othersNotFound
, AlreadyExists
, BadRequest
, PermissionDenied
, InternalError
, and others
Now this PR is generated from the metadata in databricks/databricks-sdk-go#682 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! Couple small suggestions, otherwise I think we're good.
.codegen/error_mapping.py.tmpl
Outdated
|
||
from .base import DatabricksError | ||
|
||
__all__ = ['error_mapper'{{range .ExceptionTypes}}, '{{.PascalName}}'{{end}}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? These are also the only exported members of this module. I also wasn't aware that this variable was respected in modules that also define an __init__.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is how python works ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just cloned this PR and removed this line and was still able to import these:
from databricks.sdk import errors
print(dir(errors))
['Aborted', 'AlreadyExists', 'BadRequest', 'Cancelled', 'DataLoss', 'DatabricksError', 'DeadlineExceeded', 'ErrorDetail', 'InternalError', 'InvalidParameterValue', 'NotFound', 'NotImplemented', 'OperationFailed', 'OperationTimeout', 'PermissionDenied', 'RequestLimitExceeded', 'ResourceAlreadyExists', 'ResourceConflict', 'ResourceExhausted', 'TemporarilyUnavailable', 'TooManyRequests', 'Unauthenticated', 'Unknown', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'base', 'error_mapper', 'mapper', 'mapping', 'sdk']
If you want to control what is exported, I think you need to include this in the __init__.py
file explicitly.
… `BadRequest`, `PermissionDenied`, `InternalError`, and others See implementations in other SDKs: - Go: databricks/databricks-sdk-go#682 - Python: databricks/databricks-sdk-py#376
…rrNotFound)`, `errors.Is(err, databricks.ErrAlreadyExists)`, `errors.Is(err, databricks.ErrBadRequest)`, `errors.Is(err, databricks.ErrPermissionDenied)`, `errors.Is(err, databricks.ErrInternal)`, and others (#682) ## Changes This PR enables ergonomic error handling for all Databricks API Errors, following the same principles as databricks/databricks-sdk-py#376 ## Tests - [x] `make test` passing - [x] `make fmt` applied - [ ] relevant integration tests applied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. Thanks for working with us on this!
* Introduce more specific exceptions, like `NotFound`, `AlreadyExists`, `BadRequest`, `PermissionDenied`, `InternalError`, and others ([#376](#376)). This makes it easier to handle errors thrown by the Databricks API. Instead of catching `DatabricksError` and checking the error_code field, you can catch one of these subtypes of `DatabricksError`, which is more ergonomic and removes the need to rethrow exceptions that you don't want to catch. For example: ```python try: return (self._ws .permissions .get(object_type, object_id)) except DatabricksError as e: if e.error_code in [ "RESOURCE_DOES_NOT_EXIST", "RESOURCE_NOT_FOUND", "PERMISSION_DENIED", "FEATURE_DISABLED", "BAD_REQUEST"]: logger.warning(...) return None raise RetryableError(...) from e ``` can be replaced with ```python try: return (self._ws .permissions .get(object_type, object_id)) except PermissionDenied, FeatureDisabled: logger.warning(...) return None except NotFound: raise RetryableError(...) ``` * Paginate all SCIM list requests in the SDK ([#440](#440)). This change ensures that SCIM list() APIs use a default limit of 100 resources, leveraging SCIM's offset + limit pagination to batch requests to the Databricks API. * Added taskValues support in remoteDbUtils ([#406](#406)). * Added more detailed error message on default credentials not found error ([#419](#419)). * Request management token via Azure CLI only for Service Principals and not human users ([#408](#408)). API Changes: * Fixed `create()` method for [w.functions](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/functions.html) workspace-level service and corresponding `databricks.sdk.service.catalog.CreateFunction` and `databricks.sdk.service.catalog.FunctionInfo` dataclasses. * Changed `create()` method for [w.metastores](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/metastores.html) workspace-level service with new required argument order. * Changed `storage_root` field for `databricks.sdk.service.catalog.CreateMetastore` to be optional. * Added `skip_validation` field for `databricks.sdk.service.catalog.UpdateExternalLocation`. * Added `libraries` field for `databricks.sdk.service.compute.CreatePolicy`, `databricks.sdk.service.compute.EditPolicy` and `databricks.sdk.service.compute.Policy`. * Added `init_scripts` field for `databricks.sdk.service.compute.EventDetails`. * Added `file` field for `databricks.sdk.service.compute.InitScriptInfo`. * Added `zone_id` field for `databricks.sdk.service.compute.InstancePoolGcpAttributes`. * Added several dataclasses related to init scripts. * Added `databricks.sdk.service.compute.LocalFileInfo` dataclass. * Replaced `ui_state` field with `edit_mode` for `databricks.sdk.service.jobs.CreateJob` and `databricks.sdk.service.jobs.JobSettings`. * Replaced `databricks.sdk.service.jobs.CreateJobUiState` dataclass with `databricks.sdk.service.jobs.CreateJobEditMode`. * Added `include_resolved_values` field for `databricks.sdk.service.jobs.GetRunRequest`. * Replaced `databricks.sdk.service.jobs.JobSettingsUiState` dataclass with `databricks.sdk.service.jobs.JobSettingsEditMode`. * Removed [a.o_auth_enrollment](https://databricks-sdk-py.readthedocs.io/en/latest/account/o_auth_enrollment.html) account-level service. This was only used to aid in OAuth enablement during the public preview of OAuth. OAuth is now enabled for all AWS E2 accounts, so usage of this API is no longer needed. * Added `network_connectivity_config_id` field for `databricks.sdk.service.provisioning.UpdateWorkspaceRequest`. * Added [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/network_connectivity.html) account-level service. * Added `string_shared_as` field for `databricks.sdk.service.sharing.SharedDataObject`. Internal changes: * Added regression question to issue template ([#414](#414)). * Made test_auth no longer fail if you have a default profile setup ([#426](#426)). OpenAPI SHA: d136ad0541f036372601bad9a4382db06c3c912d, Date: 2023-11-14
* Introduce more specific exceptions, like `NotFound`, `AlreadyExists`, `BadRequest`, `PermissionDenied`, `InternalError`, and others ([#376](#376)). This makes it easier to handle errors thrown by the Databricks API. Instead of catching `DatabricksError` and checking the error_code field, you can catch one of these subtypes of `DatabricksError`, which is more ergonomic and removes the need to rethrow exceptions that you don't want to catch. For example: ```python try: return (self._ws .permissions .get(object_type, object_id)) except DatabricksError as e: if e.error_code in [ "RESOURCE_DOES_NOT_EXIST", "RESOURCE_NOT_FOUND", "PERMISSION_DENIED", "FEATURE_DISABLED", "BAD_REQUEST"]: logger.warning(...) return None raise RetryableError(...) from e ``` can be replaced with ```python try: return (self._ws .permissions .get(object_type, object_id)) except PermissionDenied, FeatureDisabled: logger.warning(...) return None except NotFound: raise RetryableError(...) ``` * Paginate all SCIM list requests in the SDK ([#440](#440)). This change ensures that SCIM list() APIs use a default limit of 100 resources, leveraging SCIM's offset + limit pagination to batch requests to the Databricks API. * Added taskValues support in remoteDbUtils ([#406](#406)). * Added more detailed error message on default credentials not found error ([#419](#419)). * Request management token via Azure CLI only for Service Principals and not human users ([#408](#408)). API Changes: * Fixed `create()` method for [w.functions](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/functions.html) workspace-level service and corresponding `databricks.sdk.service.catalog.CreateFunction` and `databricks.sdk.service.catalog.FunctionInfo` dataclasses. * Changed `create()` method for [w.metastores](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/metastores.html) workspace-level service with new required argument order. * Changed `storage_root` field for `databricks.sdk.service.catalog.CreateMetastore` to be optional. * Added `skip_validation` field for `databricks.sdk.service.catalog.UpdateExternalLocation`. * Added `libraries` field for `databricks.sdk.service.compute.CreatePolicy`, `databricks.sdk.service.compute.EditPolicy` and `databricks.sdk.service.compute.Policy`. * Added `init_scripts` field for `databricks.sdk.service.compute.EventDetails`. * Added `file` field for `databricks.sdk.service.compute.InitScriptInfo`. * Added `zone_id` field for `databricks.sdk.service.compute.InstancePoolGcpAttributes`. * Added several dataclasses related to init scripts. * Added `databricks.sdk.service.compute.LocalFileInfo` dataclass. * Replaced `ui_state` field with `edit_mode` for `databricks.sdk.service.jobs.CreateJob` and `databricks.sdk.service.jobs.JobSettings`. * Replaced `databricks.sdk.service.jobs.CreateJobUiState` dataclass with `databricks.sdk.service.jobs.CreateJobEditMode`. * Added `include_resolved_values` field for `databricks.sdk.service.jobs.GetRunRequest`. * Replaced `databricks.sdk.service.jobs.JobSettingsUiState` dataclass with `databricks.sdk.service.jobs.JobSettingsEditMode`. * Removed [a.o_auth_enrollment](https://databricks-sdk-py.readthedocs.io/en/latest/account/o_auth_enrollment.html) account-level service. This was only used to aid in OAuth enablement during the public preview of OAuth. OAuth is now enabled for all AWS E2 accounts, so usage of this API is no longer needed. * Added `network_connectivity_config_id` field for `databricks.sdk.service.provisioning.UpdateWorkspaceRequest`. * Added [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/network_connectivity.html) account-level service. * Added `string_shared_as` field for `databricks.sdk.service.sharing.SharedDataObject`. Internal changes: * Added regression question to issue template ([#414](#414)). * Made test_auth no longer fail if you have a default profile setup ([#426](#426)). OpenAPI SHA: d136ad0541f036372601bad9a4382db06c3c912d, Date: 2023-11-14
… `BadRequest`, `PermissionDenied`, `InternalError`, and others (#185) See implementations in other SDKs: - Go: databricks/databricks-sdk-go#682 - Python: databricks/databricks-sdk-py#376 --------- Co-authored-by: Miles Yucht <miles@databricks.com> Co-authored-by: Tanmay Rustagi <tanmay.rustagi@databricks.com>
… `BadRequest`, `PermissionDenied`, `InternalError`, and others (databricks#185) See implementations in other SDKs: - Go: databricks/databricks-sdk-go#682 - Python: databricks/databricks-sdk-py#376 --------- Co-authored-by: Miles Yucht <miles@databricks.com> Co-authored-by: Tanmay Rustagi <tanmay.rustagi@databricks.com>
Improve the ergonomics of SDK, where instead of
except DatabricksError as err: if err.error_code != 'NOT_FOUND': raise err else: do_stuff()
we could doexcept NotFound: do_stuff()
, like in this example.Additionally, it'll make it easier to read stack traces, as they will contain specific exception class name. Examples of unclear stack traces are: databrickslabs/ucx#359, databrickslabs/ucx#353, databrickslabs/ucx#347,
First principles
do not overridebuiltins.NotImplemented
forNOT_IMPLEMENTED
error codeerror_code
from specific exceptions likeNotFound
orAlreadyExists
.Proposal
Naming conflict resolution
We have four sources of naming and this is a proposed order of naming conflict resolution:
error_code
, that is surfaced in our API documentation, known by Databricks users