Changed default UCX installation folder to `/Applications/ucx` from `/Users/<me>/.ucx` to allow multiple users to use the same installation (#854)

## [ADVANCED] Force install over existing UCX

Using an environment variable `UCX_FORCE_INSTALL` you can force the
installation of UCX over an existing installation.
The values for the environment variable are 'global' and 'user'.

Global Install: When UCX is installed at '/Applications/ucx'
User Install: When UCX is installed at '/Users/<user>/.ucx'

If there is an existing global installation of UCX, you can force a user
installation of UCX over the existing installation by setting the
environment variable `UCX_FORCE_INSTALL` to 'user'.

At this moment there is no global override over a user installation of
UCX, as this requires migration and can break existing installations.


| global | user | expected install location | install_folder | mode |
| --- | --- | --- | --- |--- |
| no | no | default | `/Applications/ucx` | install |
| yes | no | default | `/Applications/ucx` | upgrade |
| no | yes | default | `/Users/X/.ucx` | upgrade (existing installations must not break) |
| yes | yes | default | `/Users/X/.ucx` | upgrade |
| yes | no | **USER** | `/Users/X/.ucx` | install (show prompt) |
| no | yes | **GLOBAL** | ...  | migrate |
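
The table above can be read as a small decision function. The following is a minimal sketch, not the UCX implementation: `resolve_install` and its argument names are hypothetical, `/Users/X/.ucx` is the placeholder path from the table, and the `migrate` row is modelled as an error because a global override of a user installation is not available yet. Combinations the table does not list are assumed to fall back to the default rows.

```python
from __future__ import annotations

GLOBAL_FOLDER = "/Applications/ucx"
USER_FOLDER = "/Users/X/.ucx"  # placeholder path, exactly as written in the table


def resolve_install(
    global_exists: bool,
    user_exists: bool,
    force: str | None = None,
) -> tuple[str, str]:
    """Return (install_folder, mode) for one row of the table (hypothetical helper)."""
    # Forced user install over an existing global installation.
    if force == "user" and global_exists and not user_exists:
        return USER_FOLDER, "install"
    # Forced global install over an existing user installation needs migration.
    if force == "global" and user_exists and not global_exists:
        raise NotImplementedError("migration from user to global install is not supported yet")
    # Default rows: an existing user installation takes precedence and is upgraded.
    if user_exists:
        return USER_FOLDER, "upgrade"
    if global_exists:
        return GLOBAL_FOLDER, "upgrade"
    # Nothing installed yet: new installations default to the global folder.
    return GLOBAL_FOLDER, "install"
```

For example, `resolve_install(True, False, force="user")` corresponds to the **USER** row, where the installer prompts for confirmation before creating the new installation.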


* `UCX_FORCE_INSTALL=user databricks labs install ucx` - will force the
installation to be for user only
* `UCX_FORCE_INSTALL=global databricks labs install ucx` - will force
the installation to be for root only
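
As an illustration of how the variable in the commands above might be consumed, here is a minimal sketch that reads `UCX_FORCE_INSTALL` from an injectable environment mapping. The helper name `read_force_install` is hypothetical, and rejecting values other than 'global' and 'user' is an assumption of this sketch rather than something the commit does.

```python
from __future__ import annotations

import os

VALID_MODES = ("global", "user")


def read_force_install(environ: dict[str, str] | None = None) -> str | None:
    """Return 'global', 'user', or None when the variable is unset (hypothetical helper)."""
    if environ is None:
        # Fall back to the real process environment only when nothing is injected.
        environ = dict(os.environ)
    value = environ.get("UCX_FORCE_INSTALL")
    if not value:
        return None
    if value not in VALID_MODES:
        # Assumption of this sketch: unknown values are rejected early.
        raise ValueError(f"UCX_FORCE_INSTALL must be one of {VALID_MODES}, got {value!r}")
    return value
```

Passing a dictionary instead of reading `os.environ` directly keeps the behaviour testable without modifying the process environment.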

Related to #803

---------

Co-authored-by: Serge Smertin <serge.smertin@databricks.com>
2 people authored and dmoore247 committed Mar 23, 2024
1 parent d8bb07f commit 9bd1df3
Showing 6 changed files with 466 additions and 76 deletions.
29 changes: 29 additions & 0 deletions README.md
@@ -18,6 +18,7 @@ See [contributing instructions](CONTRIBUTING.md) to help improve this project.
* [Installation](#installation)
* [Authenticate Databricks CLI](#authenticate-databricks-cli)
* [Install UCX](#install-ucx)
* [[ADVANCED] Force install over existing UCX](#advanced-force-install-over-existing-ucx)
* [Upgrading UCX for newer versions](#upgrading-ucx-for-newer-versions)
* [Uninstall UCX](#uninstall-ucx)
* [Migration process](#migration-process)
@@ -123,6 +124,34 @@ You can also install a specific version by specifying it like `@v0.13.2` - `data

![macos_install_ucx](docs/macos_2_databrickslabsmac_installucx.gif)

[[back to top](#databricks-labs-ucx)]

## [ADVANCED] Force install over existing UCX
Using an environment variable `UCX_FORCE_INSTALL` you can force the installation of UCX over an existing installation.
The values for the environment variable are 'global' and 'user'.

Global Install: When UCX is installed at '/Applications/ucx'
User Install: When UCX is installed at '/Users/<user>/.ucx'

If there is an existing global installation of UCX, you can force a user installation of UCX over the existing installation by setting the environment variable `UCX_FORCE_INSTALL` to 'user'.

At this moment there is no global override over a user installation of UCX, as this requires migration and can break existing installations.


| global | user | expected install location | install_folder | mode |
| --- | --- | --- | --- |--- |
| no | no | default | `/Applications/ucx` | install |
| yes | no | default | `/Applications/ucx` | upgrade |
| no | yes | default | `/Users/X/.ucx` | upgrade (existing installations must not break) |
| yes | yes | default | `/Users/X/.ucx` | upgrade |
| yes | no | **USER** | `/Users/X/.ucx` | install (show prompt) |
| no | yes | **GLOBAL** | ... | migrate |


* `UCX_FORCE_INSTALL=user databricks labs install ucx` - will force the installation to be for user only
* `UCX_FORCE_INSTALL=global databricks labs install ucx` - will force the installation to be for root only


[[back to top](#databricks-labs-ucx)]

## Upgrading UCX for newer versions
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -27,7 +27,7 @@ classifiers = [
"Programming Language :: Python :: Implementation :: CPython",
]
dependencies = ["databricks-sdk~=0.21.0",
"databricks-labs-blueprint~=0.3.1",
"databricks-labs-blueprint~=0.4.0",
"PyYAML>=6.0.0,<7.0.0"]

[project.entry-points.databricks]
72 changes: 52 additions & 20 deletions src/databricks/labs/ucx/install.py
@@ -11,6 +11,7 @@
from pathlib import Path
from typing import Any

import databricks.sdk.errors
from databricks.labs.blueprint.entrypoint import get_logger
from databricks.labs.blueprint.installation import Installation
from databricks.labs.blueprint.installer import InstallState
@@ -135,8 +136,6 @@

logger = logging.getLogger(__name__)

PRODUCT_INFO = ProductInfo(__file__)


def deploy_schema(sql_backend: SqlBackend, inventory_schema: str):
# we need to import it like this because we expect a module instance
@@ -171,22 +170,33 @@ def deploy_schema(sql_backend: SqlBackend, inventory_schema: str):


class WorkspaceInstaller:
def __init__(self, prompts: Prompts, installation: Installation, ws: WorkspaceClient):
if "DATABRICKS_RUNTIME_VERSION" in os.environ:
def __init__(
self,
prompts: Prompts,
installation: Installation,
ws: WorkspaceClient,
product_info: ProductInfo,
environ: dict[str, str] | None = None,
):
if not environ:
environ = dict(os.environ.items())
if "DATABRICKS_RUNTIME_VERSION" in environ:
msg = "WorkspaceInstaller is not supposed to be executed in Databricks Runtime"
raise SystemExit(msg)
self._ws = ws
self._installation = installation
self._prompts = prompts
self._policy_installer = ClusterPolicyInstaller(installation, ws, prompts)
self._product_info = product_info
self._force_install = environ.get("UCX_FORCE_INSTALL")

def run(
self,
verify_timeout=timedelta(minutes=2),
sql_backend_factory: Callable[[WorkspaceConfig], SqlBackend] | None = None,
wheel_builder_factory: Callable[[], WheelsV2] | None = None,
):
logger.info(f"Installing UCX v{PRODUCT_INFO.version()}")
logger.info(f"Installing UCX v{self._product_info.version()}")
config = self.configure()
if not sql_backend_factory:
sql_backend_factory = self._new_sql_backend
@@ -199,7 +209,8 @@ def run(
wheel_builder_factory(),
self._ws,
self._prompts,
verify_timeout=verify_timeout,
verify_timeout,
self._product_info,
)
try:
workspace_installation.run()
@@ -209,14 +220,34 @@ def run(
raise err

def _new_wheel_builder(self):
return WheelsV2(self._installation, PRODUCT_INFO)
return WheelsV2(self._installation, self._product_info)

def _new_sql_backend(self, config: WorkspaceConfig) -> SqlBackend:
return StatementExecutionBackend(self._ws, config.warehouse_id)

def _confirm_force_install(self) -> bool:
if not self._force_install:
return False
msg = "[ADVANCED] UCX is already installed on this workspace. Do you want to create a new installation?"
if not self._prompts.confirm(msg):
raise RuntimeWarning("UCX is already installed, but no confirmation")
if not self._installation.is_global() and self._force_install == "global":
# TODO:
# Logic for forced global over user install
# Migration logic will go here
# verify complains without full path, asks to raise NotImplementedError builtin
raise databricks.sdk.errors.NotImplemented("Migration needed. Not implemented yet.")
if self._installation.is_global() and self._force_install == "user":
# Logic for forced user install over global install
self._installation = Installation.assume_user_home(self._ws, self._product_info.product_name())
return True
return False

def configure(self) -> WorkspaceConfig:
try:
config = self._installation.load(WorkspaceConfig)
if self._confirm_force_install():
return self._configure_new_installation()
self._apply_upgrades()
return config
except NotFound as err:
@@ -225,7 +256,7 @@ def configure(self) -> WorkspaceConfig:

def _apply_upgrades(self):
try:
upgrades = Upgrades(PRODUCT_INFO, self._installation)
upgrades = Upgrades(self._product_info, self._installation)
upgrades.apply(self._ws)
except NotFound as err:
logger.warning(f"Installed version is too old: {err}")
@@ -237,11 +268,9 @@ def _configure_new_installation(self) -> WorkspaceConfig:
inventory_database = self._prompts.question(
"Inventory Database stored in hive_metastore", default="ucx", valid_regex=r"^\w+$"
)

warehouse_id = self._configure_warehouse()
configure_groups = ConfigureGroups(self._prompts)
configure_groups.run()

log_level = self._prompts.question("Log level", default="INFO").upper()
num_threads = int(self._prompts.question("Number of threads", default="8", valid_number=True))

@@ -316,6 +345,7 @@ def __init__(
ws: WorkspaceClient,
prompts: Prompts,
verify_timeout: timedelta,
product_info: ProductInfo,
):
self._config = config
self._installation = installation
@@ -326,16 +356,18 @@ def __init__(
self._verify_timeout = verify_timeout
self._state = InstallState.from_installation(installation)
self._this_file = Path(__file__)
self._product_info = product_info

@classmethod
def current(cls, ws: WorkspaceClient):
installation = PRODUCT_INFO.current_installation(ws)
product_info = ProductInfo.from_class(WorkspaceConfig)
installation = product_info.current_installation(ws)
config = installation.load(WorkspaceConfig)
sql_backend = StatementExecutionBackend(ws, config.warehouse_id)
wheels = WheelsV2(installation, PRODUCT_INFO)
wheels = product_info.wheels(ws)
prompts = Prompts()
timeout = timedelta(minutes=2)
return WorkspaceInstallation(config, installation, sql_backend, wheels, ws, prompts, timeout)
return WorkspaceInstallation(config, installation, sql_backend, wheels, ws, prompts, timeout, product_info)

@property
def config(self):
@@ -346,7 +378,7 @@ def folder(self):
return self._installation.install_folder()

def run(self):
logger.info(f"Installing UCX v{PRODUCT_INFO.version()}")
logger.info(f"Installing UCX v{self._product_info.version()}")
Threads.strict(
"installing components",
[
@@ -656,7 +688,7 @@ def _job_settings(self, step_name: str, remote_wheel: str):
[t for t in _TASKS.values() if t.workflow == step_name],
key=lambda _: _.name,
)
version = PRODUCT_INFO.version()
version = self._product_info.version()
version = version if not self._ws.config.is_gcp else version.replace("+", "-")
return {
"name": self._name(step_name),
@@ -669,7 +701,7 @@ def _job_settings(self, step_name: str, remote_wheel: str):
def _upload_wheel_runner(self, remote_wheel: str):
# TODO: we have to be doing this workaround until ES-897453 is solved in the platform
code = TEST_RUNNER_NOTEBOOK.format(remote_wheel=remote_wheel, config_file=self._config_file).encode("utf8")
return self._installation.upload(f"wheels/wheel-test-runner-{PRODUCT_INFO.version()}.py", code)
return self._installation.upload(f"wheels/wheel-test-runner-{self._product_info.version()}.py", code)

@staticmethod
def _apply_cluster_overrides(settings: dict[str, Any], overrides: dict[str, str], wheel_runner: str) -> dict:
@@ -873,7 +905,7 @@ def uninstall(self):
):
return
# TODO: this is incorrect, fetch the remote version (that appeared only in Feb 2024)
logger.info(f"Deleting UCX v{PRODUCT_INFO.version()} from {self._ws.config.host}")
logger.info(f"Deleting UCX v{self._product_info.version()} from {self._ws.config.host}")
try:
self._installation.files()
except NotFound:
@@ -960,8 +992,8 @@ def validate_and_run(self, step: str):
if __name__ == "__main__":
logger = get_logger(__file__)
logger.setLevel("INFO")

app = ProductInfo.from_class(WorkspaceConfig)
workspace_client = WorkspaceClient(product="ucx", product_version=__version__)
current = Installation(workspace_client, PRODUCT_INFO.product_name())
installer = WorkspaceInstaller(Prompts(), current, workspace_client)
current = Installation.assume_global(workspace_client, app.product_name())
installer = WorkspaceInstaller(Prompts(), current, workspace_client, app)
installer.run()