Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make local backup a backup agent #130623

Draft
wants to merge 3 commits into
base: allthebackupchanges
Choose a base branch
from

Conversation

emontnemery
Copy link
Contributor

Proposed change

Make local backup a backup agent

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.

To help with the load of incoming pull requests:

@home-assistant
Copy link

Hey there @home-assistant/core, mind taking a look at this pull request as it has been labeled with an integration (backup) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of backup can trigger bot actions by commenting:

  • @home-assistant close Closes the pull request.
  • @home-assistant rename Awesome new title Renames the pull request.
  • @home-assistant reopen Reopen the pull request.
  • @home-assistant unassign backup Removes the current integration label and assignees on the pull request, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

homeassistant/components/backup/backup.py Outdated Show resolved Hide resolved
Comment on lines +28 to +36
@dataclass(slots=True)
class LocalBackup(UploadedBackup):
"""Local backup class."""

path: Path

def as_dict(self) -> dict:
"""Return a dict representation of this backup."""
return {**asdict(self), "path": self.path.as_posix()}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a local agent will always have a path. We should move this class out of this platform if that's the case

return {**asdict(self), "path": self.path.as_posix()}


class LocalBackupAgent(BackupAgent):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want a flag or different base class for local agents.

class LocalBackupAgent(BackupAgent):
"""Define the format that backup agents can have."""

name = "local"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a separate PR, we may want to make name human readable and add a slug-type id.

if data_file := backup_file.extractfile("./backup.json"):
data = json_loads_object(data_file.read())
backup = LocalBackup(
id=cast(str, data["slug"]), # Do we need another ID?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UploadedBackup base class says we need an ID, I'm not sure why, manager and frontend should care about the slug only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id is the agent's way of identifying the backup. The slug is something that the manager uses to identify the backup. They could be the same but I think it's good to separate terms since the semantics are different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but if we don't need the additional identifier there's no point in storing a reference to the slug twice.
The manager will only ever ask the agent to handle a backup based on the slug. The ID is an implementation detail which individual agents can add if they need one.

Copy link
Member

@MartinHjelmare MartinHjelmare Nov 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the id is part of the interface and the id parameter is used to eg download a backup. We could remove it and force agents to keep track of backups via the slug.

I don't like the name slug since it doesn't describe the purpose of the parameter so I'd prefer to limit its use in interfaces.

Comment on lines +42 to +43
# pylint: disable=fixme
# TODO: Don't forget to remove this when the implementation is complete
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many TODOs, pylint silenced for now so CI can run

Comment on lines 159 to 164
await integration_platform.async_process_integration_platforms(
self.hass,
DOMAIN,
self._add_platform_pre_post_handlers,
wait_for_platforms=True,
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_process_integration_platforms can only be called once per domain, hence refactored so all platform registration happens in one callback

homeassistant/components/backup/manager.py Show resolved Hide resolved
Comment on lines 304 to 305
# TODO: Do we need to expose the path?
path=agent_backup.path, # type: ignore[attr-defined]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this breaks for example kitchen sink. Do we even need to expose the path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path at the agent location is not important for the manager.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines 292 to 293
async def async_get_backups(self, **kwargs: Any) -> dict[str, Backup]:
"""Return backups."""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense the manager knows about all backups across all agents.
The manager should not have to ask each time though, there needs to be some caching. Also, either frontend or an agent should be able to initiate a refresh.

hass: HomeAssistant,
**kwargs: Any,
) -> list[BackupAgent]:
"""Register the backup agent."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Register the backup agent."""
"""Return a list of backup agents."""


def __init__(self, hass: HomeAssistant) -> None:
"""Initialize the backup agent."""
super().__init__()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
super().__init__()

await self._async_upload_backup(
slug=slug,
backup=backup,
agent_ids=list(self.backup_agents.keys()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
agent_ids=list(self.backup_agents.keys()),
agent_ids=list(self.backup_agents),

Copy link
Contributor Author

@emontnemery emontnemery Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit weird this slipped through, I'm pretty sure our linters used to enforce not calling keys(), did we disable some rule?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen that yet.

name=agent_backup.name,
date=agent_backup.date,
agent_ids=[],
# TODO: Do we need to expose the path?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backup path at the agent location feels like a concern of the agent and not of the backup manager.

@@ -392,12 +377,13 @@ def _move_and_cleanup() -> None:
temp_dir_handler.cleanup()

await self.hass.async_add_executor_job(_move_and_cleanup)
await self.load_backups()
# TODO: What do we need to do instead?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async_receive_backup is called after the user has uploaded a backup. We probably want to let the local agents upload the backup to the local storage

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably upload the backup to all agents after the user uploads a backup.

Comment on lines +48 to +50
self.backup_dir = Path(hass.config.path("backups"))
self.backups: dict[str, LocalBackup] = {}
self.loaded_backups = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.backup_dir = Path(hass.config.path("backups"))
self.backups: dict[str, LocalBackup] = {}
self.loaded_backups = False
self._backup_dir = Path(hass.config.path("backups"))
self._backups: dict[str, LocalBackup] = {}
self._loaded_backups = False

**kwargs: Any,
) -> None:
"""Upload a backup."""
self.backups[metadata.slug] = LocalBackup(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels inconsistent that this just updates the data collection instead of moving file data.

@@ -441,6 +433,11 @@ async def _async_create_backup(
) -> Backup:
"""Generate a backup."""
success = False
if LOCAL_AGENT_ID in agent_ids:
backup_dir = self.backup_dir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking the manager shouldn't know about this backup directory. That knowledge belongs to the local agent.


return backup

async def async_remove_backup(self, *, slug: str, **kwargs: Any) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't part of the agent interface yet. I'm working on adding async_delete_backup.

await self.load_backups()
return list(self.backups.values())

async def async_get_backup(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async def async_get_backup(
async def _async_get_backup(


if backup is None or not backup.path.exists():
if backup is None or path is None or not path.exists():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to run the exists check in the executor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd replace this check with try... except FileNotFoundError when reading the file though since the file can be removed at any time.

@@ -42,12 +42,13 @@ async def get(

manager = cast(BackupManager, request.app[KEY_HASS].data[DATA_MANAGER])
backup = await manager.async_get_backup(slug=slug)
path = await manager.async_get_backup_path(slug=slug)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect us to download the backup from the agent here. If the agent already has it downloaded it doesn't need to do it again of course. The downloaded backup will always have a local path on disk.

@@ -251,25 +246,27 @@ async def async_upload_backup(self, *, slug: str, **kwargs: Any) -> None:
if not (backup := await self.async_get_backup(slug=slug)):
return

local_agent = cast(LocalBackupAgent, self.backup_agents[LOCAL_AGENT_ID])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should treat local agents differently or be aware of specific agent ids.

@@ -433,10 +437,12 @@ async def _async_create_backup(
) -> Backup:
"""Generate a backup."""
success = False

if LOCAL_AGENT_ID in agent_ids:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no way to disable the builtin agent.

else:
backup_dir = self.temp_backup_dir
tar_file_path = self.temp_backup_dir / f"{slug}.tar"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest we instead always write to the backups directory and let the builtin agent do a no-op in its upload method.

@epenet epenet marked this pull request as draft November 15, 2024 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants