Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Git worktrees for sync #1831

Merged
merged 7 commits into from
Oct 21, 2024
Merged

Support Git worktrees for sync #1831

merged 7 commits into from
Oct 21, 2024

Conversation

smacke
Copy link
Member

@smacke smacke commented Oct 14, 2024

Changes

This change allows the sync command to work from git worktrees.

Tests

  • Added unit tests for traversal of worktree related files.
  • Manually confirmed that synchronization of files from a main checkout, as well as a worktree, observed the same ignore rules (both locally defined as well as from $GIT_DIR/info/exclude).

@shreyas-goenka
Copy link
Contributor

@smacke Could you expand on what it means for the sync command to support git worktrees. A quick read of the PR suggests that worktrees have a .git file, and you'd like that file to also be read for .gitignore rules?

I might be way off since I'm not familiar with git worktrees, please let me know what it exactly means to add support here.

@smacke
Copy link
Member Author

smacke commented Oct 15, 2024

A quick read of the PR suggests that worktrees have a .git file, and you'd like that file to also be read for .gitignore rules?

@shreyas-goenka not quite. The problem is that worktrees have a .git file but not a .git directory. The databricks cli looks inside .git/config to locate some configuration for sync, but it fails in worktrees since .git is not a directory, but just a file with a gitdir entry. For worktrees we have to read this entry to find the original repo (and thus path to .git/config).

For the other part around .gitignore, we similarly need to read from the original repository to find that file. I couldn't quite figure out how to tell it not to generate new .gitignore files in worktrees unnecessarily, so I just made the logic to do that dependent on us not being in a worktree for now, which works well enough.

@smacke smacke requested a review from shreyas-goenka October 16, 2024 22:21
@pietern
Copy link
Contributor

pietern commented Oct 18, 2024

@smacke I did some digging to figure out how work trees work, and it was a little more involved than initially proposed in the PR. Not only do we need to chase the gitdir in the .git file, but we also need to resolve the commondir. These paths are referred to as $GIT_DIR and $GIT_COMMON_DIR in the Git docs.

Then there were a few other call sites where we load HEAD or read the contents of refs, and both needed to be anchored to either the $GIT_DIR or the $GIT_COMMON_DIR, so it was easier to break those out into separate fields.

Can you comment on why you had the code bail out of writing a .gitignore file in view.go?

@pietern pietern changed the title support git worktrees for directory sync Support Git worktrees for sync Oct 18, 2024
@pietern pietern linked an issue Oct 18, 2024 that may be closed by this pull request
@smacke
Copy link
Member Author

smacke commented Oct 18, 2024

Can you comment on why you had the code bail out of writing a .gitignore file in view.go?

@pietern Sure. I did this because I think what was happening was that the .gitignore that was detected for the repo was correct for reading, but when writing, it wrote to .gitignore in the sync directory, even though I had already added the necessary contents there to avoid syncing .databricks. But I'm not totally sure on the root cause. The behavior that I was trying to work around was writing to the sync dir's .gitignore over and over again.

Copy link
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for the contribution @smacke and @pietern! Mostly looks good to me in its current state other than the issue with Root().

libs/git/repository.go Outdated Show resolved Hide resolved
libs/git/repository.go Outdated Show resolved Hide resolved
libs/git/repository.go Show resolved Hide resolved
libs/git/repository.go Show resolved Hide resolved
@smacke
Copy link
Member Author

smacke commented Oct 21, 2024

@pietern thanks a ton for helping to push this through, I pulled in your changes, built, and verified everything works as expected 👍

@pietern pietern added this pull request to the merge queue Oct 21, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 21, 2024
@pietern
Copy link
Contributor

pietern commented Oct 21, 2024

This PR was removed from the merge queue due to flakiness on Windows.

Attempt to fix the underlying flakiness in #1845.

@pietern pietern added this pull request to the merge queue Oct 21, 2024
Merged via the queue into databricks:main with commit 571076d Oct 21, 2024
5 checks passed
andrewnester added a commit that referenced this pull request Oct 23, 2024
CLI:
 * Added JSON input validation for CLI commands ([#1771](#1771)).

Bundles:
 * Support Git worktrees for `sync` ([#1831](#1831)).
 * Add `bundle summary` to display URLs for deployed resources ([#1731](#1731)).
 * Added a warning when incorrect permissions used for `/Workspace/Shared` bundle root ([#1821](#1821)).
 * Show actionable errors for collaborative deployment scenarios ([#1386](#1386)).
 * Fix path to repository-wide exclude file ([#1837](#1837)).
 * Fixed typo in converting cluster permissions ([#1826](#1826)).
 * Ignore metastore permission error during template generation ([#1819](#1819)).
 * Handle normalization of `dyn.KindTime` into an any type ([#1836](#1836)).
 * Added support for pip options in environment dependencies ([#1842](#1842)).
 * Fix race condition when restarting continuous jobs ([#1849](#1849)).
 * Fix pipeline in default-python template not working for certain workspaces ([#1854](#1854)).
 * Add "output" flag to the bundle sync command ([#1853](#1853)).

Internal:
 * Move utility functions dealing with IAM to libs/iamutil ([#1820](#1820)).
 * Remove unused `IS_OWNER` constant ([#1823](#1823)).
 * Assert SDK version is consistent in the CLI generation process ([#1814](#1814)).
 * Fixed unmarshalling json input into `interface{}` type ([#1832](#1832)).
 * Fix `TestAccFsMkdirWhenFileExistsAtPath` in isolated Azure environments ([#1833](#1833)).
 * Add behavioral tests for examples from the YAML spec ([#1835](#1835)).
 * Remove Terraform conversion function that's no longer used ([#1840](#1840)).
 * Encode assumptions about the dashboards API in a test ([#1839](#1839)).
 * Add script to make testing of code on branches easier ([#1844](#1844)).

API Changes:
 * Added `databricks disable-legacy-dbfs` command group.

OpenAPI commit cf9c61453990df0f9453670f2fe68e1b128647a2 (2024-10-14)
Dependency updates:
 * Upgrade TF provider to 1.54.0 ([#1852](#1852)).
 * Bump github.com/databricks/databricks-sdk-go from 0.48.0 to 0.49.0 ([#1843](#1843)).
github-merge-queue bot pushed a commit that referenced this pull request Oct 23, 2024
CLI:
* Added JSON input validation for CLI commands
([#1771](#1771)).
* Support Git worktrees for `sync`
([#1831](#1831)).

Bundles:
* Add `bundle summary` to display URLs for deployed resources
([#1731](#1731)).
* Added a warning when incorrect permissions used for
`/Workspace/Shared` bundle root
([#1821](#1821)).
* Show actionable errors for collaborative deployment scenarios
([#1386](#1386)).
* Fix path to repository-wide exclude file
([#1837](#1837)).
* Fixed typo in converting cluster permissions
([#1826](#1826)).
* Ignore metastore permission error during template generation
([#1819](#1819)).
* Handle normalization of `dyn.KindTime` into an any type
([#1836](#1836)).
* Added support for pip options in environment dependencies
([#1842](#1842)).
* Fix race condition when restarting continuous jobs
([#1849](#1849)).
* Fix pipeline in default-python template not working for certain
workspaces ([#1854](#1854)).
* Add "output" flag to the bundle sync command
([#1853](#1853)).

Internal:
* Move utility functions dealing with IAM to libs/iamutil
([#1820](#1820)).
* Remove unused `IS_OWNER` constant
([#1823](#1823)).
* Assert SDK version is consistent in the CLI generation process
([#1814](#1814)).
* Fixed unmarshalling json input into `interface{}` type
([#1832](#1832)).
* Fix `TestAccFsMkdirWhenFileExistsAtPath` in isolated Azure
environments ([#1833](#1833)).
* Add behavioral tests for examples from the YAML spec
([#1835](#1835)).
* Remove Terraform conversion function that's no longer used
([#1840](#1840)).
* Encode assumptions about the dashboards API in a test
([#1839](#1839)).
* Add script to make testing of code on branches easier
([#1844](#1844)).

API Changes:
 * Added `databricks disable-legacy-dbfs` command group.

OpenAPI commit cf9c61453990df0f9453670f2fe68e1b128647a2 (2024-10-14)
Dependency updates:
* Upgrade TF provider to 1.54.0
([#1852](#1852)).
* Bump github.com/databricks/databricks-sdk-go from 0.48.0 to 0.49.0
([#1843](#1843)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Does not support git worktrees
3 participants