Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove case-insensitive dump file names handling logic #231

Closed
fzhinkin opened this issue May 30, 2024 · 16 comments · Fixed by #237
Closed

Remove case-insensitive dump file names handling logic #231

fzhinkin opened this issue May 30, 2024 · 16 comments · Fixed by #237

Comments

@fzhinkin
Copy link
Collaborator

#79 and #84 addressed the issue reported in #76 by looking up .api files in a case-insensitive manner.

To make things work smoothly on the Gradle side, we need to:

  • keep track of individual files instead of a directory containing these files (which is not possible with how we supported case-insensitivity)
  • keep track of directories (for check tasks, it's how we did it previously).

The latter option should work, but it requires specifying input/output directories and a file name for all the tasks. Otherwise, Gradle won't be able to track dependencies between tasks correctly. Also, all tasks consuming .api files from the repo should be updated to support case-insensitive file names (and for Klibs, the majority of tasks need it).

Note that before all the klib-related changes, the check task was using directories as its inputs, while the dump task was using files as its output, so the only way to enforce dependency between these tasks was dependsOn call.

Looking at #76 I'm not sure if macOS's case insensitivity was the root cause. The :apiDump task was based on Sync, which wipes out the destination directory before copying files. Even if the dump had a name with the wrong characters case, rerunning apiDump should have fixed it.

So, I propose removing all the logic for case-insensitive file names from the BCV and using exact file paths as Gradle task inputs/outputs.

@fzhinkin
Copy link
Collaborator Author

@ZacSweers, @joffrey-bion, @qwwdfsad PTAL. You were all involved in the original issue (#76) discussion, so I'd love to hear your thoughts on the topic.

@ZacSweers
Copy link

Definitely in favor of case-insensitivity 👍

@fzhinkin fzhinkin changed the title Remove case-insensitive dump file names logic Remove case-insensitive dump file names handling logic May 30, 2024
@fzhinkin
Copy link
Collaborator Author

@ZacSweers do you have a project that relies on it?

@ZacSweers
Copy link

I don't have any projects that desire case sensitivity. The last issue was that it unexpectedly failed on CI due to CI running linux

@fzhinkin
Copy link
Collaborator Author

Looking at #76 once again. The problem is how git works on case-insensitive file systems:

  • if core.ignorecase (introduced in git v1.5.6) is true (that would be the default), generated projectname.api file will be considered as just an update of ProjectName.api, on systems with case-sensitive FS, that would result in a missing dump file;
  • with core.ignorecase=false, if a dump file not only changed the name, but its content was also updated, git will see a new file, but it will also update the old file, resulting in a two files in the index (and you can't check them out simultaneously on case-insensitive FS), so a git mv call would be needed.

@joffrey-bion
Copy link
Contributor

joffrey-bion commented May 31, 2024

Wouldn't it "just work(TM)" to make the BCV delete the old file in a case-insensitive manner, and recreate it with an exact match of the project name every time?

EDIT: ah sorry that's the case where git would just update the wrongly-cased file in the index instead of renaming it.

@fzhinkin
Copy link
Collaborator Author

fzhinkin commented May 31, 2024

BCV was always deleting the old file, the problem is how git tracks deletion and creation of a new file:

  • with default git repo settings:
$ git init

$ echo "yo" > AAA.txt; git add AAA.txt; git commit -a -m 'added AAA.txt'
$ rm AAA.txt; echo YO > aaa.txt
$ git add .
$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   AAA.txt

$ ls
aaa.txt
  • with core.ignorecase=false
$ git init
$ echo "yo" > AAA.txt; git add AAA.txt; git commit -a -m 'added AAA.txt'
# please take care of case-insentiveness
$ git config core.ignorecase false
$ rm AAA.txt; echo YO > aaa.txt
$ git add .
$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	modified:   AAA.txt
	new file:   aaa.txt

$ echo '🤡'

The only proper way to handle file name case change is git mv:

$ git mv AAA.txt aaa.txt
$ git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	deleted:    AAA.txt
	new file:   aaa.txt

@joffrey-bion
Copy link
Contributor

Yes, understood. Sorry I was too slow to update my comment apparently.

I can't believe git works this way by default. This is quite enraging. Now, is it the responsibility of BCV to deal with it? Maybe, maybe not. If we can make the experience better, why not.

So, based on what you said, even being strict from the BCV side during apiDump doesn't guarantee anything, because people could change the case of their project name between 2 generations, and git will happily save the wrong case.

I guess that leaves us no choice but to be case insensitive on the apiCheck side.

@fzhinkin
Copy link
Collaborator Author

In general, it's nice to provide a better user experience.

However, those who use core.ignorecase false continue having problems (althougt it should be easier to spot the problem).
Also, as Gradle tasks have to use directories as their inputs instead of precise file locations, and also because we store both JVM and Klibs dump in the same folder, every update of one of these dumps will invalidate all dependent tasks, which is not very nice.

@joffrey-bion
Copy link
Contributor

if core.ignorecase (introduced in git v1.5.6) is true (that would be the default)

with default git repo settings

For the record, this is actually not quite correct. Git's default depends on the case-sensitivity of the local OS. So in case-sensitive systems, core.ignorecase will be false by default, which is best.

Reference: https://git-scm.com/docs/git-config#Documentation/git-config.txt-coreignoreCase

In general, it's nice to provide a better user experience

Agreed. Although I would argue that if we get annoying Gradle behavior with task invalidation, it's worth reconsidering what experience is best.

Basically, if apiDump behaves correctly, and people are just committing wrong cases due to git's behavior on their machine, there is little BCV could do about it. The repository's state will be "incorrect" in the sense that the case of the .api file will not match the case of the project. We could indeed decide to be case-insensitive for apiCheck, but it still doesn't solve the broken state of the repo.

@joffrey-bion
Copy link
Contributor

However, those who use core.ignorecase false continue having problems (althougt it should be easier to spot the problem).

Could you please expand on that? If people use core.ignorecase=false, the apiDump task will always generate correct files, and git will commit them in the correct case, right?

@fzhinkin
Copy link
Collaborator Author

For the record, this is actually not quite correct. Git's default depends on the case-sensitivity of the local OS. So in case-sensitive systems, core.ignorecase will be false by default, which is best.

Well, yeah, that's true. I meant how it works on macOs & Windows.

Could you please expand on that? If people use core.ignorecase=false, the apiDump task will always generate correct files, and git will commit them in the correct case, right?

Not quite. Without git mv, git can't detect old file removal but sees only new file creation. So, a user may end up with the two files in repo:

$ git init;
$ git config core.ignorecase false
$ echo 123 > zzz
$ git add zzz; git commit -a -m 'initial'
$ mv zzz ZZZ
$ git add ZZZ; git commit -a -m 'rename'
$ git ls-files
ZZZ
zzz

@joffrey-bion
Copy link
Contributor

joffrey-bion commented May 31, 2024

Not quite. Without git mv, git can't detect old file removal but sees only new file creation. So, a user may end up with the two files in repo

I don't quite get that point. I mean, this would happen in all systems, and with all renames, regardless of whether it's just a case change or not. It's just user error with git commands (just doing git add ZZZ isn't sufficient to register the whole rename, you would need a git add zzz too, or to use git mv in the first place, or just commit this from your IDE which should deal with it properly).

Also, if the user renames their project (which is the case that leads to this kind of stuff), won't they have to deal with this manually for the directory itself and other files named after the project anyway?

@fzhinkin
Copy link
Collaborator Author

fzhinkin commented Jun 3, 2024

I don't quite get that point. I mean, this would happen in all systems, and with all renames, regardless of whether it's just a case change or not. It's just user error with git commands (just doing git add ZZZ isn't sufficient to register the whole rename, you would need a git add zzz too, or to use git mv in the first place, or just commit this from your IDE which should deal with it properly).

Assuming that a user uses an IDE to work with a project, if a file rename is performed using GUI, everything will go through underlying VCS's commands, and the rename will be handled correctly.

But it's not the case for BCV, where a file is renamed during Gradle's command execution.

As a user who renamed a project and rerun apiDump, I'll see the following in IDEA's CSV commit view:

  • if ignorecase is true, there will be no changes in .api files;
  • if ignorecase is false, there will be no changes in an old file, and there will be a new untracked .api file. So nothing would hint me to remove an old file from index.

Also, if the user renames their project (which is the case that leads to this kind of stuff), won't they have to deal with this manually for the directory itself and other files named after the project anyway?

That poses a question: why should BCV handle scenarios when someone forgets to perform some related actions after renaming a project?

@joffrey-bion
Copy link
Contributor

joffrey-bion commented Jun 3, 2024

But it's not the case for BCV, where a file is renamed during Gradle's command execution.

Indeed, and I'm talking about the commit action, not the rename action. For sure, git mv is not an option here because the user doesn't control the rename action (it's done by BCV), but the commit action is definitely something done by the user, via the IDE or the command line. My point was that, if the user does git add only on the new name (and not the old), it's a user error.

As a user who renamed a project and rerun apiDump, I'll see the following in IDEA's CSV commit view
if ignorecase is false, there will be no changes in an old file. So nothing would hint me to remove an old file from index.

If this were the case in the IDE, I believe that should be reported as an IDE bug. But my experiment doesn't show this (at least on my Windows machine). Here is my result:

PS C:\Projects\test-git> git init
Initialized empty Git repository in C:/Projects/test-git/.git/
PS C:\Projects\test-git> git config user.name "Test User"
PS C:\Projects\test-git> git config user.email "test-user@noreply.com"
PS C:\Projects\test-git> git config core.ignorecase false
PS C:\Projects\test-git> echo 123 > zzz
PS C:\Projects\test-git> git add zzz
PS C:\Projects\test-git> git commit -m "initial zzz"
[main (root-commit) 5c35d17] initial zzz
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 zzz
PS C:\Projects\test-git> mv zzz ZZZ
PS C:\Projects\test-git> git status
On branch main
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        deleted:    zzz

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        ZZZ

no changes added to commit (use "git add" and/or "git commit -a")

image

We see both the deleted and the created file, so we're supposed to add both of those and commit both as well. And this works fine. It is even considered a proper rename by git if done in a single commit:

PS C:\Projects\test-git> git add zzz
PS C:\Projects\test-git> git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        deleted:    zzz

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        ZZZ

PS C:\Projects\test-git> git add ZZZ
PS C:\Projects\test-git> git status
On branch main
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        renamed:    zzz -> ZZZ

if ignorecase is true, there will be no changes in .api files;

I was mainly talking about the ignorecase=false case, but yeah in the default case where ignorecase=true, it's a bit annoying, but that's not really a concern of BCV. It's just how git works, and if people can't commit the file, it makes sense to investigate on the git side of things.

Also, if the user renames their project (which is the case that leads to this kind of stuff), won't they have to deal with this manually for the directory itself and other files named after the project anyway?

That poses a question: why should BCV handle scenarios when someone forgets to perform some related actions after renaming a project?

Exactly. To be frank, BCV is not even supposed to know git exists, let alone the fact that git has any feature related to the filename case. If people don't manage to commit the case change, they should probably open issues with git instead.

@fzhinkin
Copy link
Collaborator Author

fzhinkin commented Jun 5, 2024

I'm leaning toward API dump filename case mismatch no longer being tolerable on case-sensitive filesystems.

The decision could always be reconsidered, but if the issue becomes annoying, I'd rather detach dump file names from project names (#233).

fzhinkin added a commit that referenced this issue Jun 5, 2024
Filename case mismatch is no longer tolerated
on case-sensitive filesystems.

Fixes #231
@fzhinkin fzhinkin linked a pull request Jun 5, 2024 that will close this issue
@fzhinkin fzhinkin added this to the klib/0.15.0-Beta.3 milestone Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants