Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't return incorrect files when there's a file whose name matches a dir #524

Closed
wants to merge 1 commit into from

Conversation

ghiculescu
Copy link
Contributor

@ghiculescu ghiculescu commented Dec 11, 2020

Fixes a bug in #460

We have a file named cypress.json, and a dir named cypress, in a repo we are watching with listen.

Since upgrading to 3.3, listen will get confused and report that the file cypress/.json was removed. No such file exists.

I'm pretty sure this PR fixes that.

@ghiculescu ghiculescu force-pushed the filename-matching-dir branch 2 times, most recently from 4f9f0d1 to 8897b31 Compare December 12, 2020 23:18
Copy link
Collaborator

@ColinDKelley ColinDKelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ghiculescu
Thanks for filing this fix. I wish I knew the context of this code. Are we only concerned with file patterns that start with rel_path? The old code was, but the new code can match (and remove) rel_path from anywhere in the path.

@@ -64,7 +64,10 @@ def _sub_tree(rel_path)
if path == rel_path
result.merge!(meta)
else
sub_path = path.sub(%r{\A#{rel_path}/?}, '')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I had to stare at it for a long time, but the bug is the ? after the /. That means rel_path could match just a prefix.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it was a bug that the rel_path wasn't escaped before interpolating it into a regexp. All sorts of weird things could happen if rel_path had characters in it that are special in regexps, like ^ $ [ \ ? | . * etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I generally find regexes super hard to grok, so I'm a fan of removing it in favour of more explicit code. That's a matter of personal preference though - I can go back to a regex-based solution if you prefer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you! It is best to avoid regexps in code. My experience with them is that the likelihood of being 100% correct drops quadratically with every punctuation character used. :)

lib/listen/record.rb Outdated Show resolved Hide resolved
@@ -64,7 +64,10 @@ def _sub_tree(rel_path)
if path == rel_path
result.merge!(meta)
else
sub_path = path.sub(%r{\A#{rel_path}/?}, '')
parts = path.split("/")
next unless parts.include?(rel_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, can rel_path ever have separators in it? If not, then this would always have to be true, wouldn't it, since we established a few lines up that it starts with rel_path?

Copy link
Contributor Author

@ghiculescu ghiculescu Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this line, a bunch of (my new) tests fail. I definitely don't know that much about listen internals, but...

Hmm, can rel_path ever have separators in it?

... from what I've seen I think the answer is no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not, then this would always have to be true, wouldn't it, since we established a few lines up that it starts with rel_path?

No, because for example, "appspec.yml" starts with "app", but ["appspec.yml"] doesn't include "app".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh jeez, I missed that too. Can we get rid of all the confusing special cases and simplify down to

    def _sub_tree(rel_path)
      @tree.each_with_object({}) do |(path, meta), result|
        parts = path.split(::File::SEPARATOR)
        if parts.shift == rel_path
          if parts.empty?
            result.merge!(meta)
          else
            result[parts.join(::File::SEPARATOR)] = meta
          end
        end
      end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've moved to this implementation.

parts = path.split("/")
next unless parts.include?(rel_path)

sub_path = (parts - [rel_path]).join("/")
Copy link
Collaborator

@ColinDKelley ColinDKelley Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will remove rel_path from anywhere in the path. Wouldn't that trip up with something like

a/b/c/cypress

(a file that happened to have that same name)
or

a/b/cypress/d

(a nested folder that happened to have that same name)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the join could use File::SEPARATOR too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added another test for this case, but it seems to work fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is this should only return files at #{rel_path}/**/**.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you are correct. I was thinking Array#- worked as a set, but the docs are clear that it must match from the front, and that you need to a Set if you want #- to work like a set.

Copy link
Collaborator

@ColinDKelley ColinDKelley Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I take that back.

irb> ['a', 'b', 'c'] - ['b']
=> ['a', 'c']

You only want to remove from the front, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see what you mean. Yeah i suspect you’re right. Let me try your simpler looking code tomorrow.

@ghiculescu
Copy link
Contributor Author

ghiculescu commented Dec 20, 2020

Hey @ColinDKelley, thanks for the review!

Context is that listen is used by https://github.com/daveallie/entangler, we use it to 2-way sync files between local & remote workspaces. So the examples in my test are real examples from my codebase, where files that don't actually exist were incorrectly being reported as changed.

@ghiculescu ghiculescu force-pushed the filename-matching-dir branch 5 times, most recently from 90f252a to 049796b Compare December 20, 2020 21:22
@ghiculescu
Copy link
Contributor Author

I'm starting to think we've overcomplicated this: #526

@ghiculescu ghiculescu closed this Dec 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants