Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ssh): optimize traversing ssh config includes #976

Merged
merged 2 commits into from
Jun 3, 2023

Conversation

Rogach
Copy link
Contributor

@Rogach Rogach commented May 10, 2023

Previously each config file was read with separate sed command to extract Include directives. This can get quite slow if there are a lot of included files (on my machine it was ~0.4s for each TAB keypress).

Instead of reading each individual file, we can put all the files at a current recursion level into a single sed invocation and extract all the directives in one go.

@Rogach Rogach force-pushed the pr/optimize-ssh-includes branch from 5a0b540 to 0853f8c Compare May 10, 2023 11:06
bash_completion Outdated Show resolved Hide resolved
Copy link
Collaborator

@akinomyoga akinomyoga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! These are style and minor changes.

bash_completion Outdated Show resolved Hide resolved
bash_completion Outdated Show resolved Hide resolved
bash_completion Outdated Show resolved Hide resolved
@Rogach Rogach force-pushed the pr/optimize-ssh-includes branch from 0853f8c to 3d89182 Compare May 12, 2023 05:32
@Rogach
Copy link
Contributor Author

Rogach commented May 12, 2023

@akinomyoga Implemented the requested changes.

The option -l (linewise splitting) may be specified to _comp_split

This fails the tests, because single Include may specify multiple included files on the same line (separated by spaces).

@akinomyoga
Copy link
Collaborator

This fails the tests, because single Include may specify multiple included files on the same line (separated by spaces).

Thank you for pointing this out. Hmm, so it means that each element of the resulting array of _comp_split does not necessarily correspond to one Include directive. Then, the variable name might better to be something like include_files rather than include_directives, but it's up to you whether to change it to a better one.

@Rogach Rogach force-pushed the pr/optimize-ssh-includes branch from 3d89182 to c19daf3 Compare May 13, 2023 08:19
@Rogach
Copy link
Contributor Author

Rogach commented May 13, 2023

Renamed the variable.

Copy link
Collaborator

@akinomyoga akinomyoga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for quick responses and updating!

These are cosmetic ones. With these addressed, I'd approve it.

bash_completion Outdated Show resolved Hide resolved
bash_completion Outdated Show resolved Hide resolved
bash_completion Outdated Show resolved Hide resolved
bash_completion Outdated Show resolved Hide resolved
@Rogach Rogach force-pushed the pr/optimize-ssh-includes branch from c19daf3 to 447e925 Compare May 13, 2023 10:19
@Rogach
Copy link
Contributor Author

Rogach commented May 13, 2023

Applied the requested changes.

As per ssh_config man, "Files without absolute paths are
assumed to be in ~/.ssh if included in a user configuration
file or /etc/ssh if included from the system configuration file."

So relative include base stays the same throughout recursion,
even if the system-wide config is included from user's config.

This commit also optimizes traversing ssh config includes.
Previously each config file was read with separate `sed` command
to extract `Include` directives. This can get quite slow if there are
a lot of included files.
Instead of reading each individual file, we can put all the files
at a current recursion level into a single sed invocation and
extract all the directives in one go.
@Rogach Rogach force-pushed the pr/optimize-ssh-includes branch from 447e925 to 4ac86e8 Compare May 13, 2023 11:30
@scop
Copy link
Owner

scop commented May 16, 2023

Looks mostly good to me, added a comment about the "16" to make it less magical.

However, the commit message seems to talk about an actual fix in addition to a performance improvement, whereas the PR title here and the commentary seems to refer only to the perf improvement.

If there's a fix in here, could we add a test case showcasing what was broken before this and fixed now?

Or if there's no fix, could we rephrase the commit message some to clarify that? Also, if there's no actual fix here, the conventional commit type would be better set as perf instead of fix.

@Rogach
Copy link
Contributor Author

Rogach commented Jun 2, 2023

However, the commit message seems to talk about an actual fix in addition to a performance improvement, whereas the PR title here and the commentary seems to refer only to the perf improvement.
If there's a fix in here, could we add a test case showcasing what was broken before this and fixed now?

Story about a fix is hidden in the resolved diffs. Basically the problem we found during discussion was that if a user config file included some config in /etc/ssh/, and that config used relative include to include another config, this last Include must be resolved relative to users ~/.ssh. Previous implementation would have resolved it relatively to /etc/ssh/.

I don't know how to add a test for this, since the scenario involves creating an ssh config file somewhere under /etc/ssh/ and that isn't happening.

Or if there's no fix, could we rephrase the commit message some to clarify that? Also, if there's no actual fix here, the conventional commit type would be better set as perf instead of fix.

Original commit message was about perfomance improvement (that's probably the more interesting part anyway, since the failing scenario seems extremely unlikely). However @akinomyoga suggested that it would be changed to talk about the include fix. Should I change it back?

@akinomyoga
Copy link
Collaborator

There is a fix. At least the commit changes the behavior, so, in my opinion, we should mark it as fix rather than perf.

Actually, the ideal way is to separate the commit into the fix one and the perf one so that we could later track the relevant changes when necessary.

@Rogach
Copy link
Contributor Author

Rogach commented Jun 2, 2023

Actually, the ideal way is to separate the commit into the fix one and the perf one so that we could later track the relevant changes when necessary.

Ideally yes, but in this particular case it will be difficult. There are two possible commit orderings:

  1. "first fix, then perf": in the "fix" commit we'll need to add an additional argument to _included_ssh_config_files, something like top_level_config, to pass around the include base. Then the "perf" commit will be rewriting recurisve version into a loop, removing this additional argument.
  2. "first perf, then fix": here I'm not even sure how to keep the old behavior in the non-recursive version. Something like making another array of include bases? I might be okay with doing it in some other language, but in Bash array handling is will be too difficult for me and I might easily make a lot of mistakes.

I support marking commit as "fix". Or maybe we can mark it as "fix,perf"?

@akinomyoga
Copy link
Collaborator

For me it's optional to separate the commit; I wouldn't request you to separate it. But if you are willing to separate the commit, either option works. If it were me, I think I'd use option 1.

@Rogach
Copy link
Contributor Author

Rogach commented Jun 2, 2023

I really don't want to separate it :)

@scop scop merged commit 523dd5d into scop:master Jun 3, 2023
@scop
Copy link
Owner

scop commented Jun 3, 2023

Ok, merged as is. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants