Add several more Roff extensions #4309

Closed

jordemort

To go along with github/markup#1196 this PR adds several more extensions to the Roff language.

Description

I surveyed the manpages in Ubuntu Cosmic, Debian Stretch, CentOS 7, and the OpenIndiana Hipster Live DVD. Here are the extensions used, by count, aggregated by unique filename: https://gist.github.com/c69fa7739d6e4731e90d5749c887f29b

I then searched GitHub for each of the extensions that had more than 100 uses. This PR adds the ones that, based on that search:

  • Appear to be in wide use on GitHub
  • Appear to primarily be used for manpages, as opposed to some other format
  • Appear in more than one or two repositories

They went through a further round of winnowing as I was writing up this PR and deciding if I was prepared to defend the addition of each extension.

This ended up excluding a lot of the nonstandard ones from OpenIndiana, which only appear in one or two mirrors on GitHub, along with a lot of assorted exotica. Those will have to wait until @Alhadis adds a shiny new smart Roff strategy.

Collaborator

@Alhadis Alhadis left a comment

LGTM. Covering every in-the-wild extension used by Roff is as daunting as covering every XML-based file format, which is why we seriously need a similarly dedicated strategy.

Contributor

@pchaigno pchaigno left a comment

Thanks @jordemort for the pull request!

I've checked the usage of the new file extensions, and several are well below our current threshold for in-the-wild usage of hundreds of repositories. I'm not too worried about conflicts with other languages; even if such conflicts arise, Roff looks fairly easy to distinguish from other languages. Nevertheless, I'm not very comfortable accepting some of these file extensions (those below 100 repositories), as they will make it much harder for us to reject other candidates for support in the future: it's difficult to stay credible when rejecting a file extension with 80 repositories if you accepted one with 20 repositories a month ago...

For those extensions, maybe working on a dedicated strategy as @Alhadis suggested would be a better approach. (With a dedicated strategy, we don't need to add new file extensions.)

Alternatively, if you believe their usage will grow over time, I can add them to #4219, and I'll add support for them myself when they reach the threshold.

@@ -4091,14 +4091,22 @@ Roff:
color: "#ecdebe"
extensions:
- ".roff"
- ".0p"
Contributor

I counted only 19 users with 19 repositories using this file extension. That's very low considering our usual threshold of hundreds of repositories. That might be less of an issue if we consider .h.0p instead of .0p (because it's very unlikely we ever get a conflicting language for that longer extension), but it will only match 13 repositories out of the 19. We will have a hard time rejecting other file extensions if we accept that one.
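To illustrate why the longer `.h.0p` suffix matches fewer files than `.0p`, here is a minimal sketch. The helper `matches_extension?` and the sample paths are hypothetical, chosen only for illustration; this is not how Linguist itself resolves extensions.

```ruby
# Hypothetical helper: does the file name end with the given extension?
def matches_extension?(path, extension)
  File.basename(path).downcase.end_with?(extension.downcase)
end

# Made-up sample paths: two header manpages and one unrelated file.
paths = ["man0p/stdio.h.0p", "man0p/unistd.h.0p", "notes/draft.0p"]

short = paths.select { |path| matches_extension?(path, ".0p") }
long  = paths.select { |path| matches_extension?(path, ".h.0p") }

# The compound extension is a strict subset of the short one.
puts "#{short.size} match .0p, #{long.size} match .h.0p"
```

Any file matched by `.h.0p` is also matched by `.0p`, so the longer suffix can only narrow the set, which is the trade-off described above.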

- ".1"
- ".1in"
- ".1m"
- ".1p"
Contributor

This file extension is used in 170 repositories by 153 users. It's a bit low but I'd say it's okay, because Roff looks fairly easy to distinguish from other languages with a heuristic rule, if needed.

- ".1"
- ".1in"
- ".1m"
- ".1p"
- ".1pm"
Contributor

This file extension is used by only 11 users with 11 repositories. Again, this is very low.

- ".3in"
- ".3m"
- ".3o"
Contributor

This file extension is used by 13 users with 15 repositories.

- ".3in"
- ".3m"
- ".3o"
- ".3p"
- ".3perl"
Contributor

This file extension is used by only 19 users (20 repositories), but perhaps it's less likely to conflict with other languages in the future because of the "perl" suffix?
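As a rough summary of the review comments above, the quoted repository counts can be compared against the 100-repository cut-off mentioned earlier. This is only a sketch: the method name `split_by_threshold` is hypothetical, and the pairing of each count with an extension assumes that every comment refers to the last extension shown in its diff hunk.

```ruby
# Hypothetical helper: split extensions into those at or above the
# threshold and those below it.
def split_by_threshold(counts, threshold: 100)
  counts.partition { |_extension, repos| repos >= threshold }
end

# Repository counts as quoted in the review comments (assumed pairing).
counts = {
  ".0p"    => 19,
  ".1p"    => 170,
  ".1pm"   => 11,
  ".3o"    => 15,
  ".3perl" => 20,
}

accepted, deferred = split_by_threshold(counts)
puts "at or above threshold: #{accepted.map(&:first).join(', ')}"
puts "below threshold:       #{deferred.map(&:first).join(', ')}"
```

Under these assumptions only `.1p` clears the cut-off, which matches the tone of the individual comments.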

@Alhadis
Collaborator

Alhadis commented Nov 4, 2018

> […] maybe working on a dedicated strategy as @Alhadis suggested would be a better approach.

I got this.

@jordemort
Author

@pchaigno

  • .Np extensions - I added these because they're used in https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/man-pages-posix/, and there seemed to be some concern about supporting them in the discussion on github/markup#1196 (Add support for manpages via mandoc). I'm fine with leaving them out in favor of waiting for a better Roff strategy
  • .1pm / .3pm - The .3pm extension is really the one I want added most out of all of these, since it seems to be in wide use by various Perl projects. I waffled on .1pm but it seems weird to add .3pm but not .1pm
  • .3perl - added because it seemed weird to do .3pm but not .3perl - I think .3perl might mostly be a Solarisism, though; it's probably fine to wait
  • .3o - not sure why I added this, it can also probably wait

@Alhadis
Collaborator

Alhadis commented Nov 6, 2018

Okay, I need help. Which do you folks prefer for a manpage-strategy?

  • Highly accurate but complicated and ugly code?
  • Lower accuracy but much clearer and more idiomatic Ruby?

I'm trying to juggle performance, accuracy, and something that makes sense to Ruby programmers, but I realise trying to uphold all three is going to be impossible.

For the record, I'm only concentrating on matching manpages (Roff documents which use man(7) and mdoc(7) macros specifically)...
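For readers unfamiliar with the two macro packages, a minimal sketch of the "lower accuracy but clearer" end of that trade-off might look like the following. It is not the code from this thread or from Linguist: the method name `manpage?` and the 50-line scan window are assumptions. It relies only on the fact that man(7) pages declare their title with `.TH` and that mdoc(7) pages open with a `.Dd` macro followed by `.Dt`.

```ruby
# Hypothetical helper; the name and the 50-line window are assumptions.
def manpage?(source, max_lines: 50)
  man_title  = /^[.']\s*TH\s+\S/  # man(7) title macro: .TH NAME SECTION ...
  mdoc_date  = /^[.']\s*Dd\s+\S/  # mdoc(7) prologue starts with .Dd
  mdoc_title = /^[.']\s*Dt\s+\S/  # ... and is followed by .Dt

  saw_dd = false
  source.each_line.first(max_lines).each do |line|
    # Short-circuit as soon as a conclusive macro is seen.
    return true if man_title.match?(line)
    saw_dd ||= mdoc_date.match?(line)
    return true if saw_dd && mdoc_title.match?(line)
  end
  false
end

puts manpage?(".TH LS 1\n.SH NAME\n")           # true
puts manpage?("# Just a Markdown heading\n")    # false
```

A check this simple will misclassify some files (for example, Roff documents that use other macro packages are deliberately ignored), which is the accuracy cost of keeping the logic readable.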

@pchaigno
Contributor

pchaigno commented Nov 6, 2018

I'll always lean toward the less accurate but clearer solution for heuristics. Ugly code is harder to maintain, and we have to consider the performance impact.

@Alhadis
Collaborator

Alhadis commented Nov 6, 2018

> and we have to consider the performance impact.

Actually, that's a large part of why this looks like a mess. 😀 Everything is hard-wired to short-circuit as soon as possible, leading to a very inside-out-looking flow of logic.

I think I'll open a WIP pull request so we can discuss this a bit more clearly. Just wait until I've finished rewriting the first mess. :D

@stale

stale bot commented Dec 6, 2018

This pull request has been automatically marked as stale because it has not had recent activity, and will be closed if no further activity occurs. If this pull request was overlooked, forgotten, or should remain open for any other reason, please reply here to call attention to it and remove the stale status. Thank you for your contributions.

@stale stale bot added the Stale label Dec 6, 2018
@stale

stale bot commented Dec 22, 2018

This pull request has been automatically closed because it has not had activity in a long time. Please feel free to reopen it or create a new issue.

@stale stale bot closed this Dec 22, 2018
Alhadis added a commit that referenced this pull request Feb 24, 2019
Alhadis added a commit that referenced this pull request Aug 12, 2019
* Add strategy to identify Roff man pages

References: #4258, #4309, #4317

* Remove `# coding: utf-8` junk injected by accident
@Alhadis Alhadis deleted the more-roff branch August 12, 2019 12:11
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024