Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] new syntax for pages #958

Closed
wants to merge 3 commits into from
Closed

Conversation

waldyrious
Copy link
Member

@waldyrious waldyrious commented Jul 18, 2016

I'm opening this to serve as a discussion thread to decide whether we might want to simplify the pages' syntax.

The proposal is inspired on Inkscape's man page (source code), and uses a format that I believe is more readable for humans and machines alike, for several reasons:

  • the main heading is changed from a # prefix to a row of ==== underneath, which makes it stand out more in plain text mode;
  • the current syntax involves bullet (unordered) lists of command descriptions with code samples --the actual commands-- between them (rather than within each list item), which is both visually distracting in most markdown renderers, and semantically incorrect;
  • the bullet list syntax was changed to plain paragraphs, which solves both the semantic issue, and a syntactic issue with some renderers not recognizing the code block if preceded by a list entry (case in point, from PR Switch from `` code blocks to 4 spaces #758)
  • the example code is now indented, so each description/code pair stands out better in the markdown source, and we can easily jump visually from example to example;
  • the adding of indentation triggers markdown's code block formatting, so now we don't have to wrap the examples in backticks (which have been the source of many linting errors)
  • by forgoing the backticks, the examples can be directly copied from the source markdown without the need to remove the backticks manually, which is more cumbersome than e.g. replacing tokens;
  • since the markdown is more readable, clients don't need to be as sophisticated, opening the door to dumb "clients" like the one proposed in Simple shell client shipping with the tldr-pages repo #527.

All in all, I believe this makes the pages be more inline with the markdown philosophy, which is to be human-readable, without the need for a renderer/client. Compare the plaintext source: before, after; and the github-rendered output: before, after.

But before this is discussed, we must come to agreement regarding which clients are officially supported, and what benchmark we'll hold third-party clients to in order to consider them semi-supported (i.e. we'd pledge not to break them without ample prior warning). Such an agreement might result in the clients currently in the tldr-pages umbrella going back to their original authors, or important third-party adopted under the tldr-pages umbrella (e.g. @Ostera's tldr.jsx). Either option would simplify the way the project handles clients, by having a single client management model and a clear standard for clients to match. I'll open the discussion later (probably in a few weeks, as I currently don't have the time to get involved too deeply).

@waldyrious waldyrious added syntax Issues related to the content and markup of the pages. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. labels Jul 18, 2016
@leostera
Copy link
Contributor

I like it. I'd have to put some more thought into it, but it already gives me a good gut-feeling.

@jedahan
Copy link
Contributor

jedahan commented Jul 19, 2016

A similar exploration : #758

@waldyrious
Copy link
Member Author

@jedahan thanks for the backreference! I knew this had been discussed before but didn't find that thread :)

@waldyrious
Copy link
Member Author

For reference, the current syntax was decided way back in the first versions of the project -- see 2ce1178.

@waldyrious
Copy link
Member Author

waldyrious commented Sep 3, 2016

Tangentially relevant: https://github.com/tldr-pages/tldr/pull/19/files (regarding the use of {{}} as token markers, as opposed to the standard format as described in http://docopt.org)

@waldyrious
Copy link
Member Author

waldyrious commented Nov 10, 2016

Pinging @tldr-pages/content: I've made an additional tweak to the proposed format, and highlighted the changes in the opening comment to this PR to make it clearer what the actual changes are (they're pretty minor overall). I've been testing this syntax locally using this shell script, which is based on @sitaramc's proposal in #527 (i.e. it simply cats the contents of the page into the terminal), with some extra substitutions to morph the current pages syntax into the one proposed here. I invite you to give that a try as well so you can get a feel of how readable this new syntax is.

to make the heading stand out more in plain text output
@waldyrious waldyrious force-pushed the waldyrious/alt-syntax branch from 0255e9e to 55cd4f8 Compare November 10, 2016 11:59
@agnivade
Copy link
Member

This needs careful scrutiny. I will take a look this weekend and get back.

@leostera
Copy link
Contributor

Love it. :shipit:

@agnivade
Copy link
Member

LGTM from me too. Nice and simplified.

But a couple of concerns regarding the deployment of it:

  • How are we going to transform the existing pages ? Write a script or something ?
  • The markdown linter needs to be updated too. How should the transition be made ?
  • Obviously, the clients also need to be updated. I am okay with doing it in the node client.

We need to come up with a transition process for this.

@waldyrious
Copy link
Member Author

How are we going to transform the existing pages ? Write a script or something?

That is the only sensible option, yeah.

The markdown linter needs to be updated too. How should the transition be made ?

It somehow doesn't seem to choke on this PR, though, which honestly surprises me. But we need to update it regardless. @Ostera, was it you who suggested updating the linter to a more mainstream language? This could be a good opportunity to do so. The downside is that if that takes some work, this change will be blocked until that's completed, so an alternative would be to adapt the current linter to match the new syntax. Who wants to give that a shot?

Obviously, the clients also need to be updated. I am okay with doing it in the node client.

Yeah, this is the biggest issue. We need to make sure that the change doesn't break at least the node, cpp, python and web clients. Ideally, we will want these (and all others) clients to adapt their syntax highlighting to the new format, but that's not a blocker IMO. Which clients can you guys test to see how they work with the new syntax?

Pinging all client authors listed in the readme and the wiki: @anoopengineer, @lord, @terenceng, @cs1707, @melpa, @lord63, @YellowApple, @Ostera, @dbrgn, @hterkelsen, @kirillseva, @porras, @pranavraja, @raylee, @rilut, @shoichikaji, @gianasista, @hidroh, @freesuraj, @mflint, @felixonmars, @Leandros.

@YellowApple
Copy link
Contributor

tldrb currently doesn't do a whole lot format-wise anyway (it just outputs the plain-text file to STDOUT), so as far as I know this shouldn't have any negative impact there.

@dbrgn
Copy link
Contributor

dbrgn commented Nov 12, 2016

I see a single problem with the new format: The old format can be parsed line-by-line. The prefix of a line specifies what the type is.

The new format breaks with that in the title formatting. A line needs to be formatted differently depending on the line that follows. That requires lookahead in the parser.

My suggestion would be to switch to the new format, but to keep the headline as-is with the leading # character. That also makes sure that the format is still valid Markdown.

Opinions?

dbrgn added a commit to tealdeer-rs/tealdeer that referenced this pull request Nov 12, 2016
Titles are not handled properly yet.

See discussion at tldr-pages/tldr#958
@dbrgn
Copy link
Contributor

dbrgn commented Nov 12, 2016

I just ported the Rust client (tealdeer) to the new format, except for the new title format: tealdeer-rs/tealdeer#32

Would be nice if the lookahead were not required.

@agnivade
Copy link
Member

My suggestion would be to switch to the new format, but to keep the headline as-is with the leading # character. That also makes sure that the format is still valid Markdown.

Its still valid Markdown. Its just an alternative syntax. All proper markdown renderer libraries should support it. Except handling the tokens {{}}, you don't need to write your own parser.

@dbrgn
Copy link
Contributor

dbrgn commented Nov 12, 2016

Its still valid Markdown. Its just an alternative syntax. All proper markdown renderer libraries should support it.

Ah, you're right. They're called "setext headers" in commonmark: http://spec.commonmark.org/0.26/#setext-headings

But it's still much simpler to parse when going with the ATX headings. And adding a custom line based parser is actually simpler than using a markdown library, as it needs much less dependencies and not all markdown libraries allow full customization of rendering.

@Leandros
Copy link
Contributor

Sorry, for being ignorant, I haven't read the whole thread, but what's wrong with the current format? The C client is currently not using a markdown library, or adhering to any markdown spec, but is using a very simple, line-based, parser.
Simply changing the heading would require a look-ahead parser (only one token, but still).

@hidroh
Copy link
Contributor

hidroh commented Nov 12, 2016

My client hidroh/tldroid would not be affected. I did some simple formatting wrt bullet list indentation on the client side, but with no bullet list and indentation already in place it would just simply skip formatting logic so no big deal.

Probably we can merge the change in this PR first to test it out and apply to the rest if all good.

@agnivade
Copy link
Member

We can revert the heading change, no issues. It doesn't hurt much of a readability anyways IMO.

@freesuraj
Copy link
Contributor

Good to go !!

@porras
Copy link
Contributor

porras commented Nov 12, 2016

Thanks for the heads up. tlcr uses a standard Markdown parser so it should be ok but I'll have a look.

@kirillseva
Copy link
Contributor

R client should also be largely unaffected, it just prints the markdown file with minimal formatting. 👍

@agnivade
Copy link
Member

Ok, I see that your linter is part of your main client repo. I was thinking that we separate it and replace the current linter we have with your bash linter.

Trying to "call" your linter can be involved because the existing linter is an npm package and it is installed as a dependency in the pages repo. I am in favor of ditching the existing linter simply because of its complexity and un-maintainability. So I don't want to "switch back" at all.

How do you suggest that we "call" your linter ? I see 2 ways -

  • We include the linter script in the pages repo itself. And just call the script in travis test. Simplest.
  • We have some way to pull your linter as a dependency (either as npm package or sth else) and run it. I don't want to just execute this script https://raw.githubusercontent.com/pepa65/tldr-bash-client/master/tldr-lint ./pages directly in the test.

P.S. - This conversation is getting digressed. I propose we open a new issue to talk about the linter. It is anyways a necessary thing regardless of this switch to new syntax.

@psibi
Copy link
Contributor

psibi commented Jan 13, 2018

@agnivade Can you summarize the things which are left for new syntax to get adopted ? (Sorry for asking it again, but this thread has been very difficult to follow)

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

Haha, I guess you wil want to copy https://raw.githubusercontent.com/pepa65/tldr-bash-client/master/tldr-lint and inspect it by hand just to make sure a malicious actor doesn't somehow change the file in my repo..!

Shall we close here then..?
By the way, the 8 example limit is now enforced. I was wondering, does the workflow depend on certain exit codes for the linter?

@agnivade
Copy link
Member

@pepa65 - Nothing like that :) I just wanted to keep things simple and elegant. Executing a script from a remote github repo did not seem very elegant to me when we could include it in our repo. It's an unforseen dependency which is not very apparent. The file name can change, even raw.githubusercontent.com can change their urls. Then everything breaks.

I was wondering, does the workflow depend on certain exit codes for the linter?

Travis will respond accordingly. If your linter exits with a non-zero exit code, it will fail. Else pass.

@psibi - Sure.

  1. We need to update the linter. (@pepa65 already has a script for that)
  2. We need to convert the pages to the new format using a script (@mflint has a script for that)
  3. We need to update the clients (Most clients are already updated)

@waldyrious
Copy link
Member Author

@pepa65 I'd love if you could submit the linter script yourself to this repo, since then it would keep your authorship in the git history :) of course, if you prefer we can do it ourselves.

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

Wouldn't that mean double maintenance?

I guess if I make changes, I make them in my own repo, and then resubmit here, right?
@waldyrious Do you mean submit to this repo (tldr) or to tldr-lint?

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

@agnivade It looks like you're calling the linter on a directory (./pages), right? Would you like the linter to fail on the first flag?? It is now made/meant to analyze and give full feedback, not just compliance yes/no. I can do that (thus speeding it up for single or multiple files), but I would prefer a commandline-flag for that, like -c/--check.

If only want compliance pass/fails, maybe the grammar-check javascript file in pegjs/pegjs#506 would be more appropriate, it doesn't give feedback, just passes or fails.

@agnivade
Copy link
Member

Wouldn't that mean double maintenance?

Not really. We would want you to move the script from your repo to this repo. So it will exist only this repo.

Would you like the linter to fail on the first flag?? It is now made/meant to analyze and give full feedback, not just compliance yes/no.

Yes, we want full feedback. Because that is what the tldr-bot hooks into and updates the PR with. Just output the results to stdout.

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

Right now it expects to output to a terminal and uses ANSI codes for colors. I would guess this is not desirable? What kind of markup would you prefer?

@Leandros
Copy link
Contributor

Updating the pages will break tldr for hundreds of people. You can't expect hundreds if not thousands of people to uniformally update their clients.

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

It would be best if all clients support both formats for this transition to work. We don't want to alienate users. It isn't hard to support both. The tldr-bash-client does, tealdeer the Rust client does as well, and I would guess the node client as well? A disruptive change for sure, maybe for that reason it should not be done, and just accept the current syntax...

@agnivade
Copy link
Member

@pepa65 - Would you like to continue the conversation on our gitter channel ? This thread is long as is.

And I guess @Leandros is trying to say that even if all the clients are updated, the actual users still need to update their clients. As all the clients will break simultaneously for people who regularly do not update their clients.

For the record, I am against this change. But me and @waldyrious and @sbrl have had a long discussion on gitter on this. They want to go forward with it. And it seems the community is also okay with it. Hence we are doing this.

P.S. - I feel that this thread is about to continue longer. I would recommend everyone to comment only if something is pertaining to this exact issue. For meta-level concerns(like should we do this or not), please use the gitter channel.

@psibi
Copy link
Contributor

psibi commented Jan 13, 2018

It would be best if all clients support both formats for this transition to work.

That would be lot of work and it would take a quite a bit of time for all the other clients to support both format (And I don't want to wait another one year for the new syntax to get approved). And people will just upgrade to a newer version, when things will break.

@Leandros
Copy link
Contributor

And people will just upgrade to a newer version, when things will break.

No. People will stop using it.

@pepa65
Copy link
Contributor

pepa65 commented Jan 13, 2018

Discussion here: https://gitter.im/tldr-pages/tldr

@zlatanvasovic
Copy link
Contributor

@waldyrious Can this branch be deleted?

@pepa65
Copy link
Contributor

pepa65 commented Jan 29, 2020

The format hasn't even been converted yet..!

@zlatanvasovic
Copy link
Contributor

zlatanvasovic commented Jan 29, 2020

I don't have the time to read the whole discussion, but it seems this syntax never got accepted.

@pepa65
Copy link
Contributor

pepa65 commented Jan 29, 2020

It's trivial to convert between the two syntaxes, and the new syntax itself was hardly opposed.

@sbrl
Copy link
Member

sbrl commented Jan 30, 2020

Please don't delete the branch @zdroid. It contains valuable information.....

Edit: Also, I believe the problem was with the unmaintainable linter - not that it didn't get accepted. So in effect this is actually still a possibility - just one that requires a rewrite of the linter as a prerequisite.

@waldyrious
Copy link
Member Author

I don't have the time to read the whole discussion

@zdroid I'll have to be a bit blunt in response to that, but I hope you take these words as constructive criticism: please be respectful of other people's time. If you can't afford to read a long thread, that's totally understandable, but please don't suggest destructive actions (deleting branches, closing issues, etc.) without gathering the proper context first and making it clear up-front why you're suggesting that action. Feel free to chat on Gitter if you need context, as that doesn't make a long thread even longer.

It bears repeating what I recently said elsewhere: we don't have shortage of storage space (for branches, issues, files, and so on.), and old discussions won't pollute the list of active issues if they aren't unnecessarily bumped. Hope this makes sense to you. 🙏

@zlatanvasovic
Copy link
Contributor

@walydirious I didn't mean to delete it at any time, unless you directly approved it. That's why I asked you, of course.
The storage is not a problem, but managability of the project is. You can see many projects with hundreds of branches, 90% of them being stale and thus making the other 10% harder to find and manage. Also, having those hundreds of branches means not having decided on hundreds of issues, as those are practically open forever. Deleting stale branches is just like cleaning up the house, nothing else.

@waldyrious
Copy link
Member Author

I understand your points, and would like to discuss this further with you to make sure we are aligned as a project, but this is not the place for that dscussion. I'll reach out on Gitter when I get the chance.

@sbrl
Copy link
Member

sbrl commented Feb 5, 2020

In GitHub's web interface, it orders branches by activeness. So the stale ones automatically fall to the bottom of the list anyway :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. mass changes Changes that affect multiple pages. syntax Issues related to the content and markup of the pages. waiting Issues/PRs with Pending response by the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.