Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define spec for officially sanctioned clients #1065

Closed
waldyrious opened this issue Sep 16, 2016 · 64 comments
Closed

Define spec for officially sanctioned clients #1065

waldyrious opened this issue Sep 16, 2016 · 64 comments
Labels
architecture Organization of the pages per language, platform, etc. clients Issues pertaining to a particular client or the clients as whole. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. documentation Issues/PRs modifying the documentation.

Comments

@waldyrious
Copy link
Member

waldyrious commented Sep 16, 2016

Background:

@waldyrious waldyrious added documentation Issues/PRs modifying the documentation. architecture Organization of the pages per language, platform, etc. clients Issues pertaining to a particular client or the clients as whole. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. labels Sep 16, 2016
@agnivade agnivade changed the title Define spec for officialy sanctioned clients Define spec for officially sanctioned clients Sep 16, 2016
@agnivade
Copy link
Member

We should close #1010 and #1000 and make it part of this issue.

Mentioning a comment made by @rubenvereecken for reference -

The client specificity is exactly the reason why I think a completion script should not be bundled with the main repo. People will miss out on client-specific completions because they'll never install those.

We did want to make the listing behavior standard. After all, it's useful for all clients. Never got around to it I think.

The main idea was to enforce a -a|--list-all flag on all clients, by which clients can further implement bash|zsh command completion.

@waldyrious
Copy link
Member Author

Are you saying that every sanctioned client must implement auto-completion? If not (which is my position), I'd say we should connect those issues to this one, but not close them. That way we can track implementation of that feature without blocking this one. A --list-all feature can certainly be part of the minimal spec, though (IMO).

@agnivade
Copy link
Member

Are you saying that every sanctioned client must implement auto-completion?

Not at all. But I was saying if they are to be implemented at all, it should be implemented by clients(because a proper auto-completion should also complete the helper flags and not just commands, and the parent repo has no way to know all the helper flags of all the clients). That's what we concluded at the end of the PR. Therefore, I want a --list-all option to be part of the minimum spec, which will allow a standard way for all clients to implement auto-completion.

@waldyrious
Copy link
Member Author

Ok, in that case those issues should remain open until (1) we define what a sanctioned client is --i.e. this issue--, (2) we identify which clients are (or we can easily help become) sanctioned, and finally (3) we ensure that the sanctioned clients all support auto-completion.

And this issue will make sure that --list-all is part of the official spec. Sounds good?

@agnivade
Copy link
Member

Sure.

@agnivade
Copy link
Member

agnivade commented Oct 3, 2016

Writing down some notes that I collected.

Architecture

  • Clients must have a local cache to query from.
  • Local cache should be stored under .tldr folder inside user's home directory.
  • Fresh data should be downloaded from here - page cache, index.

I think nothing else should be specified regarding how the client wishes to store the pages internally. If it wishes to store the metadata in a RocksDB cache, that should be allowed.

Features

  • -u|--update option to update its local cache
  • -a|--list-all option to list all commands for the current platform.
  • out of box support for command autocompletion for bash atleast.
  • support read from symlinks
  • support reading a json config from a .tldrrc file in the home folder.

This should be a good starting point.

@waldyrious
Copy link
Member Author

waldyrious commented Oct 3, 2016

Good points. Some observations:

  • I'd add that --update should list the new / changed commands
  • we must specify what .tldrrc can contain and what format (i.e. keys, values) can be in it
  • a --version option (c.f. Should support -v argument to check version #78) that uniquely identifies this client (username/project) and its version (optionally the repo link as well?) is a must.

@waldyrious
Copy link
Member Author

According to tldr-pages/tldr-node-client#52, we'll also want the clients to support a --os (optionally --platform?) option, to get the page for a specific system.

@waldyrious
Copy link
Member Author

waldyrious commented Oct 19, 2016

Another feature we should probably also include: auto-updating the pages content.

Quoting from #1056 (comment):

I don't care what it's built in or on as long as I can get this flow:

$ one-step-install tldr
done <-- in under 5 seconds
$ tldr tar
...
# 30 days later
$ tldr tar
(...background fetch of new content)
new page

@pepa65
Copy link
Contributor

pepa65 commented Feb 7, 2017

Super helpful notes, deserving of a more prominent place!
Following Ray Lee, for the cache I first look if ~/.config exists, and then put a tldr (no dot) directory there, else it goes to ~/.tldr and I think that's a good practice, but I'm happy to change it. Maybe it's even desirable that various clients share the same cache? (On my system: node client; cpp-client uses ~/.tldrc)

Ray Lee's client fetches single pages that aren't cached yet or which have expired, but I haven't seen another bash client that uses the zip file.

I think requiring a json config file is not a nice thing to implement in a bash client. Great if your language has easy libraries for that...

Adding autocomplete for bash is possible, but more intrusive for the installation. It is nice to just need 1 file in the right location, and you're good to go.

@waldyrious
Copy link
Member Author

Following Ray Lee, for the cache I first look if ~/.config exists, and then put a tldr (no dot) directory there, else it goes to ~/.tldr and I think that's a good practice, but I'm happy to change it.

What do you think about #876?

Maybe it's even desirable that various clients share the same cache? (On my system: node client; cpp-client uses ~/.tldrc)

In that case we would need to include that in the client spec. I don't see why it wouldn't work, but I'd prefer if other client authors commented here.

I haven't seen another bash client that uses the zip file

Related previous discussions: #336, #343, #718. I think we should shorten the archive formats we officially provide for clients as much as possible, to encourage harmonization and reduce endpoints we must keep working.

Personally I'd prefer if all clients fetched from the git repo, but it's understandable that git can't be expected to be always available. Still, I wonder if there's a way we could use github's auto-archive feature (https://github.com/tldr-pages/tldr/archive/master.zip) for that effect, rather than a custom-made zip file. This should be workable since this repo consists almost exclusively of pages content. We can even consider how to migrate non-pages content elsewhere to avoid polluting the zip file.

I think requiring a json config file is not a nice thing to implement in a bash client. Great if your language has easy libraries for that...

What would you suggest? We must use a standardized format for the config file. Markdown or yaml are the most lightweight ones that come to mind, which don't require a sophisticated parser, but there could be options I'm not considering.

Adding autocomplete for bash is possible, but more intrusive for the installation. It is nice to just need 1 file in the right location, and you're good to go.

Yeah, I believe we're all in agreement that this should be a recommendation, but not a requirement.

@pepa65
Copy link
Contributor

pepa65 commented Mar 12, 2017

It would be good if there is a shared location among all installed clients, as the data is the same. (Just how expiration is handled might be different.) Because the pages are data, it should then use $XDG_DATA_HOME (or when not set or empty: $HOME/.local/share). I'm happy to change this.

The most important consideration when using Github for pages retrieval is availability, because of Github's download rate limitations on certain locations but not others. It seems the assets directory on http://tldr-pages.github.io is a good place.

Using YAML is way easier to handle without libraries.

@waldyrious
Copy link
Member Author

waldyrious commented Mar 12, 2017

I'm wondering if, instead of requiring a full content dump download, we could implement a partial update "service" that all clients could use if they don't want to use git -- specifically, a regularly scheduled "release" archive (say once a week) which would be built automatically and placed in the assets directory of http://tldr-pages.github.io, containing all the pages changed in master in that period.

Clients could then download that archive, and extract the pages to the cache folder. The archive could even include a log file with all the commit messages from that week, which the client could present to the user to show what's new.

This is just off the top of my head -- do you guys think it's workable, or would it be too complex to implement (either in the client or the server side)?

@pepa65
Copy link
Contributor

pepa65 commented Mar 12, 2017

I think the overhead on the server is almost more if every non-existent/too-old page has to be downloaded separately, since we're looking at a roughly 100kB download for the whole package -- which comes with a consistent & up-to-date index.json.

@agnivade
Copy link
Member

@waldyrious - I think its a bit too complicated.

@jasonkarns
Copy link

Throwing my vote in for adhering to the XDG spec for where to place data.

The TLDR config file should be under $XDG_CONFIG_HOME/tldr (so probably named $XDG_CONFIG_HOME/tldr/config.json or some such.

It can be argued that the page data is "data" and be placed under $XDG_DATA_HOME/tldr. However, it's already been referred to as a "cache" in this thread a number of times. And the app would work if the data were wiped (b/c it just pulls fresh when necessary). Thus, IMO, the data qualifies as true cache data and would thusly live under $XDG_CACHE_HOME/tldr. But I think this is a minor point.

The major sticking point, IMO, is that no client should be considered "official" if it doesn't use XDG-spec env vars for storing the config/data/cache.

@pepa65
Copy link
Contributor

pepa65 commented Mar 13, 2017

I understand the case for $XDG_CACHE_HOME/tldr. I prefer to see the pages as data, and I don't expect that many will change that often once they have been added and amended during a period of settling. I've also found that downloading the whole 100-200 kB archive is really quick, and it comes with a matching index.json. Doing that once a month or so is not too bandwidth intensive. I think storing the pages at $XDG_DATA_HOME/tldr is more proper. But I will follow the mandates when they come.

@jasonkarns
Copy link

@pepa65 generally agree. I think the pages do feel more like data than cache and I might switch my vote on that one.

  • for cache: the app runs without them (since it will refetch/repopulate the cache as necessary)
  • for data: pages change rarely
  • for data: pages are certainly not "nonessential"

One thing about leveraging cache is that it virtually eliminates the need to build a feature for wiping/re-fetching. (Since the xdg_cache_home is generally expected to be wiped occasionally)

But given that those pages are preferred to be long-lived, (and could conceivably even be delivered as part of the tool's initial installation) makes them fit properly as data.

XDG_DATA_HOME for me, now!

@jedahan
Copy link
Contributor

jedahan commented Dec 18, 2018

I'd prefer any specification to only describe the interface between the user and the output, and not any implementation.

In this spirit I forked the current proposal to remove whatever prescriptions on implementation I could while keeping the spirit of providing a consistent set of expectations and high quality of support:

I also added some of the suggestions in this thread that support that spirit.

https://ybin.me/p/127d5bfb038aa09e#CGtXFguDi0X+cA2u6/ruEaCwzqIm5xu2lyPlP0Bu/GY=

@agnivade
Copy link
Member

A specification is a good to have because it gives guidance for new client authors on how to go about creating a client. Or what are some of the features a client should have.

@jedahan
Copy link
Contributor

jedahan commented Dec 18, 2018

I agree that examples are good to have, but I'd rather avoid listing all features every client has done in a specification, and avoid restricting clients to only one way of implementing things.

A good example is specifying how files should be stored. What if my client was much better with an internal db? Or was meant to be accessed offline? It would be a shame if we couldn't link to them as 'an officially sanctioned client' just because they implement things differently but are otherwise high quality and behaved consistently with peoples expectations.

When I read a specification with so many MAYs, it just feels like a laundry list of all the features any client has done, and I feel obligated to adhere to all the optional stuff or wonder if they are useful.

If we can specify a minimum set of unambiguous design features that make a quality client, it helps encourage new clients to be made, possibly with better ideas than we have right now.

@sbrl
Copy link
Member

sbrl commented Dec 18, 2018

Thanks for the feedback everyone! I've created an updated version of the spec. Here's a link: https://ybin.me/p/dddac3fe0177c4dd#+fRQMkWH7Nuv2geAhX96O8nND2IqzbnICRWDS9+cvc4=

I've also responded to individual points raised below:

@MasterOdin:

Add a --single-column flag to show commands in single column

I'm unsure what you mean by a --single-column flag. Could you elaborate please?

Make --os required flag to force lookup page for given OS

Would that be like --os windows, or --os linux? I seem to have forgotten to tackle that!

Personally, I'd prefer something like this:

tldr mkdir # Shows page for default platform
tldr windows/mkdir # Shows page in 'windows' platform

I'd also make the required alias (-a for --list-all, -v for --version, etc.) be included here

Excellent idea.

@agnivade:

I think if the client supports the --os flag, then it would need to be updated to recognize that flag. Also, there may not be any direct mapping from the host platform name to the folder name.

I'm unsure what you mean. Surely if it's --os windows then it shouldn't matter?

If not available, then the client MUST (SHOULD?) check other platforms and choose a page from there instead

I think it should be a SHOULD.

Sure. The only reason I made it a MUST is because people thinking a page doesn't exist when it actually does is a common source of new issues for pages that already exist in the repository.

Cache location - The location should be well-defined. But the modus-operandi should be left to the clients.

So we shouldn't specify the directory to store it in? I see. Never written a 'real' spec before, so thanks for the tip!

@MasterOdin
Copy link
Collaborator

@jedahan

A good example is specifying how files should be stored. What if my client was much better with an internal db? Or was meant to be accessed offline? It would be a shame if we couldn't link to them as 'an officially sanctioned client' just because they implement things differently but are otherwise high quality and behaved consistently with peoples expectations.

Should it be possible to use two clients on the same machine with expectation that both work totally fine regardless of which client I'm using at a given time. Would a user expect that if they were to use the python, node, and cpp implementations available within the tldr org, they'd end up with a unique cache per client? You could probably put this behind a "If your client utilizes a file based cache, it MUST ..." for the case of clients that use internal DBs. I do agree the number of "SHOULD"/"MAY" be minimized and focus on a core of what you want out of implementations.

@sbrl

I'm unsure what you mean by a --single-column flag. Could you elaborate please?

This comes down to I guess how you want --list-all works. The node client for example gives a comma separated list of all commands. To make it easy to use that then in piping or for bash completions, it's necessary for a --single-column command to exist which then shows commands one per line. I'd just like a consistent interface of "get all tldr commands, one per line" possible as it does then make it possible to distribute a single base autocompletion script for the tldr project.

@jedahan
Copy link
Contributor

jedahan commented Dec 18, 2018

The 'one page per line' use case seems the simplest, I'd be happy to see that be the default behavior for --list.

@MasterOdin I would prefer clients that allow you to point them towards a cache dir, than the organization dictate where it should be. But I don't think people making clients should be forced to support configurable cache directories, as that adds friction to development and just plain might not make sense.

By not forcing a particular directory, we are making it easier for multiple clients to coexist, at the cost of on-disk duplication. Though I don't think its a great idea, it also allows for clients to do other things in their own directory.

In addition there are differing opinions on where files should go - for example the XDG spec, or ~Library/Application Support/ vs ~/.config on macOS. I'm sure other operating systems have different places that make way more sense than a single place we define.

I put some files on a separate volume to work better with backup software. If tldr-pages forced it to be in my home directory, I'd need to hack around it with symlinks which might break some clients or my backup software, etc.

Though my philosophy for specifications of this nature are "Does this help specify the minimum criteria for developing a solid tldr client?" and hard-coding a location for where things should live on-disk seems over that bounds.

If we are trying to show examples of good cli design, I could maybe see:

A tldr client MAY/SHOULD allow --cache-location to specify the desired location to read from an on-disk cache

Or something of that nature

@agnivade
Copy link
Member

and avoid restricting clients to only one way of implementing things.

Yes, if you see my earlier comment, this is exactly what I wanted.

I'm unsure what you mean. Surely if it's --os windows then it shouldn't matter?

In the node client, it is --os=<linux|windows|mac>. So the extra flag value needs to be there in the client.

So we shouldn't specify the directory to store it in? I see.

Sorry, I wasn't clear maybe. I was saying we should specify the directory. But the internal implementation of the the cache is managed should be left upto the client.

@sbrl
Copy link
Member

sbrl commented Dec 19, 2018

This comes down to I guess how you want --list-all works. The node client for example gives a comma separated list of all commands. To make it easy to use that then in piping or for bash completions, it's necessary for a --single-column command to exist which then shows commands one per line. I'd just like a consistent interface of "get all tldr commands, one per line" possible as it does then make it possible to distribute a single base autocompletion script for the tldr project.

I see. I'm unsure about that comma-separated output.

In the node client, it is --os=<linux|windows|mac>. So the extra flag value needs to be there in the client.

...I see. What do you think of the alternative platform/command syntax?

A tldr client MAY/SHOULD allow --cache-location to specify the desired location to read from an on-disk cache

Or something of that nature

I'm unsure that such an option should be specified via a command-line argument. I suspect that it's better suited to an environment variable, such as TLDR_CACHE_DIRECTORY - since you probably want to keep it the same. Thoughts?

I definitely think we should not specify how the cache should be stored - only where. e.g. it could be stored in a leveldb, or an sqlite database, etc.

@pepa65
Copy link
Contributor

pepa65 commented Dec 19, 2018

The easiest, most accessible and portable way to store the cache is to unpack the archive containing index.json and the platform directories containing the pages. If you stick to POSIX, you should put it in $XDG_DATA_HOME/tldr, where $XDG_DATA_HOME can be user defined, but on Ubuntu it is $HOME/.local/share by default (and $HOME is /home/$USER).

@jedahan
Copy link
Contributor

jedahan commented Dec 19, 2018

tl;dr I think the only thing the spec should recommend regarding how and where files should be stored, is that clients provide a way to specify where files should be stored in TLDR_CACHE_LOCATION and maybe a --cache-location commandline option.


I'm getting confused about the difference people are using the words how and where things are stored, and how to configure where things should be stored. Here's my understanding:

  • how to store data - flat files or a db
  • where to store data - path to store flat files or a db
  • how to configure where to store data - commandline flag and/or environment variable

Now some thoughts on what the spec should say on these things

  • how to store data: should not be prescribed by the spec, as it needlessly restricts clients.
  • where to store data: should not be prescribed by the spec, because the main benefit of two clients having the same place to store data is if how to store data is prescribed, and that client authors have the best context of where to store data by default.
    • My hope is to avoid doubling the size of the spec to quantify every possible path of configuring where directories are by simply recommending that clients conform to whatever is most natural for the platform it is on, and just provide a straightforward escape hatch in allowing that directory to be configured.
  • how to configure where to store data should be recommended by the spec so that people can switch clients without things breaking

What I expect will be the most common implementation:

  • clients will keep things in the flat file format
  • clients will default to XDG_DATA_HOME/tldr or HOME/.tldr on linux, ~/Library/Application Support/tldr or HOME/.tldr on macOS, and %RoamingData%/tldr or whatever new jazz exists on windows.
  • clients will accept TLDR_CACHE_LOCATION, and maybe --cache-location, which works on all three platforms and is unambiguous but verbose

I think one of the trickiest things about this, is that tldr uses -- for commands, which conflates commands and options which is why --cache-location feels weird and pushes the idea that it should be an environment variable like TLDR_CACHE_LOCATION.

Oh and I chose LOCATION because CACHE{DIRECTORY,PATH, FOLDER} are called different things on different platforms (and even within different file managers, for instance), but I really don't care what its called as long as the spec recommends that it is configurable.

@sbrl
Copy link
Member

sbrl commented Dec 19, 2018

Excellent demystification, @jedahan :D

@agnivade
Copy link
Member

I definitely think we should specify how the cache should be stored - only where.

I think you meant to say "we should not" 😆

...I see. What do you think of the alternative platform/command syntax?

I don't understand. What is the alternative syntax ?

@sbrl
Copy link
Member

sbrl commented Dec 20, 2018

Yeah, I did! I'll edit that comment.

Here's a comparison of the 2 syntaxes:

tldr --os windows mkdir
tldr windows/mkdir

Edit: We should probably also mention that failures MUST result in a non-0 exit code, if appropriate.

@agnivade
Copy link
Member

I see. But in both cases, the client needs to be updated right ? That was my point. Unless you are saying that <platform> should exactly match with the folder name, in which case it does not.

Also, sometimes the default host platform does not exactly match with the folder name (the case without
--os). In those cases, an explicit mapping may be required. My point being, when we add a new platform, the client might be needed to update.

@pepa65
Copy link
Contributor

pepa65 commented Dec 20, 2018

I just implemented platform-agnosticism in the tldr-bash-client (meaning no hardcoded platforms, whatever directories are in the archive will be allowed), and that is no problem. Of course the detection of the current platform is still dependent on the detection code, but new directories in the archive will be usable without the code needing to be updated.

@sbrl
Copy link
Member

sbrl commented Dec 21, 2018

The platform names don't correspond to the folder names?

@pepa65
Copy link
Contributor

pepa65 commented Dec 21, 2018

I would hope the archive will keep having folder names that correspond to the platform names. So far that is the case. (Except for current, which is common plus the detected current platform, if any.)

@agnivade
Copy link
Member

The platform names don't correspond to the folder names?

I mean it depends on the client, the programming language, and where it's being run. Take osx for eg. To detect the host OS, the client might get osx, darwin, macOS and all will be correct. But the folder name is osx.

@sbrl
Copy link
Member

sbrl commented Dec 22, 2018

@agnivade Oh, right! I never knew. I don't actually have a mac machine, so I have no idea as to how it identifies itself :P

So what should we put for the platform specification in the spec? platform/page, or --os platform page? Or both?

@agnivade
Copy link
Member

Whatever is fine. I don't have an opinion on that. I just wanted to say that the client needs to be updated.

@sbrl
Copy link
Member

sbrl commented Jan 12, 2019

It's finally happened! See #2706 for the full specification I've drafted up. Taken me all afternoon, but worth it 😺 🎆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
architecture Organization of the pages per language, platform, etc. clients Issues pertaining to a particular client or the clients as whole. decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. documentation Issues/PRs modifying the documentation.
Projects
Development

No branches or pull requests

8 participants