Standardise STDOUT output for CLI commands #22

emmacasolin · 2022-02-17T23:19:07Z

Specification

Our new commandments for a better, brighter, command line:

Send output to stdout. The primary output for your command should go to stdout. Anything that is machine readable should also go to stdout—this is where piping sends things by default.
Send messaging to stderr. Log messages, errors, and so on should all be sent to stderr. This means that when commands are piped together, these messages are displayed to the user and not fed into the next command.
When we are displaying output text to indicate success, we need to be brief. Feedback is always preferred, even though deviating from UNIX conventions.
The output from stderr can be redirected to /dev/null, but we're going to provide more verbosity options that suppresses all non-essential text (feedback, warnings, etc.).
We should ALWAYS be using camelCase for anything going to stdout. As we prioritize parsability, this needs to be the default. No more manually outputting strings like Name: amy.
Nested structures should always be avoided in dicts formats wherever possible. Instead, the nested structure should be spread into the data.
Streaming with the JSON format should output the objects in JSONL format. To parse it with jq, one needs to use --slurp

STDOUT

Traditionally, we have constructed human readable dictionaries as such:

process.stdout.write(
  binUtils.outputFormatter({
    type: options.format === 'json' ? 'json' : 'list',
    data: [`Root certificate:\t\t${response.getCert()}`],
  }),
);

This is clunky, inconsistent, worse for parsability and more...

In all cases where the output is "semantically" a list, the output should use either json or list. In all other cases, the list option should be replaced with dict. This should be done like this:

process.stdout.write(
  binUtils.outputFormatter({
    type: options.format === 'json' ? 'json' : 'dict',
    data: { rootCertificate: response.getCert() },
  }),
);

The data parameter must be kept as close to the return value of the RPC call possible. This ensures that our output is predictable. Furthermore

STDERR

Log messages, errors, and so on should all be sent to stderr. This means that when commands are piped together, these messages are displayed to the user and not fed into the next command.

Most STDERR messages should either be using the raw or list format. This is because they are intended to be human readable. Under json format, the data should always be { message: "..." }.

process.stderr.write(
  binUtils.outputFormatter({
    type: 'list',
    data: ['Message 1'],
  }),
);

process.stderr.write(
  binUtils.outputFormatter({
    type: 'raw',
    data: 'Message 1',
  }),
);

process.stderr.write(
  binUtils.outputFormatter({
    type: 'json',
    data: { message: "Message 1" },
  }),
);

STDIN

STDIN should always be checked for interactivity before asking prompts. This means that if a Polykey Client session has not already been opened, we should error before asking for a password prompt on a non-interactive terminal.

If input or output is a file, support - to read from stdin or write to stdout. This lets the output of another command be the input of your command and vice versa, without using a temporary file. For example:

cat secret.txt | polykey secret create - vault:secret

Piping

There is currently issues with piping input into Polykey-CLI: #139

TBD...

Additional context

General CLI review issue: Publishing the PK CLI for General Use Polykey#268
https://clig.dev/#output

Tasks

Ensure that the output formatter can correctly format output for all possible format options without modifying content
Replace all usage of the list format option with dict (except in situations where the output is more semantically suited to a list)
Ensure that there are no tabs or spaces in json output (as this needs to be machine readable)
Replace usage of the output formatter with just outputting a single string in cases where this is appropriate
Implement -q argument to suppress non-essential output.
Review commands for new specification changes

Agent - 7/7
Bootstrap - 1/1
Identities - 13/13
Keys - 12/12
Nodes - 6/6
Notifications - 3/3
Secrets - 12/12
Vaults - 12/12

The text was updated successfully, but these errors were encountered:

CMCDragonkai · 2022-02-18T01:53:04Z

I was thinking that for commands that output a stream of data, the human readable formats don't currently have anything separating each "unit" of output. That is, for dict, we might see something like:

pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g
pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g
pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g

Which represents 3 units of output.

In JSON, this is distinguished simply by the newline:

{"pid":3497062,"nodeId":"vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g"}
{"pid":3497062,"nodeId":"vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g"}
{"pid":3497062,"nodeId":"vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g"}

So for the human readable output, it may be a good idea to have a double line separation between each unit. The end result should look like:

pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g

pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g

pid	3497062
nodeId	vng8qsa39osoqrj3o1ftnbgig80drej2on6fok35jvsm5457up72g

And this would be similar for other kinds of outputs like list:

a
b
c

a
b
c

a
b
c

But to do this, you'd have to add \n after each process.stdout.write, but only when it's not json output, since json output now already has a \n appended afterwards in the output formatter, and double lines is not valid for jsonlines format convention.

Something like this:

for (const _ of [1,2,3,4]) {
  process.stdout.write(binUtils.outputFormatter({ ... }));
  if (options.format !== 'json') process.stdout.write('\n');
}

To make this more robust, we may add a new output formatter async generator (consumer), that takes input and makes these decisions automatically rather than having to put it in each command.

CMCDragonkai · 2022-10-17T08:02:01Z

According to MatrixAI/Polykey#446 (comment) our dictionary formatter doesn't support recursive dictionaries.

I think we need to special case handling of object values, that is if they don't have a proper toString format, then we should be providing a recursive outputting of that object.

It could work through simply calling the outputFormatter again on that.

CMCDragonkai · 2022-10-17T08:11:14Z

Actually to handle the additional indentation, one has to keep track of how many levels of indentation required as a prefix to each line. This may require some additional bookkeeping. A sort of indentation counter.

However if the object has a custom toString defined, we just use that. See: https://stackoverflow.com/a/47903498/582917 for the distinction. That way if a custom object has a toString() then it can be used directly.

So generally we would be sending POJOs to the output formatter.

CMCDragonkai · 2023-07-10T14:09:44Z

This issue is to be moved to Polykey-CLI.

addievo · 2023-10-26T06:42:58Z

table format needs an update for different types of values

For instance for this value:

NodeId vq480ofld7vrol0ura1viif9474pb7ptrqm9s8ld23vn5t92oif4g, Address 127.0.0.1:55551, bucketIndex 249

The table output, outputs like this:

Private Zenhub Image

Which is not ideal.

CMCDragonkai · 2023-10-26T20:05:50Z

Prefer to upload images to GH, not zenhub, otherwise the uploads could break.

You should see that table output should always take an array of dictionaries. Then the initial keys of the first dictionary should be used as the columns.

It should be tab separated as per CLI standardisation between each cell.

CMCDragonkai · 2023-10-26T20:11:40Z

There is a bit of a problem with maintaining padding though, if you have to loop over all records to find out the maximum size of each value, which is a bit annoying.

Basically I would want the CLI output to be a valid TSV (tab separated values).

However I'm not sure about tabs and newlines within the data field.

See: http://dataprotocols.org/linear-tsv/ and https://github.com/eBay/tsv-utils/blob/master/docs/comparing-tsv-and-csv.md

CMCDragonkai · 2023-10-26T20:14:09Z

I think a quoted version of TSV would be suitable. If the field ends up having a tab or newline, it would be wrapped in quotes and printed out with \t or \n. Similar to how the values were printed in the current agent status.

CMCDragonkai · 2023-10-26T20:15:26Z

Imagine:

first	second	third
"abc\t\nfoo"	123	lol

Of course you can see that even using tabs is misaligned. Usually the CLI tool column is supposed to be piped to understand it.

CMCDragonkai · 2023-10-26T20:17:52Z

You can do a 1 iteration, scanning over all values and keeping a max-counter for the longest field in each column, then apply appropriate space padding whenever you're outputting them in the CLI.

That should be fast enough, printing anything in table format then becomes a O(2n) process.

CMCDragonkai · 2023-10-26T20:18:36Z

For streaming outputs you can keep a running max length and adjust accordingly for all subsequent new data.

addievo · 2023-10-31T17:51:19Z

@CMCDragonkai the issue with stream based padding, i.e. updating counter for every item in stream is that, that would have to be implemented on a case-to-case basis, since the data is fed into output formatter after it has been all collected and added to an array.

CMCDragonkai · 2023-10-31T19:06:32Z

I don't believe that is true.

addievo · 2023-10-31T22:59:32Z

table has been recently modified in #45

To be TSV compatible, allow custom options, like custom handlers, and optional numbered rows. Also, table is now O(2n) for streams.

CMCDragonkai · 2023-11-08T19:00:36Z

Now that #50 got merged, @amydevs can you review the OP spec of this issue and report on how it has been closed, and what has been updated?

CMCDragonkai · 2023-11-09T17:10:46Z

This is the original issue for standardising the design of the CLI stdout.

See my notes here for the recent changes: #44 (comment)

There's quite a few important points there, that requires updating the OP spec here.

CMCDragonkai · 2023-11-09T17:11:38Z

PK commands are likely to be combined in various ways in the shell. This means things like cmd $(pk ...), cmd <(pk ...), pk ... | cmd, cmd | pk ..., pk ... > ./blah, pk ... <<<blahblah. This means at any point in time where this occurs, we don't actually want to quote things unless the expected output in the case of dict and table fundamentally requires disambiguation between the usage of \n and \t formatting characters.
There is only 3 output formats that make use of formatting characters, and that's list, dict and table. In all 3 cases, they \n, and for dict and table they use \t. Our table format should use the Linear TSV specification: http://dataprotocols.org/linear-tsv/. The dict format however does not an equivalent, and we just use \t to separate between keys and values, and to pad the keys, and each line is then terminated with a \n. Quotation is needed for the keys and values if they contain control or non-printable characters.
Beware of unicode non-printable characters, they tend to appear like \u134908. However a question must be answered here, how do we deal with these? In some cases it make sense to always print them with the \u quotation, otherwise they are not printable in any sense.
List output only uses \n as delimiter. What if the list value itself has a \n internally? Then it likely needs to be quoted too. We could use "\n" in that scenario. Consider that when printing out the list of certificates, the certificate PEM string itself may already have \n in it. When printing, do we print one line at a time for each certificate? It would look strange. Do we just print the \n and thus realise that on the input side, there's no guarantee that 1 line is 1 field.

The main issue I'm wrestling with here is that when quotation is used, it impacts piping, composition, as in it complicates it requiring the input side to realise how to deal with the quoted form.

I think the main problem is that we have a CLI with a STDOUT that could be going to the terminal - an interactive thing, or to non-interactive outputs.

We can actually detect if the output fd is interactive or non-interactive. See: https://github.com/sindresorhus/is-interactive and we can make decisions here.

CMCDragonkai · 2023-11-09T17:13:19Z

Another thing is that when the format is changed to JSON, then the expected output is 1 JSON object or multiple JSON objects?

See: http://dataprotocols.org/ndjson/

I think: list, table all mean multiple json objects.

Whereas dict is a single json object.

In what sense do some list outputs translate to a JSON output? Needs to be investigated.

CMCDragonkai · 2023-11-09T22:39:13Z

So the output formatter can be applied multiple times, so it is possible to have multiple tables.

amydevs · 2024-02-22T00:54:35Z

There are also some intermittent failures of an output formatter that was updated recently. You can look into that as well.

@tegefaulkes Was there a specific command that it failed on?

tegefaulkes · 2024-02-22T01:00:15Z

It may have been a test in tests/utils.test.ts. It will fail sometimes if you keep running the tests.

amydevs · 2024-02-22T01:28:45Z

@tegefaulkes I've pushed a commit to staging that fixes it.

The error happened on seed 1318872640 for bin/utils › encodeEscaped should encode all escapable characters.

The reason was that \ characters needed to be encoded as \\

amydevs · 2024-02-22T02:33:37Z

We also need to make standardize whether commands that either succeed or fail, should have an output at all. And if they do have an output, what the output should be if the json output format is selected. An example is the identities allow command, where no output is given on success. The identities invite command on the other hand, has a message that is displayed on success.

I am in favor of having feedback on the success of a command, but there will need to be discussion on the standardization of the JSON output.

CMCDragonkai · 2024-02-22T03:35:03Z

Usually status reports for to stderr even on success. This is allow automation of the stdout. That's to separate general messages versus an actual output of the command.

CMCDragonkai · 2024-02-22T03:36:20Z

So yes we can definitely have stderr report useful messages. However right now it is only used for exception formatting. But you could create one as a general reporting for stderr. And yes we should align that with error reporting formatting.

amydevs · 2024-02-22T04:38:10Z

I also think we should more definitively decide whether our CLI is non-interactive, interactive, or both. We should also have a clear spec for this. This also especially relates to #22.

The AWS cli, being a industry standard example of an non-interactive CLI, will not output anything on successful commands. This is inline with any other CLI tool. If I were to run touch text.txt, there would be no output. Authentication is done by setting envvars/configs/running cli commands that set said config. The expected flow, is that the user gets an authentication token from the service, and copies it in. Therefore, we should not only consider if our command is interactive in the sense of whether the environment is an interactive shell, but also whether interactivity is happening.

The GitLab cli, is completely interactive. Every command has feedback, even things like animated loading spinners. This is much like using apt over apt-get.

The Pulumi cli has is able to do both, so that it can both be used as an interactive CLI, and also as non-interactive for CI purposes. An example of authenticating using both CLI modes:

CMCDragonkai · 2024-02-22T05:09:22Z

If the feedback is on stderr, then I think the distinction isn't necessary.

CMCDragonkai · 2024-02-22T05:10:08Z

So the idea is to keep it non-interactive as per awscli, but then provide feedback on stderr for both success and fail. Then we use the -v verbose flags to affect client side reporting.

CMCDragonkai · 2024-02-22T05:11:38Z

Meaning -v gets used for both controlling the agent reporting level when we do agent start, but for other commands affects the client reporting level.

Of course default level starts at info. We don't have a silent switch to go down 1 level. Could be done with an additional flag.

amydevs · 2024-02-22T05:31:53Z

So the idea is to keep it non-interactive as per awscli, but then provide feedback on stderr for both success and fail. Then we use the -v verbose flags to affect client side reporting.

sounds good. So all feedback should always be kept as stderr?

amydevs · 2024-02-26T03:16:40Z

Reference for CLI Guidelines:

https://clig.dev/

amydevs · 2024-02-29T02:02:56Z

help should never be written to stderr:

https://news.ycombinator.com/item?id=37682859

amydevs · 2024-02-29T02:30:04Z

there needs more discussion on the output of identity get command. At the moment, it's a bit confusing what each line refers to, as the output omits the keys of each value. I think we should use a dict format instead and include the keys.

amydevs · 2024-02-29T02:30:10Z

@tegefaulkes ^

amydevs · 2024-02-29T02:35:40Z

nevermind, i'm changing it over, because i realised that that is what identities search does at the moment, they should be consistent

amydevs · 2024-02-29T02:41:04Z

actually there does need to be discussion.

For long running commands under JSON format, like identities search, the output will be lines of json objects like {}\n{}\n{}.... However, for identities get, since it is not long running, and the value is resolved immediately, we could have the output of the identies be just an array, like [{}, {}, {}, etc]. For long running, should we manually add brackets around the output to have an array of objects? It feels like a bit of a hack, but i'm not sure what would be the best solution. @CMCDragonkai @tegefaulkes

tegefaulkes · 2024-02-29T02:49:52Z

We use a stream json parser for js-rpc. I think it can be configured to work both ways where it could parse [{},{},...] OR {},{},.... So I think both can be valid.I'm not sure which is preferred however.

CMCDragonkai · 2024-02-29T05:24:19Z

Json list format is literally newline separated JSON. Are you talking about human format? Or json format? If json format, then yes array output may be expected for batch commands. Stream commands use json list format with newline separation.

CMCDragonkai · 2024-02-29T05:26:36Z

help should never be written to stderr:

https://news.ycombinator.com/item?id=37682859

I like the third comment down. But not sure if commander works that way.

tegefaulkes · 2024-02-29T06:27:37Z

We should look into how we format and add information to the help text as well. There is a commander function addHelpText('after', 'text') but it's pretty simple. Formatting will have to be done ourselves and it's subject to word wrapping in the terminal. So it can be a little hard to read some times.

amydevs · 2024-03-14T04:12:59Z

we have commands that will output multiple objects separated by new lines as such {}\n{}\n. jq has a hard time parsing this. I think we should standardize all our streamed outputs to be finalised as [{},{}] so that jq would be able to parse them without rearranging them too much

amydevs · 2024-03-14T05:33:39Z

function outputStreamTransformJson(): TransformStream<any, string> {
  let firstChunk = true;
  return new TransformStream({
    start(controller) {
      controller.enqueue('[');
    },
    transform(chunk, controller) {
      const output = outputFormatterJson(chunk, { trailingNewLine: false });
      if (firstChunk) {
        controller.enqueue(output);
        firstChunk = false;
      } else {
        controller.enqueue(',' + output);
      }
    },
    flush(controller) {
      controller.enqueue(']\n');
    },
  });
}

I've written a utility function that creates a transformstream for this purpose.

amydevs · 2024-03-14T05:38:41Z

we have commands that will output multiple objects separated by new lines as such {}\n{}\n. jq has a hard time parsing this. I think we should standardize all our streamed outputs to be finalised as [{},{}] so that jq would be able to parse them without rearranging them too much

it turns out, that jq does have a --stream option, but it will output an entirely [key, value] format. This is obviously inconvenient for most users, so most people probably wont use this.

CMCDragonkai · 2024-03-14T16:46:46Z

No this is actually by design. These are called JSONL format. You don't use jq directly for this. Talk to chatgpt about this. Don't change it to array. 13 Mar 2024 23:13:23 Amy! ***@***.***>:

…

we have commands that will output multiple objects separated by new lines as such *{}\n{}\n*. *jq* has a hard time parsing this. I think we should standardize all our streamed outputs to be finalised as *[{},{}]* so that *jq* would be able to parse them without rearranging them too much — Reply to this email directly, view it on GitHub[#22 (comment)], or unsubscribe[https://github.com/notifications/unsubscribe-auth/AAE4OHLPFTN4DRQLPBL37TTYYEPWBAVCNFSM6AAAAAA3ZX2W6SVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOJWGM3DOMRQGU]. You are receiving this because you were mentioned. [Tracking image][https://github.com/notifications/beacon/AAE4OHJIDH5MHDLYNANQ6UTYYEPWBA5CNFSM6AAAAAA3ZX2W6SWGG33NNVSW45C7OR4XAZNMJFZXG5LFINXW23LFNZ2KUY3PNVWWK3TUL5UWJTTW7YSWK.gif]

CMCDragonkai · 2024-03-15T18:46:13Z

@amydevs regarding JSONL - did you review my original comment starting here #22 (comment).

There is a need to have JSONL format - it does make sense for commands that considered "streaming commands" as opposed to batch commands - they do work differently.

Consider the MatrixAI/Polykey#237 as well - that has to factor into streaming commands - so that one can convert to paginated results too.

CMCDragonkai · 2024-03-22T13:54:04Z

Close this if done @amydevs

amydevs · 2024-03-27T23:32:54Z

Close this if done @amydevs

The only part that hasn't been done is the verbosity of feedback messages. I think it's worth it to have a separate issue and PR for it, so I'll close this for now.

CMCDragonkai · 2024-03-27T23:39:10Z

Can you start a new issue for that?

emmacasolin added the development Standard development label Feb 17, 2022

CMCDragonkai added the epic Big issue with multiple subissues label Feb 18, 2022

CMCDragonkai assigned tegefaulkes Jul 10, 2023

CMCDragonkai added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Jul 10, 2023

CMCDragonkai transferred this issue from MatrixAI/Polykey Aug 22, 2023

tegefaulkes mentioned this issue Oct 25, 2023

CLI Beta Launch #40

Closed

CMCDragonkai assigned addievo and unassigned tegefaulkes Oct 26, 2023

addievo mentioned this issue Oct 31, 2023

Feature cli polish #45

Merged

11 tasks

CMCDragonkai assigned amydevs and unassigned addievo Nov 8, 2023

amydevs closed this as completed Mar 27, 2024

Standardise STDOUT output for CLI commands #22

Standardise STDOUT output for CLI commands #22

Comments

emmacasolin commented Feb 17, 2022 • edited by amydevs Loading

Specification

STDOUT

STDERR

STDIN

Piping

Additional context

Tasks

CMCDragonkai commented Feb 18, 2022 • edited Loading

CMCDragonkai commented Oct 17, 2022

CMCDragonkai commented Oct 17, 2022

CMCDragonkai commented Jul 10, 2023

addievo commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

CMCDragonkai commented Oct 26, 2023

addievo commented Oct 31, 2023

CMCDragonkai commented Oct 31, 2023

addievo commented Oct 31, 2023

CMCDragonkai commented Nov 8, 2023

CMCDragonkai commented Nov 9, 2023

CMCDragonkai commented Nov 9, 2023

CMCDragonkai commented Nov 9, 2023

CMCDragonkai commented Nov 9, 2023

amydevs commented Feb 22, 2024

tegefaulkes commented Feb 22, 2024

amydevs commented Feb 22, 2024

amydevs commented Feb 22, 2024

CMCDragonkai commented Feb 22, 2024

CMCDragonkai commented Feb 22, 2024

amydevs commented Feb 22, 2024

CMCDragonkai commented Feb 22, 2024

CMCDragonkai commented Feb 22, 2024

CMCDragonkai commented Feb 22, 2024

amydevs commented Feb 22, 2024 • edited Loading

amydevs commented Feb 26, 2024 • edited Loading

amydevs commented Feb 29, 2024

amydevs commented Feb 29, 2024

amydevs commented Feb 29, 2024

amydevs commented Feb 29, 2024

amydevs commented Feb 29, 2024

tegefaulkes commented Feb 29, 2024

CMCDragonkai commented Feb 29, 2024

CMCDragonkai commented Feb 29, 2024

tegefaulkes commented Feb 29, 2024

amydevs commented Mar 14, 2024

amydevs commented Mar 14, 2024

amydevs commented Mar 14, 2024

CMCDragonkai commented Mar 14, 2024 via email

CMCDragonkai commented Mar 15, 2024

CMCDragonkai commented Mar 22, 2024

amydevs commented Mar 27, 2024

CMCDragonkai commented Mar 27, 2024

emmacasolin commented Feb 17, 2022 •

edited by amydevs

Loading

CMCDragonkai commented Feb 18, 2022 •

edited

Loading

amydevs commented Feb 22, 2024 •

edited

Loading

amydevs commented Feb 26, 2024 •

edited

Loading