Standardise STDOUT output for CLI commands #22
Comments
I was thinking that for commands that output a stream of data, the human readable formats don't currently have anything separating each "unit" of output. That is, for
Which represents 3 units of output. In JSON, this is distinguished simply by the newline:
So for the human readable output, it may be a good idea to have a double line separation between each unit. The end result should look like:
And this would be similar for other kinds of outputs like
But to do this, you'd have to add something like this:
To make this more robust, we may add a new output formatter async generator (consumer), that takes input and makes these decisions automatically rather than having to put it in each command. |
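A minimal sketch of such a consumer, assuming each unit arrives already formatted as a string. The name `separateUnits` and its `separator` parameter are hypothetical and not part of the existing code base.

```typescript
// Hypothetical helper: passes formatted units through and inserts a blank
// line between consecutive units, so commands need not do it themselves.
async function* separateUnits(
  units: AsyncIterable<string> | Iterable<string>,
  separator: string = '\n',
): AsyncGenerator<string> {
  let first = true;
  for await (const unit of units) {
    // No separator before the first unit or after the last one.
    if (!first) yield separator;
    first = false;
    // Make sure each unit is newline-terminated before it is emitted.
    yield unit.endsWith('\n') ? unit : unit + '\n';
  }
}
```

A command would wrap its stream of formatted units in `separateUnits(...)` and write every yielded piece to stdout.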
According to MatrixAI/Polykey#446 (comment) our dictionary formatter doesn't support recursive dictionaries. I think we need to special case handling of object values, that is if they don't have a proper It could work through simply calling the |
Actually to handle the additional indentation, one has to keep track of how many levels of indentation required as a prefix to each line. This may require some additional bookkeeping. A sort of indentation counter. However if the object has a custom So generally we would be sending POJOs to the output formatter. |
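The indentation bookkeeping can be sketched as a recursive formatter that carries a level counter. This is an illustration only: `formatDict`, `isPojo`, the tab between key and value, and the two-space indent are all assumptions, not the current formatter.

```typescript
type Pojo = { [key: string]: unknown };

// True only for plain objects; class instances and arrays are excluded.
function isPojo(value: unknown): value is Pojo {
  return (
    typeof value === 'object' &&
    value !== null &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

// Hypothetical recursive dictionary formatter. `level` is the indentation
// counter: every nested POJO is printed one level deeper than its parent.
function formatDict(data: Pojo, level: number = 0, indent: string = '  '): string {
  const prefix = indent.repeat(level);
  let output = '';
  for (const [key, value] of Object.entries(data)) {
    if (isPojo(value)) {
      output += `${prefix}${key}\n`;
      output += formatDict(value, level + 1, indent);
    } else {
      // Anything that is not a plain object is rendered with String().
      output += `${prefix}${key}\t${String(value)}\n`;
    }
  }
  return output;
}
```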
This issue is to be moved to Polykey-CLI. |
table format needs an update for different types of values For instance for this value:
The table output, outputs like this: Which is not ideal. |
Prefer to upload images to GH, not zenhub, otherwise the uploads could break. You should see that table output should always take an array of dictionaries. Then the initial keys of the first dictionary should be used as the columns. It should be tab separated as per CLI standardisation between each cell. |
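As a rough sketch of that rule (the function name `formatTable` and the `includeHeader` flag are hypothetical), the columns come from the first dictionary and cells are joined with tabs:

```typescript
type Row = Record<string, unknown>;

// Hypothetical table formatter: takes an array of dictionaries, uses the
// keys of the first dictionary as columns, and separates cells with tabs.
function formatTable(rows: Row[], includeHeader: boolean = true): string {
  const [first] = rows;
  if (first === undefined) return '';
  const columns = Object.keys(first);
  const lines: string[] = [];
  if (includeHeader) lines.push(columns.join('\t'));
  for (const row of rows) {
    // Keys missing from a row become empty cells; keys that are not in the
    // first dictionary are ignored.
    const cells = columns.map((column) => {
      const value = row[column];
      return value === undefined || value === null ? '' : String(value);
    });
    lines.push(cells.join('\t'));
  }
  return lines.join('\n') + '\n';
}
```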
There is a bit of a problem with maintaining padding though, if you have to loop over all records to find out the maximum size of each value, which is a bit annoying. Basically I would want the CLI output to be a valid TSV (tab separated values). However I'm not sure about tabs and newlines within the data field. See: http://dataprotocols.org/linear-tsv/ and https://github.com/eBay/tsv-utils/blob/master/docs/comparing-tsv-and-csv.md |
I think a quoted version of TSV would be suitable. If the field ends up having a tab or newline, it would be wrapped in quotes and printed out with |
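One possible shape for the quoting rule, as a sketch only. The escape style used here (JSON string escapes via `JSON.stringify`) is an assumption chosen for illustration, and `encodeTsvField` / `encodeTsvRow` are hypothetical names.

```typescript
// Hypothetical quoted-TSV encoder: a field is left untouched unless it
// contains a tab, a newline, a carriage return, or a double quote, in which
// case it is wrapped in quotes with those characters escaped.
function encodeTsvField(value: string): string {
  return /[\t\n\r"]/.test(value) ? JSON.stringify(value) : value;
}

// Encodes one record as a single physical line.
function encodeTsvRow(fields: string[]): string {
  return fields.map(encodeTsvField).join('\t');
}
```

With this scheme every record stays on one line, so line-oriented tools still see one row per line; the consumer has to unquote any field that starts with a double quote.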
Imagine:
Of course you can see that even using tabs is misaligned. Usually the CLI tool |
You can do one pass over all values, keeping a max counter for the longest field in each column, and then apply the appropriate space padding when outputting them in the CLI. That should be fast enough: printing anything in table format then becomes a two-pass, O(2n) process. |
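The two passes might look like the following sketch. `padTable` is a hypothetical helper that assumes the cells have already been converted to strings.

```typescript
// Hypothetical two-pass padder. Pass 1 records the widest cell per column;
// pass 2 pads every cell except the last in each row to that width.
function padTable(rows: string[][], gap: string = '\t'): string[] {
  const widths: number[] = [];
  for (const row of rows) {
    row.forEach((cell, i) => {
      widths[i] = Math.max(widths[i] ?? 0, cell.length);
    });
  }
  return rows.map((row) =>
    row
      .map((cell, i) => (i === row.length - 1 ? cell : cell.padEnd(widths[i] ?? 0)))
      .join(gap),
  );
}
```

Padding with spaces while keeping the tab separator means the output still splits cleanly on tabs, although cell values then carry trailing spaces that a strict TSV consumer would need to trim.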
For streaming outputs you can keep a running max length and adjust accordingly for all subsequent new data. |
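A sketch of the running-max idea, assuming rows arrive one at a time as string arrays. `StreamingPadder` is a hypothetical name.

```typescript
// Hypothetical streaming padder: column widths only ever grow, and each row
// is padded to the widths known at the moment it is printed.
class StreamingPadder {
  private widths: number[] = [];

  public format(row: string[]): string {
    row.forEach((cell, i) => {
      this.widths[i] = Math.max(this.widths[i] ?? 0, cell.length);
    });
    return row
      .map((cell, i) => (i === row.length - 1 ? cell : cell.padEnd(this.widths[i] ?? 0)))
      .join('\t');
  }
}
```

Rows that were already printed are not re-aligned, so alignment only holds from the point where the widest value has been seen.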
@CMCDragonkai the issue with stream-based padding, i.e. updating the counter for every item in the stream, is that it would have to be implemented on a case-by-case basis, since the data is fed into the output formatter only after it has all been collected and added to an array. |
I don't believe that is true. |
To be TSV compatible, allow custom options, like custom handlers, and optional numbered rows. Also, |
This is the original issue for standardising the design of the CLI stdout. See my notes here for the recent changes: #44 (comment) There are quite a few important points there that require updating the OP spec here. |
The main issue I'm wrestling with here is that when quotation is used, it impacts piping, composition, as in it complicates it requiring the input side to realise how to deal with the quoted form. I think the main problem is that we have a CLI with a STDOUT that could be going to the terminal - an interactive thing, or to non-interactive outputs. We can actually detect if the output fd is interactive or non-interactive. See: https://github.com/sindresorhus/is-interactive and we can make decisions here. |
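A small sketch of how that decision could be made without an extra dependency. In Node.js, `process.stdout.isTTY` is `true` when stdout is attached to a terminal and is otherwise unset. The mapping to `'padded'` versus `'tsv'` output is only an assumed policy for illustration, and `chooseOutputStyle` is a hypothetical name.

```typescript
// Minimal shape shared by process.stdout / process.stderr in Node.js.
interface TtyLike {
  isTTY?: boolean;
}

type OutputStyle = 'padded' | 'tsv';

// Hypothetical policy: align columns for a person at a terminal, and emit
// strict tab-separated values when the output is piped or redirected.
function chooseOutputStyle(stream: TtyLike): OutputStyle {
  return stream.isTTY === true ? 'padded' : 'tsv';
}
```

In a command this would be called as `chooseOutputStyle(process.stdout)`.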
Another thing is that when the format is changed to JSON, then the expected output is 1 JSON object or multiple JSON objects? See: http://dataprotocols.org/ndjson/ I think: Whereas In what sense do some |
So the output formatter can be applied multiple times, so it is possible to have multiple tables. |
@tegefaulkes Was there a specific command that it failed on? |
It may have been a test in |
@tegefaulkes I've pushed a commit to The error happened on seed The reason was that |
We also need to standardize whether commands that either succeed or fail should have an output at all. And if they do have an output, what the output should be if the I am in favor of having feedback on the success of a command, but there will need to be discussion on the standardization of the JSON output. |
Usually status reports go to stderr, even on success. This is to allow automation of the stdout. It separates general messages from the actual output of the command. |
So yes we can definitely have stderr report useful messages. However right now it is only used for exception formatting. But you could create one as a general reporting for stderr. And yes we should align that with error reporting formatting. |
I also think we should more definitively decide whether our CLI is non-interactive, interactive, or both. We should also have a clear spec for this. This also especially relates to #22. The AWS CLI, being an industry-standard example of a non-interactive CLI, will not output anything on successful commands. This is in line with any other CLI tool. If I were to run The GitLab CLI is completely interactive. Every command has feedback, even things like animated loading spinners. This is much like using The Pulumi CLI is able to do both, so that it can be used both as an interactive CLI and as a non-interactive one for CI purposes. An example of authenticating using both CLI modes: |
If the feedback is on stderr, then I think the distinction isn't necessary. |
So the idea is to keep it non-interactive as per awscli, but then provide feedback on stderr for both success and fail. Then we use the |
Meaning Of course default level starts at info. We don't have a silent switch to go down 1 level. Could be done with an additional flag. |
sounds good. So all feedback should always be kept as stderr? |
Reference for CLI Guidelines: |
help should never be written to stderr: |
there needs more discussion on the output of |
nevermind, i'm changing it over, because i realised that that is what |
actually there does need to be discussion. For long running commands under JSON format, like |
We use a stream json parser for |
Json list format is literally newline separated JSON. Are you talking about human format? Or json format? If json format, then yes array output may be expected for batch commands. Stream commands use json list format with newline separation. |
I like the third comment down. But not sure if commander works that way. |
We should look into how we format and add information to the help text as well. There is a commander function |
we have commands that will output multiple objects separated by new lines as such |
// Wraps a stream of output chunks into a single JSON array: [a,b,c]\n
function outputStreamTransformJson(): TransformStream<any, string> {
  let firstChunk = true;
  return new TransformStream({
    start(controller) {
      // Open the array before any chunk arrives.
      controller.enqueue('[');
    },
    transform(chunk, controller) {
      const output = outputFormatterJson(chunk, { trailingNewLine: false });
      if (firstChunk) {
        controller.enqueue(output);
        firstChunk = false;
      } else {
        // Every chunk after the first is preceded by a comma.
        controller.enqueue(',' + output);
      }
    },
    flush(controller) {
      // Close the array once the source stream ends.
      controller.enqueue(']\n');
    },
  });
}
I've written a utility function that creates a TransformStream for this purpose. |
it turns out, that |
No this is actually by design. These are called JSONL format. You don't use jq directly for this. Talk to chatgpt about this. Don't change it to array.
13 Mar 2024 23:13:23 Amy! ***@***.***>:
…
we have commands that will output multiple objects separated by new lines as such *{}\n{}\n*. *jq* has a hard time parsing this. I think we should standardize all our streamed outputs to be finalised as *[{},{}]* so that *jq* would be able to parse them without rearranging them too much
|
@amydevs regarding JSONL - did you review my original comment starting here #22 (comment). There is a need to have JSONL format - it does make sense for commands that considered "streaming commands" as opposed to batch commands - they do work differently. Consider the MatrixAI/Polykey#237 as well - that has to factor into streaming commands - so that one can convert to paginated results too. |
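For consumers that do want a single array, a JSONL stream can be converted on the reading side rather than by changing the CLI output. A minimal sketch, with the hypothetical name `parseJsonLines`:

```typescript
// Hypothetical consumer-side helper: parses newline-delimited JSON (JSONL)
// into an array, one parsed value per non-empty line.
function parseJsonLines(text: string): unknown[] {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as unknown);
}
```

On the shell side, `jq --slurp` performs the same collection step, reading the whole stream of values into one array.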
Close this if done @amydevs |
The only part that hasn't been done is the verbosity of feedback messages. I think it's worth it to have a separate issue and PR for it, so I'll close this for now. |
Can you start a new issue for that? |
Specification
Our new commandments for a better, brighter, command line:
/dev/null, but we're going to provide more verbosity options that suppress all non-essential text (feedback, warnings, etc.).
camelCase for anything going to stdout. As we prioritize parsability, this needs to be the default. No more manually outputting strings like Name: amy.
dicts formats wherever possible. Instead, the nested structure should be spread into the data.
jq, one needs to use --slurp
STDOUT
Traditionally, we have constructed human readable dictionaries as such:
This is clunky, inconsistent, worse for parsability and more...
In all cases where the output is "semantically" a list, the output should use either json or list. In all other cases, the list option should be replaced with dict. This should be done like this:
The data parameter must be kept as close to the return value of the RPC call as possible. This ensures that our output is predictable. Furthermore
STDERR
Log messages, errors, and so on should all be sent to stderr. This means that when commands are piped together, these messages are displayed to the user and not fed into the next command.
Most STDERR messages should use either the raw or list format. This is because they are intended to be human readable. Under json format, the data should always be { message: "..." }.
STDIN
STDIN should always be checked for interactivity before asking prompts. This means that if a Polykey Client session has not already been opened, we should error before asking for a password prompt on a non-interactive terminal.
If input or output is a file, support - to read from stdin or write to stdout. This lets the output of another command be the input of your command and vice versa, without using a temporary file. For example:
cat secret.txt | polykey secret create - vault:secret
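The `-` convention can be sketched as a small resolver. To keep the example self-contained the actual streams are injected; `resolveInput` and `InputSources` are hypothetical names, not existing Polykey-CLI code.

```typescript
// The two places input can come from, supplied by the caller.
interface InputSources<T> {
  stdin: T;
  openFile: (path: string) => T;
}

// Hypothetical resolver: a path argument of "-" means "read from stdin";
// any other value is treated as a file path.
function resolveInput<T>(pathArg: string, sources: InputSources<T>): T {
  return pathArg === '-' ? sources.stdin : sources.openFile(pathArg);
}
```

In Node.js the caller would pass `process.stdin` as `stdin` and something like `(p) => fs.createReadStream(p)` as `openFile`.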
Piping
There are currently issues with piping input into Polykey-CLI: #139
TBD...
Additional context
Tasks
list format option with dict (except in situations where the output is more semantically suited to a list)
-q argument to suppress non-essential output.