Add builtin to output to file #3153

myaaaaaaaaa · 2024-07-20T19:57:53Z

I would like to propose a new builtin, say output_file("filename"; "contents"), which copies its input to its output while saving its arguments as a file. It's similar in principle to the debug builtin.

If --sandbox is specified, it will simply output the filename to stderr, essentially acting as a dry run.

Having this builtin would make it less awkward to split json files. See below for some workarounds that are currently required:

Proposed semantics:

# sample script
to_entries[] | output_file(.key; .value) | .key

# stdin
{
	"a.txt": "string\nstring",
	"b/c.txt": "invalid",
	"d.json": {
		"e": 10
	}
}

# stdout
"a.txt"
"b/c.txt"
"d.json"

# stderr
b/c.txt: No such file or directory


# a.txt
string
string

# d.json
{"e":10}
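Until such a builtin exists, the sample above can be approximated from the shell. What follows is only a sketch of one possible workaround, not necessarily one of those referred to earlier: `input.json` and the `out/` directory are made-up names, and the `b/c.txt` entry is left out so that every write succeeds.

```shell
# Hypothetical input: the sample from above without the failing "b/c.txt" entry.
cat > input.json <<'EOF'
{
  "a.txt": "string\nstring",
  "d.json": { "e": 10 }
}
EOF

mkdir -p out

# The first jq call lists the keys; one more jq call per key extracts its value.
# -r writes strings raw, -c writes everything else as compact JSON.
# Caveat: a key containing a newline would break this loop.
jq -r 'keys_unsorted[]' input.json | while IFS= read -r key; do
  jq -r -c --arg k "$key" '.[$k]' input.json > "out/$key"
done
```

This re-reads the input once per key, which is part of what makes the workaround awkward compared with a single output_file call.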


wader · 2024-07-31T06:51:03Z

If you're OK with using fq, I've used a tar hack a few times to output multiple files. Something like this:

Copy the tar code from https://github.com/wader/fq/wiki/snippets into tar.jq, then run:

$ fq -n -L . 'include "tar"; to_tar({filename: "a", data: "aaa"}, {filename: "b", data: "bbb"})' | tar tv
-rw-r--r--  0 user   group       3 Jan  1  1970 a
-rw-r--r--  0 user   group       3 Jan  1  1970 b

Maybe you could rewrite the tar code to work with standard jq, but since jq does not support raw binary output, you might be limited to ASCII-only file contents.

itchyny · 2024-08-01T02:05:03Z

A simple way of doing this is to output a shell script from jq. That's what @sh is for.

jq -r 'to_entries[] | @sh"echo \(.value|tostring) > \(.key)"' | sh
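To make the @sh approach concrete, here is a small sketch that writes the generated script to a file for review before running it. The file names `input.json` and `split.sh` are assumptions chosen for illustration; the jq filter is the one from the comment above.

```shell
# Hypothetical input, reusing part of the sample from the proposal.
cat > input.json <<'EOF'
{
  "a.txt": "string\nstring",
  "d.json": { "e": 10 }
}
EOF

# @sh single-quotes every interpolated value, so keys and contents
# cannot be reinterpreted by the shell.
jq -r 'to_entries[] | @sh "echo \(.value|tostring) > \(.key)"' input.json > split.sh

cat split.sh   # review the generated commands first
sh split.sh    # then run them
```

The generated script should contain one `echo '...' > '...'` command per entry. Note that some `echo` implementations interpret backslash sequences, so values containing backslashes may not be written verbatim.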

myaaaaaaaaa · 2024-08-05T23:43:35Z

> A simple way of doing this is to output a shell script from jq. That's what @sh is for.

In general, I also prefer outputting shell scripts over something like #3133 (although I didn't know about @sh for escaping - thanks for that!)

However, there have been several times where I've had to split the output into multiple shell scripts, and the resulting doubly-escaped script was a headache to review, which is what inspired this proposal.

Presumably, there are many other places where having this as a builtin would make for a nice quality-of-life improvement.

wader · 2024-08-06T15:19:26Z

Agreed that it would be nice to have more I/O features. In my view the biggest issue is how to make it all fit together nicely, e.g. #1843 includes file handle support that would make some of this possible to implement as builtins, I think. Then there is also the question of good names and API: input/1 to read a file as JSON, string, or how to specify? output/1 to write? tee/1 to write and pass through? Things like that.

Maybe a way forward could be to flesh out what these APIs could look like and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.

myaaaaaaaaa · 2024-08-06T18:14:59Z

> Agreed that it would be nice to have more I/O features. In my view the biggest issue is how to make it all fit together nicely, e.g. #1843 includes file handle support that would make some of this possible to implement as builtins, I think.

For IO, I would advocate for having very few individually tailored high-level primitives, rather than many low-level building blocks like in that PR.

Due to the nature of jq being a functional language, interacting with the outside world is a much more advanced feature than usual¹, and can end up being surprisingly asymmetrical (see below).

I'm even willing to be convinced that IO doesn't belong in jq at all (hence this proposal being opened as an issue rather than a PR).

> input/1 to read a file as JSON, string, or how to specify?

I would actually advocate for something like an --input-var option instead, which reads all files into an $input variable containing a filename-to-contents map (essentially a more generalized form of --slurpfile and --rawfile).

Usage would be something like:

jq --input-var '$input | .["a.json"]' *.json
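For comparison, a similar filename-to-contents map can be approximated with existing jq features. This is only a sketch under assumptions: `a.json` and `b.json` are hypothetical files, each holding a single JSON value, and the map is built inside the filter rather than bound to an `$input` variable.

```shell
# Two hypothetical input files.
echo '{"x": 1}' > a.json
echo '{"y": 2}' > b.json

# -n keeps jq from consuming the first input itself, so `inputs` sees every file.
# input_filename names the file that the most recently read input came from.
# If a file held several JSON values, later ones would overwrite earlier ones.
jq -n -c 'reduce inputs as $doc ({}; .[input_filename] = $doc) | .["a.json"]' a.json b.json
```

With the files above, this should print the contents of a.json in compact form.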

> Maybe a way forward could be to flesh out what these APIs could look like and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.

Another way to manage this risk could be to prefix experimental APIs (for example, this could be named _exp_output_file), and print warnings that the functionality is subject to change.

¹ For another example of typically standard functionality being treated as an advanced feature, note that jq officially considers variables an advanced feature.

myaaaaaaaaa mentioned this issue Aug 25, 2024

Implement _experimental_snapshot/2 #3165

Closed

myaaaaaaaaa closed this as not planned · Sep 19, 2024
