Current file function #753

Closed
agordon wants to merge 2 commits

agordon
Contributor

@agordon agordon commented Apr 17, 2015

Based on the previous patch (#752), this adds the 'filename' and 'line'
built-in functions to jq (discussed in #743).

Example:

    $ printf '{"a":1}\n{"a":2}\n' > 4.json
    $ printf '{"a":"hello"}\n' > 5.json
    $ ./jq '{ "file":filename, "line":line, "value":.a }' 4.json 5.json
    {
      "file": "4.json",
      "line": 1,
      "value": 1
    }
    {
      "file": "4.json",
      "line": 2,
      "value": 2
    }
    {
      "file": "5.json",
      "line": 1,
      "value": "hello"
    }

On runtime errors (ending in invalid state at 'main.c:process()'),
print the input filename and current line number which triggered the
error.

* util.c:
  struct jq_util_input_state: add variables
  jq_util_input_init(): initialize new variables
  jq_util_input_read_more(): update current file/line number on
                             `fgets()` calls.
  jq_util_input_get_position(): helper function returning a JV_STRING
                                containing the current file/line.
  strncpyz(): helper function for safe string copy

* main.c:
  process(): upon invalid result ('uncaught jq exception'), get the
             current input position and print it to stderr.

* jq.h: declare 'jq_util_input_get_position()'.
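
The bookkeeping listed above can be illustrated with a minimal sketch. This is not the code from the patch: the struct `input_pos` and the functions `copy_name()`, `pos_open()`, `pos_after_chunk()` and `pos_read_more()` are hypothetical names chosen for the illustration. It assumes the line counter is advanced only when a chunk returned by `fgets()` ends in a newline, so that a long line split across several chunks is counted once.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields added to the input state. */
struct input_pos {
    char filename[256];   /* name of the input currently being read */
    size_t line;          /* number of complete lines read so far */
};

/* Copy src into dst and always NUL-terminate (in the spirit of strncpyz()). */
static void copy_name(char *dst, const char *src, size_t dstsize) {
    if (dstsize == 0)
        return;
    strncpy(dst, src, dstsize - 1);
    dst[dstsize - 1] = '\0';
}

/* Reset the position when a new input is opened. */
static void pos_open(struct input_pos *p, const char *name) {
    copy_name(p->filename, name, sizeof p->filename);
    p->line = 0;
}

/* Account for one chunk as returned by fgets(): the counter moves only
   when the chunk ends in '\n', i.e. when a full line has been consumed. */
static void pos_after_chunk(struct input_pos *p, const char *chunk) {
    size_t n = strlen(chunk);
    if (n > 0 && chunk[n - 1] == '\n')
        p->line++;
}

/* Read one chunk and update the position; returns what fgets() returned. */
static char *pos_read_more(struct input_pos *p, FILE *f, char *buf, int size) {
    char *r = fgets(buf, size, f);
    if (r != NULL)
        pos_after_chunk(p, buf);
    return r;
}
```

With this scheme a final line that lacks a trailing newline does not advance the counter; whether that matches the patch is not checked here.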

With this patch, runtime errors printed to stderr will contain the
filename and line of the offending input.
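
As a rough sketch of the message layout seen in the examples, the position can be spliced into the error text with `snprintf()`. The function name `format_error()` is hypothetical, and the patch itself builds the position as a JV_STRING rather than taking separate file and line arguments.

```c
#include <stdio.h>

/* Write "jq: error (at <file>:<line>): <msg>" into out.
   Returns snprintf()'s result: the length the full message needs. */
static int format_error(char *out, size_t outsize,
                        const char *file, unsigned long line,
                        const char *msg) {
    return snprintf(out, outsize, "jq: error (at %s:%lu): %s",
                    file, line, msg);
}
```

A caller would pass the name and line of the input being processed when the error was raised, for example `format_error(buf, sizeof buf, "stdin", 2, "object and number cannot be added")`.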

Examples:

With stdin and multiple lines:

    $ printf '{"a":43}\n{"a":{"b":66}}\n' | ./jq '.a+1'
    44
    jq: error (at stdin:2): object and number cannot be added

With multiple files:

    $ printf '{"a":43}' > 1.json
    $ printf '{"a":"hello"}\n' > 2.json
    $ printf '{"a":{"b":66}}\n' > 3.json
    $ ./jq '[.a]|@tsv' 1.json 2.json 3.json
    "43"
    "hello"
    jq: error (at 3.json:1): object is not valid in a csv row

With very long lines (spanning multiple `fgets` calls):

    $ (  printf '{"a":43}\n' ;
         printf '{"a":{"b":[' ; seq 10000 | paste -d, -s | tr -d '\n' ;
         printf ']}}\n' ;
         printf '{"a":"hello"}\n' ) | ./jq '[.a] | @tsv'
    "43"
    jq: error (at stdin:2): object is not valid in a csv row
    "hello"

With raw input:

    $ seq 1000 | ./jq --raw-input 'select(.=="700") | . + 10'
    jq: error (at stdin:700): string and number cannot be added

Caveat:
The reported line will be the last line of the (valid) parsed JSON data.
Example:

    $ printf '{\n"a":\n"hello"\n\n\n}\n' | ./jq '.a+4'
    jq: error (at stdin:6): string and number cannot be added
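
One way to see why the caveat arises, assuming the counter is driven purely by the newlines read so far: by the time the closing brace has been consumed, every newline of the value has already been counted. The helper `lines_consumed()` below is a hypothetical illustration, not part of the patch.

```c
#include <stddef.h>

/* Count the newlines in the first len bytes of text: the line number a
   newline-driven counter shows once those bytes have been read. */
static size_t lines_consumed(const char *text, size_t len) {
    size_t lines = 0;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\n')
            lines++;
    }
    return lines;
}
```

For the six-line input in the caveat above, this returns 6 once the whole value has been read, which is the line shown in the error message.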

Minor ugly hack:
The call to get the current filename/line in 'main.c' is hard-coded
to 'jq_util_input_get_position()', which somewhat bypasses the idea
of using an input callback (e.g. 'jq_set_input_cb()').
But since similar calls to 'jq_util_input_XXXX' are also hard-coded in
'main.c', the input callback mechanism isn't really generic at the moment.
@nicowilliams
Contributor

It would be nice too if we could get run-time errors to include jq source location. That would be tricky, as I've noted elsewhere (I think), because we don't have that information left in the bytecode. We'd need something like DWARF for jq :(
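
To make the "DWARF for jq" idea concrete, here is a purely hypothetical sketch of the kind of side table such debug information could use: a sorted list mapping bytecode offsets to source lines, searched by binary search. Nothing like `struct line_entry` or `line_for_pc()` exists in jq; the names and layout are assumptions for illustration only.

```c
#include <stddef.h>

/* Hypothetical debug-info entry: instructions from offset pc up to the
   next entry were compiled from source line `line`. */
struct line_entry {
    size_t pc;
    int line;
};

/* Look up the source line for pc in a table sorted by ascending pc.
   Returns -1 if the table is empty or pc precedes the first entry. */
static int line_for_pc(const struct line_entry *tab, size_t n, size_t pc) {
    size_t lo = 0, hi = n;          /* search window is [lo, hi) */
    int found = -1;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].pc <= pc) {
            found = tab[mid].line;  /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return found;
}
```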

@nicowilliams
Contributor

I'll review.

@pkoppstein
Contributor

@agordon - This tweak and PR752 are both very welcome!

With respect to the choice of names for what you have called "filename" and "line", I was thinking that to minimize the chance of conflicts with existing code, it might be better to choose names such as __filename__ and __line__ (or, in C/Ruby style, __FILE__ and __LINE__). What do you think?

Also, while you're on a roll, or whenever you have the time and energy, you might like to have a crack at these ERs:

PRNG #677
curl #650

Thanks!

@agordon
Contributor Author

agordon commented Apr 17, 2015

@pkoppstein - to be honest - I don't have much free time.
The patches I sent are directly helping my work when I process huge amounts of JSON data and need to troubleshoot weird errors.

Looking at #677 - it might be useful in some circumstances, but implementing a reliable, truly random and yet portable PRNG is not trivial at all (adding a crappy one is easy, though :) ).

and for #650 - I don't understand why anyone would want that - JQ is supposed to be a stream processor (i.e. like a unix filter program) - adding the complexity of web access with its numerous problems and flow control sounds counter-productive. For example: how do you handle retries? or redirections (IIRC curl doesn't handle redirections automatically). So many related options would need to be implemented - doesn't sound like it's worth it.
Similarly to the request in #650, I also download large amounts of JSON data, but I wouldn't do it in JQ itself - I prefer a different programming language to handle all the different cases.
JQ fits so nicely in the unix paradigm of a 'text filter' - I wouldn't want to make it into something else.

@pkoppstein
Contributor

@agordon wrote:

I don't have much free time. ...

Understood. We appreciate your taking the time to enhance jq, even if it is for selfish reasons! :-)

Regarding #677 - yes, portability (across the jq target platforms) is required, but I suspect that those who have expressed an interest in having a PRNG in jq would not require anything like "true-randomness". Perhaps most would prefer even a crappy PRNG to none, provided it's properly documented. Once a working PRNG is in place, non-jq experts could more easily offer enhancements.
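
For a sense of how small a simple, documented, non-cryptographic PRNG can be, here is a sketch of a 32-bit linear congruential generator using the well-known multiplier and increment from Numerical Recipes. It is only an illustration of the trade-off being discussed, not a proposal for what #677 should implement; `lcg_next()` is a hypothetical name.

```c
#include <stdint.h>

/* Advance a 32-bit linear congruential generator and return the new state.
   Arithmetic wraps modulo 2^32. Portable and deterministic, but
   statistically weak and unsuitable for anything security-related. */
static uint32_t lcg_next(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}
```

The same seed always yields the same sequence on every platform, which is the portability property mentioned above.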

Regarding #650 - yes, jq and curl often work nicely together, but in certain contexts, the file or URL one needs is sometimes (or often) determined dynamically. Please also note that #650 envisions a very simple approach. Hopefully, both file: and remote URLs will be supported.

@nicowilliams
Contributor

Thanks!

@nicowilliams
Contributor

There seems to be a heisenbug that causes the tests to fail (jq --run-tests exits with 1 but that makes no sense because it says all tests pass and none are malformed, unless... could fclose(stdout) be failing?).

@richardbarrell-calvium

For anyone else who landed here from a Google search: the filename/0 function was later renamed to input_filename/0. Likewise, line/0 was renamed to input_line_number/0.
