Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

def last: #2850

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

def last: #2850

wants to merge 3 commits into from

Conversation

pkoppstein
Copy link
Contributor

def last: .[length-1]

This is sufficient to ensure that pick(last) works in the obvious way, as it did before some recent updates.

`def last: .[length-1]`

This is sufficient to ensure that `pick(last)` works in the obvious way.
def last: .[length-1]

This is sufficient to ensure that pick(last) works in the obvious way, as it did before some recent updates.
@itchyny
Copy link
Contributor

itchyny commented Aug 18, 2023

I'd like you to share your future plan to make pick/1 work well with any paths. Thanks.

@nicowilliams
Copy link
Contributor

Right, so we can make pick(last) work, but what about pick(.[-2]) and so on?

@pkoppstein
Copy link
Contributor Author

I have no objections to not supporting negative indices. There is currently no named def for .[length-2] but if there were, it would work, so I don’t see any issue there. If someone wants the penultimate item, they can write pick( .[length-2] ), etc.

@nicowilliams
Copy link
Contributor

nicowilliams commented Aug 18, 2023

Here's an ugly, sub-optimal keep/1:

def keep(ps):
    ([norm(path(ps))]|sort) as $ps
  | ([norm(path(..))|$ps[] as $p|select(not(prefix($p; .)))]|sort) as $aps
  | delpaths(($aps - $ps));

I'll share norm/1 (which normalizes paths, turning negative indices into positive ones) and prefix/1 (which outputs true then the first path is a prefix of the second) when I get around to making them work better. (not/1 is the obvious def not(exp): exp|not;.)

@itchyny
Copy link
Contributor

itchyny commented Aug 18, 2023

I'm sharing some observations; considering p is an arbitrary single path (I simplified rules but we can generalize using reduce or wrapping by [...]),

  • p == getpath(path(p)) == (path(p) as $p | getpath($p)) should be guaranteed
  • . == (path(p) as $p | setpath($p; getpath($p))) is also guaranteed
    • this rule leads to p == (path(p) as $p | setpath($p; getpath($p)) | p)
    • and also ($x | p) == ($x | path(p) as $p | setpath($p; $x | getpath($p)) | p)

and constant paths are pure i.e.

  • ($x | path(p)) == ($y | path(p)) if p is a constant path and type error does not occur
    • this leads to ($x | path(p)) == (null | path(p)) is p is constant and type error does not occur
    • and also($x | path(p)) as $p | ($y | p) == ($y | getpath($p)) is p is constant and type error does not occur
    • wrapping the path of negative indexing/slicing will break this property, but is it okay?
    • what [1] | path(p) as $p | [4,5] | getpath($p) should yield when def p: last;? what if def p: .[-1:]?

but when setting a path against different values,

  • ($x | p) == ($x | path(p) as $p | null | setpath($p; $x | getpath($p))) | p) is not guaranteed
    • but pick/1 wants this rule to be followed, where we're facing multiple issues;
      • to make it work with p == .[-1], we need either to break the purity of constant paths or to keep the array size information somehow
      • to make it work with p == .[1:3], we need to introduce breaking changes by padding nulls on updating smaller arrays (or null) by slicing paths
      • to make it work with p == .[-3:-1], we need to solve the both above
        • we could include the size information to slicing path object but not able to breaking indexing paths

@nicowilliams
Copy link
Contributor

@itchyny yes, the issue is null | setpath($p; $v) -- setpath/2 has no information with which to normalize negative indices. We could have a setpath/3 that takes a model value whose array lengths could be used to normalize those negative indices. Or we could normalize those indices in pick/1's body before passing them to setpath/2. For keep/1 I'm taking the latter approach (and then I'm deleting paths instead of setting them, but this last difference is just to have slightly different semantics from pick/1, thus the different name).

@nicowilliams
Copy link
Contributor

I have no objections to not supporting negative indices. There is currently no named def for .[length-2] but if there were, it would work, so I don’t see any issue there. If someone wants the penultimate item, they can write pick( .[length-2] ), etc.

That's certainly an option. But jq is a fairly elegant language, so it'd be nice to support negative indices here.

@nicowilliams
Copy link
Contributor

./jq -cn '{a:[{b:0,c:1},{d:2,e:3},{d:50,f:{g:4},h:5}]}|pick(.a[1:3][].d)'
{"a":[{"d":2},null,{"d":50}]}

Seems... wrong. I expected {"a":[null,{"d":2"},{"d":50}]}.

@nicowilliams
Copy link
Contributor

nicowilliams commented Aug 18, 2023

OK, finally path normalization code I can share. The idea is to take an array or object and a path (as output by path/1) in that array/object, and output a normalized version of that path with a) all negative indices normalized to positive indices, b) all array slices expanded to array indices (so [{start:1, end:3},0] becomes [1]). Then we can use those paths without problem to either setpath($p; $value) like pick/1 does, or to delete all but these paths like keep/1 might want to do.

(This is still not very pretty. We can optimize for elegance later.)

# Normalizes negative indices in `$p` relative to `.`
def norm($p):
    def type($x): $x|type;
    def length($x): $x|length;

    # `$p` here is of the form `[<number>, ...]`.
    # `.`  here is an array that `$p` is a path in
    #
    # Outputs `$p` with `$p[0]` normalized.
    def norm_idx($p):
        if $p[0] < 0 then [$p[0] + length] + $p[1:] else $p end;

    # `$p` here is of the form `[{start: ..., end: ...}, ...]`.
    # `.`  here is an array that `$p` is a path in
    #
    # Outputs `$p` with `$p[0] normalized
    def norm_slice($p):
        length as $len
      |(if $p[0].start < 0
        then
            [$p[0].start + $len, $p[0].end]
        else $p
        end) as $p
      |(if $p[0].end < 0
        then
            [$p[0].start, $p[0].end + $len]
        else $p
        end);

    # Main body of `norm/1`
   (if length($p) == 0
    # $p == [], nothing to do
    then $p

    elif type($p[0]) == "object"
    # Normalize slice indices
    then
        norm_slice($p) as $p
      | if length($p) == 1
        then [range($p.start; $p.end)]
        elif type($p[1]) != "number"
        then "Array slice index must be numeric" | error
        elif $p[1] < 0
        then [$p[0] + $p[1] + ($p.end - $p.start)] + $p[2:]
        else
            [$p[0].start + $p[1]] + $p[2:]
        end
    elif type($p[0]) == "number"
    # Normalize negative array indices
    then
        norm_idx($p)
    # No normalization needed for object keys
    else $p
    end) as $p
  | if length($p) <= 1
    then $p
    else
        [$p[0]] + (getpath($p[0:1]) | norm($p[1:]))
    end
    ;
def keep(ps):
    def prefix($a; $b):
        def min($a; $b):
            if $a < $b
            then $a
            else $b
            end;
        min($a|length; $b|length) as $l
  | $a[0:$l] == $b[0:$l];
    def not($x): $x|not;

    ([norm(path(ps))]|sort) as $ps
  | ([norm(path(..))|$ps[] as $p|select(not(prefix($p; .)))]|sort) as $aps
  | delpaths(($aps - $ps));

@nicowilliams
Copy link
Contributor

@pkoppstein please check if you can make pick/1 use a path normalization function the one above, and see if that makes it work correctly. Maybe you can make an elegant path normalization function?

@pkoppstein
Copy link
Contributor Author

pkoppstein commented Aug 18, 2023

@nicowilliams wrote:

please check if you can make pick/1 use a path normalization function ...

I will do so, but for 1.7, I am inclined to think it would be sufficient to tweak the def of last to make pick(last) work, and otherwise leave things pretty much as they are. The documentation already states that pick is not intended for use with negative or m:n indices, which I think is good enough for 1.7 (though perhaps the proviso could be slightly weakened).

There are two reasons for my reservations about aiming higher for 1.7:
(1) it would almost certainly contribute to unwarranted delay (if only because of distraction);
(2) I have already noticed some slowdowns of 1.7rc1 relative to earlier (post 1.6) versions, and think it would be better in the near term either to address those or to avoid any changes that might make things worse w.r.t. performance.

jq.1.prebuilt Outdated
@@ -1,3 +1,4 @@
pipenv run python validate_manual_schema.py content/manual/manual.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally added line?

rm auto-inserted spurious pipenv line
@pkoppstein
Copy link
Contributor Author

@wader wrote:

Accidentally added line?

It was added by make jq.1.prebuilt (!!!)

I removed it manually. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants