Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster version of _nwise #1572

Closed
pkoppstein opened this issue Jan 4, 2018 · 11 comments · Fixed by #2641
Closed

faster version of _nwise #1572

pkoppstein opened this issue Jan 4, 2018 · 11 comments · Fixed by #2641

Comments

@pkoppstein
Copy link
Contributor

Here are plug-in replacements for the existing defs. These replacements use 0-arity filters to effect the recursion, and are ridiculously faster even for arrays of size 10^4. (*)

def _nwise($n):
  def nw: if length <= $n then . else .[0:$n] , (.[$n:] | nw) end;
  nw;

def _nwise(a; $n): a | _nwise($n);

(*) [[range(0;n)] | _nwise(3)] | length

            builtin.jq   replacement
n           u+s          u+s
10,000      23.0s        0.027s
@nicowilliams
Copy link
Contributor

@pkoppstein Very nice! Yeah, we should probably audit src/builtin.jq for non-tail-call recursion and see if we can fix all of those. Another idea might be to have the compiler discover such cases and warn. (Ideally the compiler could even rewrite into tail-call forms, but I'm not sure how many patterns the compiler could practically recognize, and we'd probably want to be able to prove equivalence -- way beyond the complexity level that jq should have).

@pkoppstein
Copy link
Contributor Author

@nicowilliams - btw, since (a) _nwise isn’t documented and (b) it begins with _ and (c) we want to minimize the number of builtins, I’d suggest scrapping the arity-2 version :-)

@itchyny
Copy link
Contributor

itchyny commented Mar 9, 2021

I really hope nwise/1 is exposed and documented. Another definition of this function is as follows (shows no performance difference with the pkoppstein's definition).

def nwise($n): while(.!=[]; .[$n:])[:$n];

@pkoppstein
Copy link
Contributor Author

pkoppstein commented Jul 28, 2023

@itchyny wrote:

I really hope nwise/1 is exposed and documented.

nwise is included in #2777 #2786

@nicowilliams nicowilliams added this to the 1.8 release milestone Jul 28, 2023
@wader
Copy link
Member

wader commented Jul 29, 2023

I really hope nwise/1 is exposed and documented. Another definition of this function is as follows (shows no performance difference with the pkoppstein's definition).

Agree, think it would be a good addition.

def nwise($n): while(.!=[]; .[$n:])[:$n];

Neat implementation! Do we want it to work with strings? currently this one gets stuck with a string as input.

@pkoppstein
Copy link
Contributor Author

@wader - The _nwise in jq 1.6 works on both strings and arrays:

jq-1.6 -n '"abcdef" | _nwise(2)'
"ab"
"cd"
"ef"

One of the purposes of PR #2786 is to make this filter "official".

Another is to add a stream-oriented version that imposes no restrictions on the stream.

@itchyny
Copy link
Contributor

itchyny commented Jul 31, 2023

Do we want it to work with strings? currently this one gets stuck with a string as input.

Fixing this is easy.

def nwise($n): while(length>0; .[$n:])[:$n];

@leonid-s-usov
Copy link
Contributor

shouldn't it be def nwise($n): while(length>=$n; .[$n:])[:$n];?

@wader
Copy link
Member

wader commented Jul 31, 2023

Good question, what to do on non-even length? think i would expect to get a short last output.

Also need a check for $n > 0 i think?

@pkoppstein
Copy link
Contributor Author

pkoppstein commented Jul 31, 2023

Please note that this issue (the efficiency of implementation of _nwise/1) was resolved by #2641

I left the issue open because of @itchyny's remark:

I really hope nwise/1 is exposed and documented.

thus tying this issue to PR #2786, in which

  • _nwise/1 is renamed to nwise/1 (but with an added check that its argument > 0)

  • a stream-oriented version of nwise is added

  • supporting documentation and test cases have been added

(Since @nicowilliams suggested that _nwise/1 should be dropped
rather than retained for backward compatibility, the
above-mentioned PR also drops _nwise/2, which added
no functionality anyway.)

As for the discussion about an implementation using while,
please note that @itchyny has already pointed out that there is no efficiency gain to be obtained
by doing so. Also, the confusion evident in this thread about such
an implementation suggests that playing code golf here is probably ill-advised from a
certain perspective. The implementation strategy using if can be said to have stood the test of time quite well.

@itchyny itchyny modified the milestones: 1.8 release, 1.7 release Jul 31, 2023
@itchyny
Copy link
Contributor

itchyny commented Jul 31, 2023

Since the original intention of this issue is already resolved by #2641 (I've forgot this until now; please do not mixing things in a PR), I'm closing this issue.

@itchyny itchyny closed this as completed Jul 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants