Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repeat for Char to fix performance of pow #22797

Merged
merged 3 commits into from
Jul 16, 2017
Merged

Conversation

musm
Copy link
Contributor

@musm musm commented Jul 13, 2017

close #22785

But more importantly fix the bug for ::Char ^ 0 and ::Char^1

As a future optimization someone may want to improve the nonascii case to remove the allocation for the string conversion.

# benchmark using 0.6

julia> using BenchmarkTools; import Base.repeat, Base._string_n

julia> function repeat(c::Char, r::Integer)
           r < 0 && throw(ArgumentError("can't repeat a character $r times"))
           if isascii(c)
               out = _string_n(r)
               ccall(:memset, Ptr{Void}, (Ptr{UInt8}, Cint, Csize_t), out, c, r)
               return out
           else
               return repeat(string(c), r)
           end
       end
repeat (generic function with 4 methods)

julia> @btime repeat($('x'), $(100));
  28.140 ns (1 allocation: 128 bytes)

julia> @btime repeat($("x"), $(100));
  280.596 ns (1 allocation: 128 bytes)

julia> @btime repeat($('x'), $(10));
  18.158 ns (1 allocation: 32 bytes)

julia> @btime repeat($("x"), $(10));
  42.239 ns (1 allocation: 32 bytes)

julia> @btime repeat($('x'), $(2));
  18.158 ns (1 allocation: 32 bytes)

julia> @btime repeat($("x"), $(2));
  21.711 ns (1 allocation: 32 bytes)

@ararslan ararslan requested a review from stevengj July 13, 2017 19:17
"""
repeat(s::String, r::Integer)

Repeat a string `r` times.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

little odd that repeat is currently in the linalg doc index, maybe should be moved to more general collections section

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have it in both the linalg and string indices, with AbstractArray and AbstractString signatures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I'm not sure why anyone would ever call repeat for a string rather than using ^.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated PR please check if this now looks good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the multiple docstrings would need to be distinguished by signature otherwise they will all be repeated in both sections

Copy link
Contributor Author

@musm musm Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkelman thanks will update. This means I should include:

Base.repeat(::AbstractString, ::Integer)
Base.repeat(::Char, ::Integer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well those two docstrings would probably be fine to list together since they're closely related - it's the array one that should be in a different section

@@ -438,3 +449,25 @@ function repeat(s::String, r::Integer)
end
return out
end

"""
repeat(c::Char, r::Integer) -> String
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we use Julia syntax repeat(c::Char, r::Integer)::String for this sort of thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that change should however be made or brought up in another PR/issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find an open issue on this, so I filed one here #22804

```
"""
function repeat(c::Char, r::Integer)
r < 0 && throw(ArgumentError("can't repeat a character $r times"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the r < 0 check inside the isascii check, since repeat(string(c), r) also does this check; otherwise you are doing the same check twice for non-ascii chars.

@test repeat("xx",3) == repeat("x",6) == "xxxxxx"
@test repeat("αα",3) == repeat("α",6) == "αααααα"
@test repeat("xx",3) == repeat("x",6) == repeat('x',6) == "xxxxxx"
@test repeat("αα",3) == repeat("α",6) == repeat('α',6) == "αααααα"
Copy link
Member

@stevengj stevengj Jul 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add a regression test for ^ to make sure 'x'^0 == "" and 'x'^1 == "x".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks done.


Repeat `n` times the string `s`.
Repeat `n` times the string or character `s`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost reads like "repeat n * string". Perhaps:
Repeat the string or character `s` `n` times.

@musm
Copy link
Contributor Author

musm commented Jul 13, 2017

Interestingly the performance characteristics of repeat on master have changed substantially than from 0.6.

@test repeat("x",1) == repeat('x',1) == "x"
@test repeat("x",0) == repeat('x',0) == ""
@test "x"^1 == 'x'^1 == "x"
@test "x"^0 == 'x'^0 == ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably merge these two tests with the previous lines, e.g. @test "x"^1 == 'x'^1 == repeat("x",1) == repeat('x',1) == "x"

@musm
Copy link
Contributor Author

musm commented Jul 14, 2017

Updated to also test AbstractString subtypes

"""
repeat(s::AbstractString, r::Integer)

Repeat a string `r` times.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably the docstring should also cross-reference ^; similarly for Char.

@musm
Copy link
Contributor Author

musm commented Jul 14, 2017

@tkelman and @stevengj I have addressed your reviews

(please, don't forget to squash merge)

@@ -155,7 +155,7 @@ reverseind(s::SubString{String}, i::Integer) =
"""
repeat(s::AbstractString, r::Integer)

Repeat a string `r` times.
Repeat a string `r` times. This can equivalently be accomplished by calling [`^`](@ref).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be clearer to write:

by calling [`s^r`](@ref ^).

Ideally, we should cross-reference ^(::AbstractString, ::Integer) ... @MichaelHatherly, does @ref ^(::AbstractString, ::Integer) work in Documenter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

@stevengj
Copy link
Member

stevengj commented Jul 15, 2017

(Travis failure is #17626. Not sure what the ProcessExitedException on AppVeyor is about.)

@musm
Copy link
Contributor Author

musm commented Jul 15, 2017

I think there is a bug in the implementation here
e.g. try this on $ , which spits

"\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$\$"

@TotalVerb
Copy link
Contributor

@musm That's because $ must be escaped within double quotes.

@musm
Copy link
Contributor Author

musm commented Jul 15, 2017

@TotalVerb oh right, thanks!

I added a new commit which removed the GenericString test by accident, I dropped the commit.

@KristofferC
Copy link
Member

Good to go then?

@musm
Copy link
Contributor Author

musm commented Jul 16, 2017

yes (please squash)

@KristofferC KristofferC merged commit c2a848a into JuliaLang:master Jul 16, 2017
@musm musm deleted the ch branch July 16, 2017 05:07
jeffwong pushed a commit to jeffwong/julia that referenced this pull request Jul 24, 2017
* Add repeat for Char to fix performance of pow

* Address reviews

* Update docstring
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add specialized repeat function for Char
6 participants