Support an asterisk in bothify and optimize #701
     *
     * @param string $string String that needs to be parsed
     * @return string
     */
    public static function bothify($string = '## ??')
    {
        $string = self::replaceWildcard($string, '*', function () {
            return mt_rand(0, 1) ? '#' : '?';
Why don't you directly pass randomDigit() or randomAscii() here? That would save one function call for each placeholder.
How does it save a function call? randomLetter() will be called once anyway, and randomDigit() is not called at all: there's already an optimization that gets many digits from a big number.
Hi, thanks for the patch! Being UTF-8-safe also explains why the current version may be slower. Although faster, your method drops an important ability (UTF-8), which is a showstopper. I think
I've been thinking about it. We can still use I'll benchmark that version and the one with custom UTF-8 string iteration.
Btw, there should be a unit test to see it fail for special UTF-8 strings. Maybe I'll try it with minimaxir/big-list-of-naughty-strings.
There's an answer on SO that says it should be fine to search for an ASCII char in a UTF-8 string.
Then you'll need a test to prove it :)
Done, and it seems to work. Done with https://gist.github.com/nineinchnick/917e644df42ccd62db5c
I mean a unit test in Faker.
Done.
I meant testing Faker, not PHP. You have to make sure
I'm not sure what such a test would look like. Should I run it on that big UTF-8 string, replace the single char back to the placeholder and compare?
No, just use a sample UTF-8 string with a couple of special chars AND placeholders, and check that the result matches the expected with a regular expression (using
Done. Note that it doesn't matter if you use
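A minimal sketch of the kind of test discussed above, assuming a byte-wise replacement of an ASCII placeholder. `replaceHashWithDigit` is a hypothetical stand-in, not Faker's code, and the `u` pattern modifier is one possible way to check the result as UTF-8. It relies on the fact that every byte of a multi-byte UTF-8 sequence has its high bit set, so it can never equal an ASCII byte such as `#`.

```php
<?php
// Hypothetical helper standing in for the method under test; not Faker's code.
function replaceHashWithDigit(string $string): string
{
    $length = strlen($string); // byte length, not character count
    for ($i = 0; $i < $length; $i++) {
        // '#' is a single ASCII byte, so this never hits part of a multi-byte character.
        if ($string[$i] === '#') {
            $string[$i] = (string) mt_rand(0, 9);
        }
    }
    return $string;
}

// Sample with multi-byte characters AND placeholders.
$sample = 'Żółć ## ñandú #€';
$result = replaceHashWithDigit($sample);

// With the u modifier, preg_match() treats pattern and subject as UTF-8
// and returns false if the subject is not valid UTF-8.
$matches = preg_match('/^Żółć [0-9]{2} ñandú [0-9]€$/u', $result);
```

If the replacement had corrupted a multi-byte character, `preg_match` would return `false` or `0` instead of `1`.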
Great, thanks!
Base::bothify() now replaces an asterisk ('*') with either a random number or a random letter.

I also replaced preg_replace_callback in numerify and lexify, as it is slower than iterating over a range of chars found by using strpos and strrpos.

Benchmarks are available here: https://gist.github.com/nineinchnick/84895a560910e3ead9cc
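The approach described above could look roughly like the following. This is a sketch under stated assumptions, not the code from this pull request: `replaceWildcardSketch` is a hypothetical name, the wildcard is assumed to be a single ASCII byte, and the callback is assumed to return exactly one byte.

```php
<?php
// Hypothetical sketch; not Faker's actual replaceWildcard implementation.
// Assumes $wildcard is one ASCII byte and $callback returns a one-byte string.
function replaceWildcardSketch(string $string, string $wildcard, callable $callback): string
{
    $first = strpos($string, $wildcard);
    if ($first === false) {
        return $string; // no wildcard present, nothing to do
    }
    $last = strrpos($string, $wildcard);

    // Only scan the byte range between the first and last occurrence.
    for ($i = $first; $i <= $last; $i++) {
        if ($string[$i] === $wildcard) {
            $string[$i] = $callback();
        }
    }
    return $string;
}

// Usage: replace every '#' with a random digit.
$result = replaceWildcardSketch('id-###', '#', function () {
    return (string) mt_rand(0, 9);
});
```

The loop works on bytes and invokes the callback only for positions that hold the wildcard; `strpos` and `strrpos` merely narrow the range that has to be scanned.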
The biggest question is whether we should worry about UTF-8. I guess in the worst case it could replace the wrong byte. There are a few SO questions about how to iterate over UTF-8 strings efficiently.
Should the new helper method Base::replaceWildcard be public?