Build regexes only once #133
speedup non-first Tokenizer bootstrap
993d0e0 to 5d88fec
Is there a reason why I would want to destroy and recreate the Tokenizer all the time? |
See https://github.com/atk4/data/blob/6fbe7e23a4/src/Persistence/Sql/Expression.php#L437 - the regex build is quite slow, so if someone uses it like this, this PR will help. |
I'd rather fix the code you've linked, tbh. |
Sure, we already did, and let's fix it here too. |
No. |
@greg0ire can you please reopen this PR - I did a test with:
and the speed difference is 500x - currently, a single Tokenizer bootstrap takes ~1.5 ms! Creating the regexes from huge string lists is quite a slow process, and even when the Tokenizer is reused a few times, this PR improves real-world performance by a big amount! Imagine use cases like:
|
None of these use cases seem realistic. |
Please, the improvement is really huge and I hope this provides strong evidence, so let me rephrase the question: what is against this PR? The class is final, so the regexes can easily be held in a static property. |
We should indeed fix that as well. Sorry, I certainly don't want a static cache here. And honestly, if you put the tokenizer on a hot path and create and destroy 100k instances of that class instead of reusing a single instance, you had it coming. |
Even then, I doubt anybody will notice the resulting performance improvement. Maybe generating the initial migration of a project will get slightly faster, but after that… meh. |
@greg0ire the speed difference is 500x, and although the migrations speedup might be only ~100 ms, as long as someone uses it for the web, even if only for debugging, the unwanted slowdown should be avoided - please, is there really anything against this PR? |
What do you mean by that? Do people generate migrations through a web interface nowadays? |
In atk4, we use it to format SQL debug/log queries; with this feature (and not reusing the SqlFormatter instance), the slowdown can easily be the whole page render. So given how easy but powerful this PR is, I do not see a major problem with making it fast when not used perfectly. |
I do see a major problem with static caches. This is a no, please stop wasting our time. |
OK - because of memory kept allocated? |
|
Huge speedup when the Tokenizer is created many times (not cached/reused in user app).
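
For readers unfamiliar with the technique under debate, here is a minimal sketch of building a regex once and sharing it across instances through a static property. It is not the code from this PR: `KeywordMatcher`, its word list, and the `$builds` counter are hypothetical stand-ins for the real Tokenizer, used only to show the shape of the idea.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the Tokenizer; all names and the word list are illustrative.
final class KeywordMatcher
{
    /** Pattern shared by every instance; null until the first instance builds it. */
    private static ?string $regex = null;

    /** Number of times the pattern was actually built (for demonstration only). */
    public static int $builds = 0;

    private const WORDS = ['SELECT', 'FROM', 'WHERE', 'GROUP BY', 'ORDER BY'];

    public function __construct()
    {
        // Only the first instance pays the cost of building the pattern.
        if (self::$regex === null) {
            self::$regex = self::buildRegex(self::WORDS);
        }
    }

    /** @param list<string> $words */
    private static function buildRegex(array $words): string
    {
        self::$builds++;

        // Longest entries first, so longer keywords are tried before shorter ones.
        usort($words, static fn (string $a, string $b): int => strlen($b) <=> strlen($a));
        $quoted = array_map(static fn (string $w): string => preg_quote($w, '/'), $words);

        return '/^(' . implode('|', $quoted) . ')\b/i';
    }

    /** Returns the keyword at the start of $sql as written in the input, or null. */
    public function matchKeyword(string $sql): ?string
    {
        return preg_match(self::$regex, $sql, $m) === 1 ? $m[1] : null;
    }
}

$first = new KeywordMatcher();
$second = new KeywordMatcher(); // reuses the pattern built by $first
```

The trade-off raised in the thread still applies to this sketch: the static property lives for the whole process, whereas creating one instance and reusing it avoids the repeated build without any shared static state.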