SQL query formatting improvements #1752
Because the token values escaped by BoldKeywordFilter are simply intermediate values and are not directly included in HTML templates, use Python's html.escape() instead of django.utils.html.escape() to eliminate the overhead of converting the token values to SafeString. Also pass quote=False when calling escape() since the token values will not be used in quoted attributes.
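A minimal sketch of the difference that the `quote` argument makes, using only the standard library. The helper name `escape_token_value` is hypothetical and is not taken from the toolbar's code.

```python
import html


def escape_token_value(value: str) -> str:
    """Escape &, < and > only; quotes are left alone (hypothetical helper).

    html.escape() returns a plain str, so no SafeString is created.
    """
    return html.escape(value, quote=False)


sample = "a < 'b' & \"c\""

# Default behaviour also escapes both kinds of quote characters.
print(html.escape(sample))         # a &lt; &#x27;b&#x27; &amp; &quot;c&quot;
# With quote=False only the markup-significant characters are escaped.
print(escape_token_value(sample))  # a &lt; 'b' &amp; "c"
```

Leaving quotes unescaped is only appropriate because the values end up in element content rather than inside a quoted attribute.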
sqlparse's SerializerUnicode filter does a bunch of fancy whitespace processing which isn't needed because the resulting string will just be inserted into HTML. Replace with a simple EscapedStringSerializer that does nothing but convert the Statement to a properly-escaped string. In the process stop the escaping within BoldKeywordFilter to have a cleaner separation of concerns: BoldKeywordFilter now only handles marking up keywords as bold, while escaping is explicitly handled by the EscapedStringSerializer.
Instead of using a regex to elide the select list in the simplified representation of an SQL query, use an sqlparse filter to elide the select list as a preprocessing step. The result ends up being about 10% faster.
Instead of only eliding select lists longer than 12 characters, now only elide select lists that contain a dot (from a column expression like `table_name`.`column_name`). The motivation for this is that as of Django 1.10, using .count() on a queryset generates SELECT COUNT(*) AS `__count` FROM ... instead of SELECT COUNT(*) FROM ... queries. This change prevents the new form from being elided.
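The two rules can be compared with a small sketch. It works on the select list as a plain string, whereas the PR operates on sqlparse tokens, and the function names are hypothetical.

```python
def should_elide_old(select_list: str) -> bool:
    """Old rule: elide any select list longer than 12 characters."""
    return len(select_list) > 12


def should_elide_new(select_list: str) -> bool:
    """New rule: elide only select lists that contain a dot."""
    return "." in select_list


examples = [
    "COUNT(*)",                                  # pre-1.10 .count() query
    "COUNT(*) AS `__count`",                     # Django 1.10+ .count() query
    "`auth_user`.`id`, `auth_user`.`username`",  # ordinary column list
]

for select_list in examples:
    print(select_list, should_elide_old(select_list), should_elide_new(select_list))
```

Under the old rule the second example is elided because it is 21 characters long; under the new rule it is kept because it contains no dot.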
If a query has subselects in its WHERE clause, do not elide the select lists in those subselects.
The "<strong>" tokens inserted by the BoldKeywordFilter were causing the AlignedIndentFilter to apply excessive indentation to queries which used CASE statements. Fix by rewriting BoldKeywordFilter as a statement filter rather than a preprocess filter, and applying it after AlignedIndentFilter.
When formatting SQL statements using sqlparse, grouping only affects the output when AlignedIndentFilter is applied.
By using a setting_changed signal receiver to clear the query cache, the parse_sql() and _parse_sql() functions can be merged and the check for the "PRETTIFY_SQL" setting can be moved back inside the get_filter_stack() function.
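The pattern can be sketched without Django as follows. Everything here is an assumption made for illustration: the `SETTINGS` dict stands in for the toolbar's configuration, `functools.lru_cache` stands in for whatever caching the real code uses, and the formatting step is a placeholder. In a Django project the receiver would be connected to a signal such as `django.test.signals.setting_changed` instead of being called by hand.

```python
from functools import lru_cache

# Hypothetical stand-in for the toolbar configuration.
SETTINGS = {"PRETTIFY_SQL": True}


@lru_cache(maxsize=128)
def parse_sql(sql: str) -> str:
    """Single cached entry point; the setting is read inside the cached call."""
    if SETTINGS["PRETTIFY_SQL"]:
        # Placeholder for the real filter stack.
        return sql.replace(" FROM ", "\nFROM ")
    return sql


def on_setting_changed(*, setting, **kwargs):
    """Receiver: drop cached results when the relevant setting changes."""
    if setting == "PRETTIFY_SQL":
        parse_sql.cache_clear()
```

Because the receiver clears the cache, the setting does not need to be part of the cache key, which is what would otherwise force a split into a public wrapper and a cached private function.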
Thanks for refactoring the way settings are modified! That's an excellent change.
The code amount increase makes me a bit sad, but I think NOT hiding count(*) queries etc. is definitely a net positive.
Until now we haven't used sqlparse's insert_(before|after) methods. They are documented at https://sqlparse.readthedocs.io/en/latest/analyzing/?highlight=insert_before#sqlparse.sql.TokenList.insert_before but I find the idx manipulation and those methods a bit disquieting. The test suite seems to cover the relevant cases, we are depending on sqlparse anyway, and it has been quite stable over the years. That's sufficient for me, but I want to wait a bit to see if anyone else has reservations about this change before I merge it.
Thanks!
Description
Various improvements to the code which formats SQL statements for the SQL panel. The majority are performance improvements, with an overall 35% reduction in formatting time on my system. However, there is also an improvement in indentation for queries which use the CASE keyword, and better simplification of queries generated by .count() querysets.
Checklist:
docs/changes.rst
.