Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Fix LIKE with escapes Fix LIKE processing for patterns containing escapes - the starts_with / ends_with optimization did not correctly check for escapes when checking rest of the pattern for being literal or not - the pattern to regexp compiler incorrectly processed \ followed by a character other than % or _. In PostgreSQL '\x' pattern matches single 'x'. There are two tests - like_escape_many was generated using PostgreSQL with the code attached below for verification - like_escape is hand-picked test cases that are more interesting. Lower cardinality of hand-picked test cases allows for exercising all scalar/array vs scalar/array combinations. The below script isn't simples possible, because it was attempted to generate more test cases by adding padding. Hence e.g. is_like_without_dangling_escape. Since this is attached for reference, should be attached as-is. ```python import psycopg2 data = r""" \ \\ \\\ \\\\ a \a \\a % \% \\% %% \%% \\%% _ \_ \\_ __ \__ \\__ abc a_c a\bc a\_c %abc \%abc a\\_c% """.split('\n') data = list(dict.fromkeys(data)) conn = psycopg2.connect(host='localhost', port=5432, user='postgres', password='mysecretpassword') conn.set_session(autocommit=True) cursor = conn.cursor() for r in data: try: # PostgreSQL verifies dandling escape only sometimes cursor.execute(f"SELECT %s LIKE %s", (r, r)) is_like, = cursor.fetchone() has_dandling_escape = False pg_pattern = r except Exception as e: if 'LIKE pattern must not end with escape character' not in str(e): raise e has_dandling_escape = True pg_pattern = r + '\\' for l in data: # print() # print(' '.join(str(v) for v in (l, r, has_dandling_escape, postgres_pattern))) cursor.execute(f"SELECT %s LIKE %s", (l, pg_pattern)) is_like, = cursor.fetchone() assert type(is_like) is bool if not is_like and has_dandling_escape: pattern_without_escaped_dandling_escape = pg_pattern[:-2] cursor.execute(f"SELECT %s LIKE %s", (l, pattern_without_escaped_dandling_escape)) is_like_without_dangling_escape, = cursor.fetchone() assert type(is_like_without_dangling_escape) is bool else: is_like_without_dangling_escape = False assert '"' not in l assert '"' not in r print('(r"%s", r"%s", %s),' % ( l, r, str(is_like).lower(), # str(has_dandling_escape).lower(), # str(is_like_without_dangling_escape).lower(), )) ``` * Compact tests for regex_like Reduce test code boilerplate and make it easier to see what are the test cases. * Add more test cases for regex_like
- Loading branch information