Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Searching like with bracket (, ) , [, ] #385

Closed
hkcity1111 opened this issue Aug 8, 2015 · 19 comments
Closed

Searching like with bracket (, ) , [, ] #385

hkcity1111 opened this issue Aug 8, 2015 · 19 comments

Comments

@hkcity1111
Copy link

I want to search following json by asql and the following problem found.
How can I change text to solve it.
Thanks!

var compListTest = [
    {"compDisplay" : "MGT. LTD.-SUNSHINE (N-R) [1535]"},
    {"compDisplay" : "INVESTMENT LIMITED/ [JC01]"},
    {"compDisplay" : "(METRO C. III) [2823]"}
];

// Problem 1: searching with ( or )
found=alasql("SELECT * FROM ? WHERE compDisplay like '%III) [2823]%' ",[compListTest]);
// Uncaught SyntaxError: Invalid regular expression:

// Problem 2: searching with [ ]
found=alasql("SELECT * FROM ? WHERE compDisplay like '%[1535]%' ",[compListTest]);
// should be MGT. LTD.-SUNSHINE (N-R) [1535]
@agershun
Copy link
Member

agershun commented Aug 8, 2015

I am not sure we can fix it before Monday. You can use indexOf() function to perform this search:

s -> indexOf("search pattern") > -1

@agershun agershun added this to the SQL milestone Aug 8, 2015
@mathiasrw
Copy link
Member

Thank you for a well described issue.

The LIKE search string is passed as a regular expression and it looks like we dont escape the regular expression special chars - so the ) is a problem. Same problem would come with a (

We must fix this - I have a feeling it must be in the same files as edited for this commit: efc33e6

Now - the [ and ] part is a bit different. The LIKE expression allows you to match on any character in the [...] brackets (for example, [abc] would match on a, b, or c characters). so for problem 2

alasql("SELECT * FROM ? WHERE compDisplay like '%[1535]%' ",[compListTest]);

you will get all results back as its searching for any string that contains 1, 5, or 3 or (and all the example data fulfills this)

A workaround for both issues would be to escape the [, ], (, and ). Normally that would be with a \\but we are having an issue that we have two layers of string handeling - so we need to use \\\\

Here is a working version of your example while we prepare a release that fixes the issues

var compListTest = [
    {"compDisplay" : "MGT. LTD.-SUNSHINE (N-R) [1535]"},
    {"compDisplay" : "INVESTMENT LIMITED/ [JC01]"},
    {"compDisplay" : "(METRO C. III) [2823]"}
];

// Problem 1: searching with ( or )
found=alasql("SELECT * FROM ? WHERE compDisplay like '%III\\\\) \\\\[2823\\\\]%' ",[compListTest]);
console.log(JSON.stringify(found))
// [{"compDisplay":"(METRO C. III) [2823]"}]

// Problem 2: searching with [ ]
found=alasql("SELECT * FROM ? WHERE compDisplay like '%\\\\[1535\\\\]%' ",[compListTest]);
// should be MGT. LTD.-SUNSHINE (N-R) [1535]
console.log(JSON.stringify(found))
//[{"compDisplay":"MGT. LTD.-SUNSHINE (N-R) [1535]"}] 

@agershun
Copy link
Member

agershun commented Aug 9, 2015

@mathiasrw We can do more strong version of LIKE (with only % and .) and include RegEx as special type. What do you think?

@mathiasrw
Copy link
Member

@agershun I think its important that LIKE acts like expected for SQL. We can maybe have a LIKE_REGEX to do a pure regexp search. We are not far :)

@hkcity1111
Copy link
Author

Thanks yours for prompt reply. I can use it from my solution first.
Many Many Thanks!

@agershun
Copy link
Member

Ok. We will reduce to this specification:
%
_
ESCAPE

See:

Lately we can extend it with [] and {}

@mathiasrw
Copy link
Member

Sounds great !

For SQL Server, from http://msdn.microsoft.com/en-us/library/ms179859.aspx :

  • % Any string of zero or more characters.

    WHERE title LIKE '%computer%' finds all book titles with the word 'computer' anywhere in the book title.

  • _ Any single character.

    WHERE au_fname LIKE '_ean' finds all four-letter first names that end with ean (Dean, Sean, and so on).

  • [^] Any single character within the specified range ([a-f]) or set ([abcdef]).

    WHERE au_lname LIKE '[C-P]arsen' finds author last names ending with arsen and starting with any single character between C and P, for example Carsen, Larsen, Karsen, and so on. In range searches, the characters included in the range may vary depending on the sorting rules of the collation.

we must remember to escape chars in [ ] as your can search for _ by LIKE "[_]"

@agershun
Copy link
Member

I found the SQL LIKE JavaScript function in SO, but is should be checked how it works before implementing into AlaSQL:

alasql.utils.like = function (pattern,value) {
    var specials = [
      '/', '.', '*', '+', '?', '|',
      '(', ')', '[', ']', '{', '}', '\\'
    ];
    var re  = new RegExp(
      '(\\' + specials.join('|\\') + ')', 'g'
    );
  }
  return pattern.replace(re, '\\$1').replace("%", ".*").replace("_", ".").match(value);
}

Now AlaSQL uses LIKE function in many statement

@mathiasrw
Copy link
Member

Looks as a good start.

Inputs for emprovements:

  • lets get the new Regexp out of the function so we dont need to initiate it every time
  • Lets add ^ and $ to special list

Future:

  • We need to remove the '[', ']' from the specials so we can still support the [ ] syntax.
  • We must make sure that ^ is not escaped if its the first char in [ ]
  • We must make sure % and _ are not replaced within a [ ]
  • Expand the function with an ESCAPE parameter

@agershun
Copy link
Member

@mathiasrw Thank you for detail analysis! Hope... I can do it ))

@mathiasrw
Copy link
Member

bughunter

@agershun
Copy link
Member

16141_one-does-not-simply-_demotivators_to

@mathiasrw
Copy link
Member

:)

@agershun
Copy link
Member

Just updated AlaSQL. Now LIKE works like in other SQL dialects. Now it properly supports:

    SELECT "abcdef" LIKE "%bc%"
    SELECT "abcdef" LIKE "_bc%"
    SELECT "abcdef" LIKE "[aq]bc%"
    SELECT "abcdef" LIKE "[^qw]bc%"
    SELECT "(abcdef)" LIKE "(%)"
    SELECT '[abc]' LIKE '![%!]' ESCAPE '!'

I m,odified test366 where replaced '?' to '_' and added ESCAPE clause to skipped tests.

Unfortunately, this relization is 'slow', because I wrote own parser and didnot use Regular Expressions to convert SQL search pattern to JS one.

As the next step we need to realize REGEXP operator and REGEXP_LIKE() function.

@mathiasrw
Copy link
Member

Woooohooooo!!!

Lets make it work before we optimise :)

@agershun
Copy link
Member

I think we can cache (memoize) patterns for this LIKE. In the future, not now ))

@mathiasrw
Copy link
Member

Good idea !

@mathiasrw
Copy link
Member

Closed per request from #628

@paulrutter
Copy link
Contributor

Like pattern cache was added in #1822.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants