Regular expression not working (all characters are shifted) #52

Open
dataexcess opened this issue Apr 11, 2021 · 0 comments

Hi there,

I am using a regular expression to find the URL of the "Visually similar" button on Google.
This is the regex I use:
"href=((?:(?!href).)*?)>Vis"

It works perfectly when tested on https://regexr.com/.
Here is some example text to match:


enu-panel" role="menu" tabindex="-1" jsaction="keydown:Xiq7wd;mouseover:pKPowd;mouseout:O9bKS" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQqR8wAXoECAMQBQ"><li class="action-menu-item" role="menuitem"><a class="fl" href="https://webcache.googleusercontent.com/search?q=cache:8lDNWm_duSMJ:https://www.trustedshops.com/+&amp;cd=2&amp;hl=en&amp;ct=clnk&amp;gl=de\" ping="/url?sa=t&source=web&rct=j&url=https://webcache.googleusercontent.com/search%3Fq%3Dcache:8lDNWm_duSMJ:https://www.trustedshops.com/%2B%26cd%3D2%26hl%3Den%26ct%3Dclnk%26gl%3Dde&amp;ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQIDABegQIAxAG\">Cached<div class="IsZvec"><span class="aCOpRe">Trusted Shops is the European Trustmark for online shops with money-back guarantee for consumers. Trusted Shops offers a comprehensive service to raise ...<div class="ULSxyf"><div jsmodel="gpo5Gf" class="LnbJhc" data-count="28" style="position:relative" data-iu="1" data-hveid="CAIQAA" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQ8w0oAHoECAIQAA"><div class="e2BEnf U7izfe mfMhoc"><a class="ekf0x hSQtef" href="/search?tbs=simg:CAESiQIJQs8eCt9yzs0a_1QELELCMpwgaOgo4CAQSFNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&q=trusted+shop&tbm=isch&sa=X&ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQjJkEegQIAhAB"><div class="iv236"><span class="iJddsb" style="height:20px;width:20px"><svg focusable="false" viewbox="0 0 24 24"><path d="M14 13l4 5H6l4-4 1.79 1.78L14 13zm-6.01-2.99A2 2 0 0 0 8 6a2 2 0 0 0-.01 4.01zM22 5v14a3 3 0 0 1-3 2.99H5c-1.64 0-3-1.36-3-3V5c0-1.64 1.36-3 3-3h14c1.65 0 3 1.36 3 3zm-2.01 0a1 1 0 0 0-1-1H5a1 1 0 0 0-1 1v14a1 1 0 0 0 1 1h7v-.01h7a1 1 0 0 0 1-1V5"><div class="iJ1Kvb"><h3 class="GmE3X" aria-level="2" role="heading">Visually similar images<div 
style="padding-bottom:0" id="iur">

<div jsmodel="" jscontroller="IkchZc" jsaction="PdWSXe:h5M12e;rcuQ6b:npT2md" jsdata="X2sNs;;CiOOHU"><div data-h="130" data-nr="4" style="margin-right:-2px;margin-bottom:-2px"><div jsname="dTDiAc" class="eA0Zlc qN5nNb tapJqb ivg-i" data-docid="DrX4TNBpITAGoM" jsdata="XZxcdf;DrX4TNBpITAGoM;CiOOJI" data-ved="2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQ5r0BegQIIRAA"><a href="/search?q=trusted+shop&tbm=isch&source=iu&ictx=1&tbs=simg:CAESiQIJQs8eCt9yzs0a_1QELELCMpwgaOgo4CAQSFNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&fir=DrX4TNBpITAGoM%252CO1KPcBx95JlNxM%252C&vet=1&usg=AI4_-


When I use this exact same regex with your Swift Regex library, I do not get the expected result. Instead I get the following:


FNcy_1TP2GfwQ9zXEDcYqnTbjEq4kGhqVVToFJUcTvzott-6Sl5Qp4R6jBL5G5bKsuyAFMAQMCxCOrv4IGgoKCAgBEgTTC8hDDAsQne3BCRqdAQofCgxvZmZpY2UgY2hhaXLapYj2AwsKCS9tLzA4cTF4cAofCgxzd2l2ZWwgY2hhaXLapYj2AwsKCS9tLzBncTZreAoiCg9mdXJuaXR1cmUgc3R5bGXapYj2AwsKCS9qLzl3MHFqcwobCghmb3IgdGVlbtqliPYDCwoJL2EvNnEzMDY3ChgKBXNvbGlk2qWI9gMLCgkvYS8zbWcxY20M&q=trusted+shop&tbm=isch&sa=X&ved=2ahUKEwjRitfP9vXvAhXf_7sIHViuBewQjJkEegQIAhAB"><div class="iv236"><span class="iJddsb" style="height:20px;width:20px"><svg focusable="false" viewbox="0 0 24 24"><path d="M14 13l4 5H6l4-4 1.79 1.78L14 13zm-6.01-2.99A2 2 0 0 0 8 6a2 2 0 0 0-.01 4.01zM22 5v14a3 3 0 0 1-3 2.99H5c-1.64 0-3-1.36-3-3V5c0-1.64 1.36-3 3-3h14c1.65 0 3 1.36 3 3zm-2.01 0a1 1 0 0 0-1-1H5a1 1 0 0 0-1 1v14a1 1 0 0 0 1 1h7v-.01h7a1 1 0 0 0 1-1V5">

<div class="iJ1Kvb"><h3 class="GmE3X" aria-level="2" role="heading">Visually similar images<div s


As you can see, the capture somehow extends past the literal ">Vis" that follows the capture group. In addition, many characters are missing from the start of the expected capture: everything directly after "href=".

I have tried rewriting my regex many times, but since it is confirmed to work in regex testers, I can only conclude that something is wrong with this Regex CocoaPod.
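
For comparison, here is a minimal sketch that runs the same pattern directly through Foundation's NSRegularExpression. The start and the end of the capture are both shifted in the same direction, which would be consistent with a UTF-16 offset (NSRange) being applied as a Character offset somewhere, but that is only a guess and has not been confirmed against the pod's source. The helper name `firstCapture` and the sample string are made up for illustration and are not part of the pod.

```swift
import Foundation

// Hypothetical helper (not part of the Regex pod): returns capture group 1
// of the first match, or nil when the pattern is invalid or does not match.
func firstCapture(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    // NSRange counts UTF-16 code units, so derive it from the String itself.
    let whole = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole) else { return nil }
    // Convert the NSRange back to String indices rather than offsetting
    // from startIndex by a Character count.
    guard let range = Range(match.range(at: 1), in: text) else { return nil }
    return String(text[range])
}

// Made-up sample text: each emoji is one Character but two UTF-16 code units,
// which is the kind of input where the two offset schemes disagree.
let sample = "😀😀 <a class=\"x\" href=\"/search?tbs=simg\">Visually similar"
let pattern = "href=((?:(?!href).)*?)>Vis"
print(firstCapture(of: pattern, in: sample) ?? "no match")
```

If this sketch returns the expected capture on the real page source while the pod does not, that would point at range conversion inside the pod rather than at the pattern.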

Please help! Thank you
A link to the regex helper tool: regexr.com/5qfv4
