Skip to content
This repository has been archived by the owner on Jul 23, 2019. It is now read-only.

HTML end tag seems to break ASCII conversion #54

Open
divad opened this issue Mar 5, 2015 · 4 comments
Open

HTML end tag seems to break ASCII conversion #54

divad opened this issue Mar 5, 2015 · 4 comments
Labels

Comments

@divad
Copy link

divad commented Mar 5, 2015

When using the ASCII conversion (on say shortcodes to images) the ascii codes, e.g. <3 are converted to ❤️. However, this only works if a space precedes the ascii smiley like so:

<p>text <3</p>

It doesn't work if there is no space and/or it follows a HTML bracket:

<p><3</p>

This remains as the ascii rather than an image.

@mjau-mjau
Copy link

mjau-mjau commented May 22, 2016

This bug still persists. ASCII conversion does not work for smilies immediately succeeding or preceding a html tag. Strange that it works for mapping emoji short names, but not for ascii. I made a dirty hack for it, since I had to map additional PHPBB smilies anyway. Here is the concept if anyone is interesting in the general idea:

// create a map of ascii smiley conversions
var emoji_map = {
    ':oops:':':flushed:',
    ':D':':smiley:',
    ';)':':wink:',
    ':)':':slight_smile:',
    ':(':':disappointed:',
    ':o':':open_mouth:',
    ':shock:':':astonished:',
    ':?:':':grey_question:',
    ':!:':':grey_exclamation:',
    ':?':':confused:',
    '8)':':sunglasses:',
    ':lol:':':laughing:',
    ':x':':no_mouth:',
    ':P':':stuck_out_tongue:',
    ':evil:':':imp:',
    ':twisted:':':smiling_imp:',
    ':roll:':':rolling_eyes:',
    ':idea:':':bulb:',
    ':arrow:':':arrow_right:',
    ':mrgreen:':':alien:',
    ':|':':expressionless:'
};

// function to replace ascii with shortnames from the emoji_map
function match_smilies(str){
    return str.replace(/:oops:|:D|;\)|:\)|:\(|:o|:shock:|:\?:|:!:|:\?|8\)|:lol:|:x|:P|:evil:|:twisted:|:roll:|:idea:|:arrow:|:mrgreen:|:\|/g, function(matched){
      return emoji_map[matched];
    });
}

// apply emoji and mapped ascii
function map_smilies(){
    // loop html elements where to apply emoji
    $(".postbody").each(function() {
    $(this).html(emojione.shortnameToImage(match_smilies($(this).html())));
    });
}

// page load
$(function(){
    // not really necessary, as I am mapping chars already, but this will map additional ascii smilies that were not part of my phpbb
    emojione.ascii = true;

    // apply emoji and mapped ascii after page load
    map_smilies();
});

PS! The above will only apply fix for the items in the emoji_map object. I needed a fix for this for an existing forum, because they were part of the existing 'smilies' interface. If you need to apply the fix for all items in the ascii-smileys list, you will have to populate the object 😑

@caseyahenson
Copy link
Contributor

This issue requires a node-oriented solution to the replacement strategy, and we'll be working that into a future update.

@mjau-mjau Your solution is certainly effective, thank you for providing that! The root of the issue is in the limitations of the methodology used to identify ASCII strings. Unfortunately, element tags are difficult to distinguish from ASCII when doing regex so for that reason the regex was built to search only for strings that exist after the space character like so:

ns.regAscii = new RegExp("<object[^>]>.?</object>|<span[^>]>.?</span>|<(?:object|embed|svg|img|div|span|p|a)[^>]*>|((\s|^)"+ns.asciiRegexp+"(?=\s|$|[!,.?]))", "g");

That could be replaced with the following regex to consider all matches, not just those with a leading space:

ns.regAscii = new RegExp("<object[^>]>.?</object>|<span[^>]>.?</span>|<(?:object|embed|svg|img|div|span|p|a)[^>]*>|("+ns.asciiRegexp+"(?=\s|$|[!,.?<]))", "g");

You'll almost certainly run into the root issue here when doing this. Instead, our upcoming solution of looping through nodes on the dom will allow for the clear distinction between tags and text.

@SirCumz
Copy link

SirCumz commented Apr 26, 2017

still not fixed :(

@caseyahenson
Copy link
Contributor

caseyahenson commented May 2, 2017

A solution for this has been published in the 3.0.2 release. The object property riskyMatchAscii can be set to true, allowing all ASCII chars to be replaced regardless of whether they're space-char adjacent. While this isn't a perfect solution since it'll still break strings like 'c://', it should be a good starting point. Hopefully with feedback we can continue to improve this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants