HTML end tag seems to break ASCII conversion #54

divad · 2015-03-05T20:44:26Z

When using the ASCII conversion (say, together with shortcodes to images), ASCII smilies such as <3 are converted to ❤️. However, this only works if a space precedes the ASCII smiley, like so:

<p>text <3</p>

It doesn't work if there is no preceding space and/or the smiley directly follows an HTML tag bracket:

<p><3</p>

In that case the smiley remains as ASCII text rather than being converted to an image.


mjau-mjau · 2016-05-22T12:57:58Z

This bug still persists. ASCII conversion does not work for smilies immediately following or preceding an HTML tag. Strange that it works for mapping emoji short names, but not for ASCII. I made a dirty hack for it, since I had to map additional phpBB smilies anyway. Here is the concept, if anyone is interested in the general idea:

// create a map of ascii smiley conversions
var emoji_map = {
    ':oops:':':flushed:',
    ':D':':smiley:',
    ';)':':wink:',
    ':)':':slight_smile:',
    ':(':':disappointed:',
    ':o':':open_mouth:',
    ':shock:':':astonished:',
    ':?:':':grey_question:',
    ':!:':':grey_exclamation:',
    ':?':':confused:',
    '8)':':sunglasses:',
    ':lol:':':laughing:',
    ':x':':no_mouth:',
    ':P':':stuck_out_tongue:',
    ':evil:':':imp:',
    ':twisted:':':smiling_imp:',
    ':roll:':':rolling_eyes:',
    ':idea:':':bulb:',
    ':arrow:':':arrow_right:',
    ':mrgreen:':':alien:',
    ':|':':expressionless:'
};

// function to replace ascii with shortnames from the emoji_map
function match_smilies(str){
    return str.replace(/:oops:|:D|;\)|:\)|:\(|:o|:shock:|:\?:|:!:|:\?|8\)|:lol:|:x|:P|:evil:|:twisted:|:roll:|:idea:|:arrow:|:mrgreen:|:\|/g, function(matched){
      return emoji_map[matched];
    });
}

// apply emoji and mapped ascii
function map_smilies(){
    // loop html elements where to apply emoji
    $(".postbody").each(function() {
        $(this).html(emojione.shortnameToImage(match_smilies($(this).html())));
    });
}

// page load
$(function(){
    // not really necessary, as I am mapping chars already, but this will map additional ascii smilies that were not part of my phpbb
    emojione.ascii = true;

    // apply emoji and mapped ascii after page load
    map_smilies();
});

PS! The above only applies the fix to the items in the emoji_map object. I needed this for an existing forum, where those smilies were already part of the 'smilies' interface. If you need the fix for every item in the ascii-smileys list, you will have to populate the object yourself 😑
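If the map grows, keeping the hand-written alternation in match_smilies in sync with it gets tedious. A minimal sketch of generating the regex from the map's keys instead is shown below. The helper names escapeRegExp, buildSmileyRegex and replaceSmilies are hypothetical and not part of emojione; the sample map is only an illustration.

```javascript
// Hypothetical helpers (not part of emojione): derive the regex from the map keys.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one global alternation from all keys of the map.
function buildSmileyRegex(map) {
    var keys = Object.keys(map)
        // longest keys first, so ':?:' is tried before ':?'
        .sort(function (a, b) { return b.length - a.length; })
        .map(escapeRegExp);
    return new RegExp(keys.join('|'), 'g');
}

// Replace every smiley found in str with its mapped shortname.
function replaceSmilies(str, map) {
    if (Object.keys(map).length === 0) {
        return str; // nothing to replace, avoid building an empty regex
    }
    return str.replace(buildSmileyRegex(map), function (matched) {
        return map[matched];
    });
}

// Small sample map for illustration only.
var sample_map = {
    ':?:': ':grey_question:',
    ':?': ':confused:',
    ':)': ':slight_smile:'
};

console.log(replaceSmilies('<p>:)</p> why :?: ok :?', sample_map));
```

Sorting the keys longest-first plays the same role as the manual ordering in the regex above, where `:\?:` has to appear before `:\?`.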

caseyahenson · 2016-06-05T17:59:39Z

This issue requires a node-oriented solution to the replacement strategy, and we'll be working that into a future update.

@mjau-mjau Your solution is certainly effective, thank you for providing that! The root of the issue lies in the limitations of the method used to identify ASCII strings. Unfortunately, element tags are difficult to distinguish from ASCII smilies with a regex, so the regex was built to match only strings that follow a whitespace character (or the start of the string), like so:

ns.regAscii = new RegExp("<object[^>]*>.*?</object>|<span[^>]*>.*?</span>|<(?:object|embed|svg|img|div|span|p|a)[^>]*>|((\\s|^)"+ns.asciiRegexp+"(?=\\s|$|[!,.?]))", "g");

That could be replaced with the following regex to consider all matches, not just those with a leading space:

ns.regAscii = new RegExp("<object[^>]*>.*?</object>|<span[^>]*>.*?</span>|<(?:object|embed|svg|img|div|span|p|a)[^>]*>|("+ns.asciiRegexp+"(?=\\s|$|[!,.?<]))", "g");

Be aware that with this change you'll almost certainly run into the root issue described above: the regex can no longer reliably tell tags from ASCII smilies. Our upcoming solution, which loops through the nodes of the DOM, will allow a clear distinction between tags and text.
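To illustrate the idea of separating tags from text before replacing, here is a rough sketch that works on a plain string without a real DOM. It is not the planned emojione implementation: convertTextOnly, asciiPattern and asciiToShortname are made-up names, and the two-entry pattern is only a stand-in for ns.asciiRegexp.

```javascript
// Assumed stand-in for ns.asciiRegexp, covering just two smilies.
var asciiPattern = /<3|:\)/g;
var asciiToShortname = { '<3': ':heart:', ':)': ':slight_smile:' };

// Hypothetical helper: replace ASCII smilies only in text, never inside tags.
function convertTextOnly(html) {
    // Splitting on a capture group keeps the tags in the result:
    // even indexes are text segments, odd indexes are tags.
    // A tag must start with '<' followed by a letter or '/', so '<3' stays text.
    var parts = html.split(/(<[a-zA-Z\/][^>]*>)/);
    return parts.map(function (part, i) {
        if (i % 2 === 1) {
            return part; // a tag, leave untouched
        }
        return part.replace(asciiPattern, function (m) {
            return asciiToShortname[m];
        });
    }).join('');
}

console.log(convertTextOnly('<p><3</p>'));
console.log(convertTextOnly('<a href="x:)">:)</a>'));
```

Because the smiley lookup never sees the tags, no leading-space requirement is needed for the text segments. A real node-based version would walk text nodes in the DOM instead of splitting a string, which also copes with cases this simple split cannot, such as a '>' inside an attribute value.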

SirCumz · 2017-04-26T09:50:20Z

still not fixed :(

caseyahenson · 2017-05-02T22:24:50Z

A solution for this has been published in the 3.0.2 release. The object property riskyMatchAscii can be set to true, allowing all ASCII chars to be replaced regardless of whether they're space-char adjacent. While this isn't a perfect solution since it'll still break strings like 'c://', it should be a good starting point. Hopefully with feedback we can continue to improve this.
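The trade-off can be shown with a small standalone sketch. It does not call emojione; the two-entry smiley list, the shortnames and the function names are assumptions made for illustration, with ':/' standing in for whichever ASCII smiley collides with 'c://'.

```javascript
// Assumed two-entry ASCII list, for illustration only.
var smilies = { ':/': ':confused:', '<3': ':heart:' };
var core = ':\\/|<3';

// Conservative matching: the smiley must follow whitespace or the start of the string.
var strict = new RegExp('(\\s|^)(' + core + ')(?=\\s|$|[!,.?])', 'g');

// "Risky" matching: the smiley may appear anywhere.
var risky = new RegExp('(' + core + ')', 'g');

function strictConvert(str) {
    return str.replace(strict, function (match, lead, smiley) {
        return lead + smilies[smiley]; // keep the leading whitespace
    });
}

function riskyConvert(str) {
    return str.replace(risky, function (match) {
        return smilies[match];
    });
}

console.log(strictConvert('<p><3</p>')); // unchanged, the original bug
console.log(riskyConvert('<p><3</p>'));  // smiley next to a tag is converted
console.log(riskyConvert('c://temp'));   // but ':/' inside a path is converted too
```

With the library itself, the equivalent switch would presumably be set next to the ascii option, e.g. emojione.riskyMatchAscii = true, based on the property name given above.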

emitxyz added the bug label Apr 13, 2015

mikebe11 mentioned this issue Jul 2, 2015

Fix RegExp to match non-breaking space (U+C2A0) #115

Closed

This was referenced May 2, 2016

Compatibility with emojione plugin and the ASCII Smiley option kalvn/Shaarli-Material#47

Closed

HTML end tag breaks ASCII conversion NerosTie/emojione#5

Closed

caseyahenson mentioned this issue Apr 26, 2017

[PHP] &lt is ignored #485

Closed

tzar mentioned this issue Apr 3, 2018

Text conversion to Emoji broken? RocketChat/Rocket.Chat#9498

Closed
