getAggregateText function sticks together text between different nodes #29
Comments
Sorry for the massive delay on this. I'm investigating now.
A paragraph can be styled presentationally as an inline element in CSS. Likewise, an emphasis text run can be styled as a block element too. Two adjacent |
This is a tough one... I'm happy to make it configurable, but it seems like the default behaviour should be what people typically expect, and people probably don't typically expect |
So, yeah, defaulting to "everything is inline" makes sense. I'm just trying to figure out how to make this easily configurable and obvious.
The alternative is to assume everything is a block-level element by default, and then provide the opportunity to configure inline elements.
Do you mean with the boundary matcher |
Ok, so, this is what I've come up with (please let me know what you both think):
PROPOSAL
Everything stays the same as it currently is, by default. The If someone wants to explicitly disallow this, e.g. in the case of block-level
findAndReplaceDOMText(document.getElementById('test'), {
  find: RegExp('\\b' + 'highlighted' + '\\b', 'gi'),
  boundary: function(a, b) {
    return a.nodeName.toLowerCase() === 'p' || b.nodeName.toLowerCase() === 'p';
  },
  wrap: 'em'
});
You can also set
Additionally there will be an easy way to set ALL block-level borders as boundaries:
findAndReplaceDOMText(document.getElementById('test'), {
  find: RegExp('\\b' + 'highlighted' + '\\b', 'gi'),
  boundary: findAndReplaceDOMText.BLOCK_LEVEL_ELEMENTS,
  wrap: 'em'
});
(I imagine this will be the most common usage)
Would you care to use https://github.com/ethantw/fibre.js/blob/6857deac21/dist/fibre.js#L18-28 |
Oo, yep, didn't realize that had such good support. Still a shame about IE8.
Thanks for the answer. The second option is the best one, at least to me. That's what I'm looking for, if I understood you correctly.
their own matching contexts. E.g. useful with block-level elements like P and DIV. CC #29
Some progress: d41ae22 introduces something similar to my proposal above. Instead of 'boundaries' I've chosen the term 'contexts' and specifically a new option:
So, for the block-level use-case, you can do:
findAndReplaceDOMText(document.getElementById('test'), {
  find: RegExp('\\b' + 'highlighted' + '\\b', 'gi'),
  forceContext: findAndReplaceDOMText.BLOCK_LEVEL_MATCH,
  wrap: 'em'
});
This means that any block-level elements will create new matching contexts. The
You can set
If this looks good to you, I'll merge into master and push a new release. (Shall timeout in 48h and assume we're all happy to go ahead with it if I don't hear anything).
@weirdy: Here's your JSFiddle with the change added: https://jsfiddle.net/8dcqgcoa/1/
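To make the idea of "matching contexts" concrete, here is a minimal sketch that runs without a browser. It is not the library's implementation: the plain-object tree, `isBlockLevel`, `collectContexts` and `countMatches` are all hypothetical names made up for illustration. It only shows that forcing a new context at block-level elements keeps text from adjacent paragraphs apart, while text inside inline elements still aggregates.

```javascript
// Stand-in for a DOM tree: elements have nodeName + children, text nodes have text.
// The content is an assumption, not the markup from the JSFiddle.
const tree = {
  nodeName: 'DIV',
  children: [
    { nodeName: 'P', children: [{ text: 'Highlighted' }] },
    {
      nodeName: 'P',
      children: [
        { text: 'Phrase ' },
        { nodeName: 'EM', children: [{ text: 'high' }] },
        { text: 'lighted' }
      ]
    }
  ]
};

// Hypothetical predicate playing the role of a forceContext function.
const isBlockLevel = (el) => el.nodeName === 'P' || el.nodeName === 'DIV';

// Walk the tree and aggregate text. An element that forces a context
// starts a fresh text bucket on entry and another one on exit.
function collectContexts(node, forceContext, contexts = ['']) {
  if (node.text !== undefined) {
    contexts[contexts.length - 1] += node.text;
    return contexts;
  }
  const forced = forceContext(node);
  if (forced) contexts.push('');
  for (const child of node.children) {
    collectContexts(child, forceContext, contexts);
  }
  if (forced) contexts.push('');
  return contexts;
}

// Count whole-word matches across a list of context strings.
const countMatches = (texts) =>
  texts.reduce(
    (n, t) => n + (t.match(RegExp('\\b' + 'highlighted' + '\\b', 'gi')) || []).length,
    0
  );

const withoutContexts = collectContexts(tree, () => false);
const withContexts = collectContexts(tree, isBlockLevel).filter((t) => t !== '');

console.log(withoutContexts, countMatches(withoutContexts));
console.log(withContexts, countMatches(withContexts));
```

Note that the second paragraph still yields a match even though "highlighted" is split across an EM element, because inline elements do not start a new context in this sketch.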
Awesome! Thank you, good work. |
Great! Thanks for the significant fix. I have two questions though.
The Highlight<img src="./path"><span>xxx</span> |
Ah, good point; I'll review the list of block elements. Actually, I think what we want is not a list of block elements, per se, but instead a list of all elements that are not inline textual elements (including |
Agree. 👍 |
Until I find a better name for this it's gonna be called "NON_INLINE_PROSE":
findAndReplaceDOMText(document.getElementById('test'), {
  find: RegExp('\\b' + 'highlighted' + '\\b', 'gi'),
  forceContext: findAndReplaceDOMText.NON_INLINE_PROSE,
  wrap: 'em'
});
This will force a context on all of the following elements:
NON_PROSE_ELEMENTS = {
  // Block Elements
  address:1, article:1, aside:1, blockquote:1, canvas:1, dd:1, div:1,
  dl:1, fieldset:1, figcaption:1, figure:1, footer:1, form:1, h1:1, h2:1, h3:1,
  h4:1, h5:1, h6:1, header:1, hgroup:1, hr:1, main:1, nav:1, noscript:1, ol:1,
  output:1, p:1, pre:1, section:1, ul:1,
  // Other misc. elements that are not part of continuous inline prose:
  br:1, li: 1, summary: 1, dt:1, details:1,
  // Media / Source elements:
  script:1, style:1, img:1, video:1, audio:1, canvas:1, svg:1, map:1, object:1,
  // Input elements
  input:1, textarea:1, select:1, option:1, optgroup: 1, button:1,
  // Table related elements:
  table:1, tbody:1, thead:1, th:1, tr:1, td:1, caption:1, col:1, tfoot:1, colgroup:1
};
It's probably missing a few (let me know if you notice any). Thanks!
PR happening in #30 (shall merge soon I hope). I also need to add a chunk to the readme to explain what on earth this is all about :P
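As a rough illustration of how a tag-name map like the one above could be turned into a function suitable for a `forceContext`-style option, here is a small sketch. The helper `makeTagPredicate` and the abbreviated map `NON_PROSE_SUBSET` are hypothetical and not taken from the library; only a handful of the listed tags are included to keep it short.

```javascript
// Abbreviated, illustrative subset of the element map quoted above.
const NON_PROSE_SUBSET = { p: 1, div: 1, li: 1, br: 1, img: 1, td: 1 };

// Hypothetical helper: build a predicate that reports whether an element's
// tag name is an own key of the given map.
function makeTagPredicate(tagMap) {
  return function (el) {
    // hasOwnProperty avoids false positives from inherited keys
    // such as "constructor" or "toString".
    return Object.prototype.hasOwnProperty.call(tagMap, el.nodeName.toLowerCase());
  };
}

const forcesContext = makeTagPredicate(NON_PROSE_SUBSET);

// Plain objects stand in for DOM elements here; in a browser,
// element.nodeName is upper-case for HTML elements.
console.log(forcesContext({ nodeName: 'P' }));
console.log(forcesContext({ nodeName: 'EM' }));
console.log(forcesContext({ nodeName: 'CONSTRUCTOR' }));
```

Using an own-property check rather than the `in` operator is a deliberate choice in this sketch: with `in`, an unknown or custom element whose name happens to collide with an inherited property of the map would wrongly be treated as a context boundary.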
Awesome, I think that's comprehensible enough. 👍
I would like to suggest we add He invented <ruby>WWW<rp>(<rt>world wide web<rp>)</ruby> that helps people connect to one another. Well, maybe they belong to those who need ignoring instead? As for And what do you think about the elements designed for web component such as |
Cool, I've added
Right now there's no easy way to filter out non-prose elements completely. So, even though they can now have "forced contexts" via
So... I think it'd be worth adding a "prose" preset, like:
findAndReplaceDOMText(..., {
  preset: 'prose'
});
And it could do the following:
It would also avoid
Seems like this is the most sensible/common usage of the lib, and it wouldn't be a breaking change since it's a brand new option. Thoughts?
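One way a `preset` option can stay non-breaking is to treat the preset as a bundle of defaults that explicit options override. The sketch below shows that pattern only. The thread does not show what the preset would contain, so the two entries used here (`forceContext` and `filterElements`), the `PRESETS` table and the `resolveOptions` helper are all assumptions for illustration, not the library's code.

```javascript
// Hypothetical preset table; the contents are assumptions.
const PRESETS = {
  prose: {
    // Start a new matching context at a few block-level tags.
    forceContext: (el) => ['p', 'div', 'li'].includes(el.nodeName.toLowerCase()),
    // Skip elements that never hold prose.
    filterElements: (el) => !['script', 'style'].includes(el.nodeName.toLowerCase())
  }
};

// Hypothetical helper: merge preset defaults with the caller's options.
// Options passed explicitly win over anything the preset supplies.
function resolveOptions(options, presets) {
  const preset = options.preset ? presets[options.preset] : undefined;
  if (options.preset && !preset) {
    throw new Error('Unknown preset: ' + options.preset);
  }
  return Object.assign({}, preset, options);
}

const resolved = resolveOptions({ preset: 'prose', wrap: 'em' }, PRESETS);
console.log(Object.keys(resolved).sort());

// An explicit forceContext replaces the one from the preset.
const never = () => false;
const overridden = resolveOptions({ preset: 'prose', forceContext: never }, PRESETS);
console.log(overridden.forceContext === never);
```

Because callers who never pass `preset` get their options back unchanged, existing code keeps its current behaviour under this scheme.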
Excellent! I thought styles, scripts, etc. were already skipped by your function. If not, it would be a good option.
Agree. 🌈 |
Ok, all shipped ( |
Hi
Please, look at my test case
https://jsfiddle.net/weirrrdy/xLcfnrpk/
I'd like to match the exact word "Highlighted", which is why the regular expression is wrapped in the "\b" metacharacter. Strictly speaking, '\b' isn't necessary in this case, but I use it in my application, so the test case needs it.
The expected behaviour here is that both 'highlighted' words are selected; however, only the second one is matched.
The problem is that the getAggregateText function combines the text of the first two paragraphs into "HighlightedPhrase", so the regular expression skips the word "Highlighted".
I would really appreciate it if you could fix this.
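The failure can be reproduced without the DOM. The snippet below is a minimal sketch with made-up paragraph text (the actual fiddle markup is not reproduced here); it shows how joining the text of adjacent paragraphs with no separator removes the word boundary that `\b` relies on.

```javascript
// Hypothetical text of two adjacent <p> elements.
const paragraphTexts = ['Highlighted', 'Phrase with a highlighted word'];

// Same pattern as in the report; a fresh RegExp per use avoids shared lastIndex state.
const makePattern = () => RegExp('\\b' + 'highlighted' + '\\b', 'gi');

// Aggregating across paragraphs with no separator fuses "Highlighted" and "Phrase".
const aggregated = paragraphTexts.join('');
const aggregatedMatches = aggregated.match(makePattern()) || [];

// Matching each paragraph on its own keeps the word boundaries intact.
const perParagraphMatches = paragraphTexts.flatMap(
  (text) => text.match(makePattern()) || []
);

console.log(aggregated);
console.log(aggregatedMatches.length);
console.log(perParagraphMatches.length);
```

In the aggregated string the first "Highlighted" is followed directly by "P", a word character, so `\b` cannot match there and only the later occurrence is found.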