-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Leading/trailing single spaces not preserved in node text #115
Comments
I have the same issue, it just trims leading and trailing spaces. Seems like the code which causes the issue is in
|
@serjant Good find. Thanks! Any maintainers around to weigh in on this? I'd propose a PR which removes the trim, effectively fixing the issue of removing legitimate leading/trailing whitespace. I'm happy to do this. If there are any potential issues or side effects with this method, please let me know. |
@nonara Likewise the trim() method is dangerous in that piece of code, because the trim() method removes whitespace from both ends of a string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.). See the specification |
@taoqf Just checking up on this. It's been close to a month without a reply, so I need to make a determination on what to do. At this point there are two proposals -
I would suggest the former, as it complies with spec and makes the most sense. I'm ready to do a PR for either with all necessary tests. I don't mean to rush, but as this is a dependency, I will either need to write a new parser, fork this one, or (ideally) get a PR through. Whatever you decide is fine, but I would greatly appreciate a response to advise on which route I should take to get this fixed in my library. Thanks! |
Sorry, I thought you would do a pr, now, I'll remove trim now. |
No worries. It's working well! Thanks! |
Looks like I was wrong here. I didn't notice that I had been using Unfortunately, it also calls Details and fix are in #123 - I took an approach which should also help restore some trim functionality that was removed in v3.2.2 |
Issue
Example:
This, unfortunately produces a TextNode whose text is
Hello
as opposed toHello<space>
Examining the whole text of the parent
p
isHelloworld!
Unfortunately this is irreparably breaking
node-html-markdown
, as I have no way to determine whether there is a leading or trailing space.see: crosstype/node-html-markdown#9
Proposal
In effort to be consistent with HTML spec, a single leading or trailing space should be captured, when present, in the text for any TextNode.
I believe this proposal would be consistent with the behaviour of other parsers. However, if you decide against that route, an alternate proposal would be to add properties to the node
hasLeadingSpace
/hasTrailingSpace
.Would be interested to hear your thoughts on these. Thanks!
The text was updated successfully, but these errors were encountered: