Remove some strings #29
There are fundamentally two ways to go about it: focus on the content to keep, or discard unwanted content. I'm not sure which one makes more sense in the context you've given, so I'll describe both.

If you choose to focus on the content to keep, it looks like you're interested in the header and the paragraph that follows it. So you could do something like:

    HTMLDocument *document = /* load a document */;
    HTMLElement *h2 = [document firstNodeMatchingSelector:@"h2"];
    HTMLElement *relevantParagraph = [document firstNodeMatchingSelector:@"h2 + p"];

If you choose to discard unwanted content, you might do something like:

    HTMLDocument *document = /* load a document */;
    HTMLElement *img = [document firstNodeMatchingSelector:@"p > img"];
    HTMLElement *imageParagraph = img.parentElement;
    // Grab the parent of all these paragraphs for later.
    HTMLElement *parent = imageParagraph.parentElement;
    [imageParagraph removeFromParentNode];
    for (HTMLNode *node in parent.children) {
        // Children can include text and comment nodes; only elements have a tagName.
        if (![node isKindOfClass:[HTMLElement class]]) continue;
        HTMLElement *child = (HTMLElement *)node;
        // U+00A0 is non-breaking space, aka &nbsp;
        if ([child.tagName isEqualToString:@"p"] &&
            [child.textContent isEqualToString:@"\u00a0"])
        {
            [child removeFromParentNode];
        }
    }

These examples lean pretty heavily on assuming your document looks exactly like the context you've provided here, so you might need to make them a bit more general. Does that make sense?
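One way to make the second approach a bit more general is to drop every paragraph whose text is blank, rather than relying on the exact layout. The following is only a sketch under some assumptions: `RemoveBlankParagraphs` is a hypothetical helper name (not part of HTMLReader), the import path depends on how the library is integrated into your project, and "blank" is taken to mean empty, whitespace, or non-breaking spaces only.

```objc
#import <Foundation/Foundation.h>
#import <HTMLReader/HTMLReader.h> // assumed import path; adjust for your setup

// Hypothetical helper (not part of HTMLReader): removes every <p> whose text
// is empty or consists only of whitespace / non-breaking spaces.
// Returns the number of paragraphs removed.
static NSUInteger RemoveBlankParagraphs(HTMLDocument *document)
{
    // Whitespace and newlines, with U+00A0 (&nbsp;) added explicitly.
    NSMutableCharacterSet *blank =
        [[NSCharacterSet whitespaceAndNewlineCharacterSet] mutableCopy];
    [blank addCharactersInString:@"\u00a0"];

    NSUInteger removed = 0;
    // Loop over the array of matches, not over a parent's children,
    // so removing a paragraph does not disturb the iteration.
    for (HTMLElement *paragraph in [document nodesMatchingSelector:@"p"]) {
        NSString *text =
            [paragraph.textContent stringByTrimmingCharactersInSet:blank];
        if (text.length == 0) {
            [paragraph removeFromParentNode];
            removed++;
        }
    }
    return removed;
}
```

Calling `RemoveBlankParagraphs(document)` on a document created with `[HTMLDocument documentWithString:html]` would strip those paragraphs in place. Note that a paragraph containing only an image also has empty text, so this sketch removes it as well; add an extra check if such paragraphs should be kept.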
Thank you very much! I will experiment with both options.
@NBibikov did you ever solve this?
Hi! Please help me. I have read the docs but don't understand how to remove some strings. I have some HTML strings with different parts (aHirg7S8Zu0):
How can I delete the first line and all the nbsp (the second line)?
Thank you very much.