Extracting urls from a list of urls #70
Hey there! Not exactly sure what you are asking here. Could you explain a bit more? If you would like to crawl a list of URLs that you provide, there is documentation about that here: https://github.com/salimk/Rcrawler/#9-1--scrape-data-from-list-of-urls

The list is passed into the ContentScraper function in a method similar to crawling a single URL. Example provided:

listURLs <- c("http://www.thedermreview.com/la-prairie-reviews/",
              "http://www.thedermreview.com/arbonne-reviews/",
              "http://www.thedermreview.com/murad-reviews/")
Reviews <- ContentScraper(Url = listURLs,
                          CssPatterns = c(".entry-title", "#comments p"),
                          ManyPerPattern = TRUE)

Hi,
It would be really handy if one could simply pass a file path to a .txt or .csv that contains the URLs to scrape. What do you think about it?

Greetings from Berlin,
Evgeniy
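The file-based workflow asked about above can be sketched in base R: read the URLs into a character vector first, then pass that vector to ContentScraper as in the maintainer's example. The file name, the demo URLs, and the CSV column name `url` below are assumptions for illustration, not part of the package.

```r
# Minimal sketch: collect URLs from a plain-text file (one URL per line)
# into a character vector. The demo file stands in for your own urls.txt.
urlfile <- tempfile(fileext = ".txt")
writeLines(c("http://www.thedermreview.com/la-prairie-reviews/",
             "http://www.thedermreview.com/arbonne-reviews/"), urlfile)

# One URL per line -> character vector.
listURLs <- readLines(urlfile)

# For a CSV with a column named "url" (hypothetical layout):
# listURLs <- read.csv("urls.csv", stringsAsFactors = FALSE)$url

# Then, exactly as in the example above (requires the Rcrawler package):
# library(Rcrawler)
# Reviews <- ContentScraper(Url = listURLs,
#                           CssPatterns = c(".entry-title", "#comments p"),
#                           ManyPerPattern = TRUE)
print(listURLs)
```

Since ContentScraper already accepts a character vector of URLs, no change to the package is strictly needed; the file-reading step is a one-liner on the user's side.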
Is it generally possible to load a list of URLs from which Rcrawler then collects URLs?
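One hedged way to read this question: Rcrawler's crawling function takes a single start site per call, so crawling every site in a supplied list means looping over the vector. The seed URLs and crawl settings below are illustrative assumptions, not values from this thread, and the actual Rcrawler call is left commented.

```r
# Sketch: crawl each site from a user-supplied vector by calling the
# crawler once per seed. Seeds and settings here are examples only.
seeds <- c("http://www.thedermreview.com/", "http://example.com/")

for (site in seeds) {
  # Requires the Rcrawler package; MaxDepth / no_conn values are examples.
  # library(Rcrawler)
  # Rcrawler(Website = site, MaxDepth = 1, no_cores = 2, no_conn = 2)
  message("would crawl: ", site)
}
```

Note the distinction with the earlier example: ContentScraper scrapes exactly the pages you list, while a per-seed crawl discovers further URLs under each site.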