Extracting urls from a list of urls #70
Hey there! Not exactly sure what you are asking here. Could you explain a bit more? If you would like to crawl a list of URLs that you provide, there is documentation about that here: https://github.com/salimk/Rcrawler/#9-1--scrape-data-from-list-of-urls

The list is passed into the ContentScraper function in a method similar to crawling a single URL. Example provided:

listURLs <- c("http://www.thedermreview.com/la-prairie-reviews/",
              "http://www.thedermreview.com/arbonne-reviews/",
              "http://www.thedermreview.com/murad-reviews/")
Reviews <- ContentScraper(Url = listURLs,
                          CssPatterns = c(".entry-title", "#comments p"),
                          ManyPerPattern = TRUE)

Hi,
It would be really handy if one could simply pass a file path to a .txt or .csv that contains the URLs to scrape. What do you think about it?

Greetings from Berlin,
Evgeniy
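The file-based workflow asked about above can be sketched in base R: read the URLs into a character vector first, then pass that vector to ContentScraper as in the maintainer's example. The file name, the demo URLs, and the CSV column name `url` below are assumptions for illustration, not part of the package.

```r
# Minimal sketch: collect URLs from a plain-text file (one URL per line)
# into a character vector. The demo file stands in for your own urls.txt.
urlfile <- tempfile(fileext = ".txt")
writeLines(c("http://www.thedermreview.com/la-prairie-reviews/",
             "http://www.thedermreview.com/arbonne-reviews/"), urlfile)

# One URL per line -> character vector.
listURLs <- readLines(urlfile)

# For a CSV with a column named "url" (hypothetical layout):
# listURLs <- read.csv("urls.csv", stringsAsFactors = FALSE)$url

# Then, exactly as in the example above (requires the Rcrawler package):
# library(Rcrawler)
# Reviews <- ContentScraper(Url = listURLs,
#                           CssPatterns = c(".entry-title", "#comments p"),
#                           ManyPerPattern = TRUE)
print(listURLs)
```

Since ContentScraper already accepts a character vector of URLs, no change to the package is strictly needed; the file-reading step is a one-liner on the user's side.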
Is it generally possible to load a list of URLs from which Rcrawler then collects URLs?
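One hedged way to read this question: Rcrawler's crawling function takes a single start site per call, so crawling every site in a supplied list means looping over the vector. The seed URLs and crawl settings below are illustrative assumptions, not values from this thread, and the actual Rcrawler call is left commented.

```r
# Sketch: crawl each site from a user-supplied vector by calling the
# crawler once per seed. Seeds and settings here are examples only.
seeds <- c("http://www.thedermreview.com/", "http://example.com/")

for (site in seeds) {
  # Requires the Rcrawler package; MaxDepth / no_conn values are examples.
  # library(Rcrawler)
  # Rcrawler(Website = site, MaxDepth = 1, no_cores = 2, no_conn = 2)
  message("would crawl: ", site)
}
```

Note the distinction with the earlier example: ContentScraper scrapes exactly the pages you list, while a per-seed crawl discovers further URLs under each site.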