Find the hero (main) image of a URL
See a demo on https://hero-scrape.viktor-braun.de
hero-scrape extracts the main image of a webpage. It uses different strategies to find the main image (OpenGraph HTML tags and a heuristic search). You can use the existing strategies or implement your own.
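The strategy idea can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the `Strategy` interface, `ogStrategy`, `firstImgStrategy`, and `findHero` are hypothetical names, not hero-scrape's actual API, and the "first `<img>` wins" rule is only a naive stand-in for a real heuristic. It also assumes strategies are tried in order and the first hit is returned.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Strategy is a hypothetical interface for this sketch; it is not
// hero-scrape's real strategy type.
type Strategy interface {
	FindImage(html string) (string, bool)
}

// ogStrategy looks for an OpenGraph <meta property="og:image"> tag.
// The regexp is a simplification: it only matches when "property"
// appears before "content" inside the tag.
type ogStrategy struct{}

var ogImageRe = regexp.MustCompile(`(?i)<meta[^>]+property=["']og:image["'][^>]+content=["']([^"']+)["']`)

func (ogStrategy) FindImage(html string) (string, bool) {
	m := ogImageRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// firstImgStrategy is a deliberately naive heuristic stand-in:
// it returns the src of the first <img> tag it finds.
type firstImgStrategy struct{}

var imgSrcRe = regexp.MustCompile(`(?i)<img[^>]+src=["']([^"']+)["']`)

func (firstImgStrategy) FindImage(html string) (string, bool) {
	m := imgSrcRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// findHero tries each strategy in order and returns the first hit.
func findHero(html string, strategies ...Strategy) (string, error) {
	for _, s := range strategies {
		if img, ok := s.FindImage(html); ok {
			return img, nil
		}
	}
	return "", errors.New("no image found")
}

func main() {
	page := `<html><head><meta property="og:image" content="https://example.com/hero.png"></head>` +
		`<body><img src="/logo.png"></body></html>`
	img, err := findHero(page, ogStrategy{}, firstImgStrategy{})
	fmt.Println(img, err)
}
```

Putting the OpenGraph lookup first means an explicit `og:image` declaration takes precedence, and the heuristic only runs as a fallback.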
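Finding the "biggest" image requires knowing each candidate's dimensions, which normally means downloading it. fastimage is a good fit for that job because it determines the size of a remote image by fetching as little as needed.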
go get github.com/v-braun/hero-scrape
With preconfigured strategies
pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()
result, _ := heroscrape.Scrape(pageUrl, res.Body)
fmt.Println(result.Image)
With custom strategies
pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()
result, _ := heroscrape.ScrapeWithStrategy(pageUrl, res.Body, NewOgStrategy(), NewHeuristicStrategy(), YourOwnStrategy())
fmt.Println(result.Image)
- hero-scrape: Demo for this lib
- fastimage: Finds the type and/or size of a remote image given its URI, by fetching as little as needed.
- goquery: A little like that j-thing, only in Go.
If you discover any bugs, feel free to create an issue on GitHub, or fork the repository and send me a pull request.
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request
See LICENSE.