Bulbasaur was created to share the components that Preadly uses in its crawler operations. It is a module for crawler operations: each operation uses the Nokogiri XML parser, and Bulbasaur is just a helper that simplifies working with HTML.
Add this line to your application's Gemfile:
gem 'bulbasaur'
Or to get the latest updates:
gem 'bulbasaur', github: 'preadly/bulbasaur', branch: 'master'
And then execute:
$ bundle
Or install it yourself as:
$ gem install bulbasaur
Bulbasaur is divided into three groups of operations: Extracts, Replaces and Others.
Extracts is composed of the following operations:
- ExtractImagesFromHTML
- ExtractImagesFromYoutube
- ExtractImagesFromVimeo
- ExtractImagesFromAllResources
html = "<img src='test.jpg' alt='test' /><img src='test-2.jpg' alt='test' />"
images = ExtractImagesFromHTML.new(html).call
puts images # prints [{url: 'test.jpg', alt: 'test'}, {url: 'test-2.jpg', alt: 'test'}]
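To make the shape of the result concrete, here is a minimal standalone sketch of image extraction. It is not Bulbasaur's implementation: the library relies on Nokogiri, while this sketch swaps in plain regular expressions so it runs without any gems. The method name `sketch_extract_images` is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper for illustration only; it is NOT part of Bulbasaur.
# Assumption: attribute values are quoted with ' or ", as in the example above.
def sketch_extract_images(html)
  # Collect every <img ...> tag, then pull src and alt out of each one.
  html.scan(/<img\b[^>]*>/i).map do |tag|
    {
      url: tag[/\bsrc\s*=\s*(['"])(.*?)\1/i, 2],
      alt: tag[/\balt\s*=\s*(['"])(.*?)\1/i, 2] # nil when the tag has no alt
    }
  end
end

html = "<img src='test.jpg' alt='test' /><img src='test-2.jpg' alt='test' />"
p sketch_extract_images(html)
```

Regular expressions are fragile on real-world HTML, which is one reason a proper parser such as Nokogiri is the better choice in a crawler.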
Replaces is composed of the following operations:
- ReplacesByTagImage
- ReplacesByTagLink
html = "<img src='test.jpg' alt='test' />"
image_replaces = [{original_image_url: "test.jpg", url: "new-image.png"}]
ReplacesByTagImage.new(html, image_replaces).call
puts html # prints <img src='new-image.png' alt='test' />
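The replacement step can be sketched in the same standalone way. This is an approximation under stated assumptions, not the library's code: it uses regular expressions instead of Nokogiri, it returns a new string rather than changing its argument, and the method name `sketch_replace_image_urls` is hypothetical. Only the hash keys `original_image_url` and `url` are taken from the example above.

```ruby
# Hypothetical helper for illustration only; it is NOT part of Bulbasaur.
# Assumption: src values are quoted with ' or ".
def sketch_replace_image_urls(html, image_replaces)
  image_replaces.reduce(html) do |result, replace|
    original = Regexp.escape(replace[:original_image_url])
    # Group 1: everything up to the opening quote; group 2: the quote itself.
    pattern = /(<img\b[^>]*\bsrc\s*=\s*)(['"])#{original}\2/i
    result.gsub(pattern) do
      prefix = Regexp.last_match(1)
      quote = Regexp.last_match(2)
      "#{prefix}#{quote}#{replace[:url]}#{quote}"
    end
  end
end

html = "<img src='test.jpg' alt='test' />"
image_replaces = [{ original_image_url: 'test.jpg', url: 'new-image.png' }]
puts sketch_replace_image_urls(html, image_replaces)
# prints <img src='new-image.png' alt='test' />
```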
Others is composed of the following operation:
- NormalizeURL
base_url = 'http://github.com'
context_url = 'preadly'
url = NormalizeURL.new(base_url, context_url).call
puts url # prints http://github.com/preadly
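For comparison, the same kind of result can be obtained with `URI.join` from the Ruby standard library. This is only a sketch of URL normalization in general, not a description of how NormalizeURL works internally, and the method name `sketch_normalize_url` is hypothetical.

```ruby
require 'uri'

# Hypothetical helper for illustration only; it is NOT part of Bulbasaur.
def sketch_normalize_url(base_url, context_url)
  # URI.join resolves a relative reference against the base URL.
  # A context_url that is already absolute is returned unchanged.
  URI.join(base_url, context_url).to_s
end

puts sketch_normalize_url('http://github.com', 'preadly')
# prints http://github.com/preadly
puts sketch_normalize_url('http://github.com', 'http://example.com/a.png')
# prints http://example.com/a.png
```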
For more information about how these components work, run the specs with the "--format d" option:
rspec --format d --color
- Fork it ( https://github.com/preadly/bulbasaur )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request