This project is now deprecated. Please use PHP Scraper.
(2022) Navigates to Amazon, searches for Samsung phones, and pulls the title and price data. I highly recommend working on Linux (including virtual machines) or macOS.
Before you scrape any website, read its robots.txt file, which you can access at domainname/robots.txt. It lists the paths that crawlers are allowed and disallowed to fetch. Do not violate the terms of service of any website you scrape.
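As a rough illustration of the robots.txt check described above, the following shell sketch lists the Disallow paths that apply to all crawlers. The function name `list_disallowed` and the example URL are assumptions made for this illustration, not part of the project, and the parser is deliberately simplified: it handles only `User-agent: *` groups and ignores `Allow` rules and wildcards.

```shell
# Hypothetical helper (not part of this project): print the Disallow paths
# from the "User-agent: *" groups of a robots.txt read on stdin.
list_disallowed() {
  awk '
    { sub(/\r$/, "") }                          # tolerate CRLF line endings
    /^[ \t]*$/ || /^[ \t]*#/ { next }           # skip blank lines and comments
    tolower($0) ~ /^user-agent:/ {
      agent = $0
      sub(/^[^:]*:[ \t]*/, "", agent)           # drop the "User-agent:" prefix
      sub(/[ \t]+$/, "", agent)                 # drop trailing whitespace
      if (!prev_ua) in_star = 0                 # a new group starts here
      if (agent == "*") in_star = 1
      prev_ua = 1
      next
    }
    { prev_ua = 0 }
    in_star && tolower($0) ~ /^disallow:/ {
      path = $0
      sub(/^[^:]*:[ \t]*/, "", path)            # drop the "Disallow:" prefix
      if (path != "") print path
    }
  '
}

# Example (not run here; the URL is a placeholder):
# curl -fsSL https://www.example.com/robots.txt | list_disallowed
```

An empty result means the file declares no blanket Disallow rules for all crawlers. It does not replace reading the site's terms of service.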
Getting up and running on Amazon EC2.
cp .env.example .env
touch database/database.sqlite
composer i
make dev
# optional
# make backend-migrate
# optional
# npm install
# npm run dev
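The commands in this section rely on several command-line tools being installed. A minimal pre-flight sketch, assuming a POSIX shell, is shown here; `require_tools` is a hypothetical helper name introduced for this example, not something the project ships.

```shell
# Hypothetical helper: report any required command-line tools that are
# not on PATH. Returns 0 when all are present, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here): tools used by the commands in this section.
# require_tools php composer make npm docker
```

Running the check first gives one list of missing tools instead of a failure partway through setup.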
docker build -t laravel-docker-aws .
docker run -it -p 8001:80 laravel-docker-aws
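Since the container publishes its port 80 on host port 8001, one way to confirm that it is serving requests is to poll that port from a second terminal. The sketch below is an assumption-laden example: `wait_for` is a hypothetical retry helper, and the root path `/` is assumed to return a successful response.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (not run here): poll the port published by `docker run -p 8001:80`.
# wait_for 10 2 curl -fsS http://localhost:8001/ && echo "container is up"
```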
Update the command in ./app/Console/Commands/BrowseAmazon.php, then run it:
php artisan browse:amazon
php artisan make:crawler crawler_test
alias sail='vendor/bin/sail'
sail dusk
Mail environment credentials are in the .env file.
The MailHog Docker image runs at http://localhost:8025.
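To check which mail settings are in effect, a small shell sketch like the following can read a single key from the .env file. `env_value` is a hypothetical helper, and `MAIL_HOST` in the usage line is Laravel's conventional variable name, assumed here rather than confirmed from this project's .env.example.

```shell
# Hypothetical helper: print the value of KEY from a dotenv-style file
# (default: .env). If the key appears more than once, the last one wins.
env_value() {
  key=$1
  file=${2:-.env}
  sed -n "s/^${key}=//p" "$file" \
    | tail -n 1 \
    | sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"   # strip surrounding quotes
}

# Example (not run here; MAIL_HOST is an assumed key name):
# env_value MAIL_HOST
```

This handles only simple `KEY=value` lines. It does not expand variables or parse inline comments.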
Using Laravel Dusk outside of tests.
Running ChromeDriver and Selenium in Python on an AWS EC2 Instance.
The Makefile for this project contains useful commands for a Laravel application and can be found at laravel-makefile.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.