web-scraping : "Madrid stock exchange" and "El Pais"

Learn and practice web scraping: hands-on with BeautifulSoup, Selenium and Scrapy.

To see the web version of this project click => HERE

In this project we are going to learn how to use three of the most widely used libraries for web scraping:

BeautifulSoup
Selenium
Scrapy

To learn how to use these libraries, we first extract information from the website of the Madrid stock exchange, and then extract economic information from the website of the newspaper El Pais (English version).
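As a rough illustration of the first step, here is a minimal BeautifulSoup sketch that pulls rows out of an HTML table. The markup, the `quotes` table id, the ticker names and the `parse_quotes` helper are all invented for this example; the real Madrid stock exchange page has its own structure, and in the notebook the HTML would come from an HTTP request rather than from a string.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a quotes page (NOT the real site).
SAMPLE_HTML = """
<table id="quotes">
  <tr><th>Name</th><th>Last</th></tr>
  <tr><td>ACME</td><td>12.50</td></tr>
  <tr><td>GLOBEX</td><td>7.25</td></tr>
</table>
"""


def parse_quotes(html):
    """Return one dict per data row of the (assumed) 'quotes' table."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="quotes")
    rows = []
    # Skip the first <tr>, which holds the column headers.
    for tr in table.find_all("tr")[1:]:
        name, last = (td.get_text(strip=True) for td in tr.find_all("td"))
        rows.append({"name": name, "last": float(last)})
    return rows


if __name__ == "__main__":
    for row in parse_quotes(SAMPLE_HTML):
        print(row)
```

The same pattern (locate a container, iterate over its children, extract text) carries over to a live page once the right tags and attributes have been found with the browser's inspector.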

Requirements:

To run this notebook, you need the following libraries installed:

beautifulsoup4==4.11.1
itemadapter==0.6.0
matplotlib==3.5.1
numpy==1.22.3
pandas==1.4.1
requests==2.27.1
Scrapy==2.6.1
selenium==4.2.0 (the browser I use with this library in this project is Firefox)
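To check an environment against the pinned versions above, something like the following sketch may help. It only uses the standard library (`importlib.metadata`, Python 3.8+); the `PINNED` dictionary and the `check_requirements` helper are names made up for this example and are not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the requirements list above.
PINNED = {
    "beautifulsoup4": "4.11.1",
    "itemadapter": "0.6.0",
    "matplotlib": "3.5.1",
    "numpy": "1.22.3",
    "pandas": "1.4.1",
    "requests": "2.27.1",
    "Scrapy": "2.6.1",
    "selenium": "4.2.0",
}


def check_requirements(pinned):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_requirements(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

A different installed version does not necessarily break the notebook; the check only reports where the environment differs from the one the project was written against.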

I have left the extracted articles in this repository (data.csv); they may be useful for an NLP project. If you find this project helpful, please star it.
