Extracting Data from web using python frameworks

KizMan-23/scraping

Scraping is an essential way of obtaining data, especially from the web. It offers an easy alternative when a site's APIs cannot be used or the site does not offer substantive APIs. This repository collects all the scraping tasks I have undertaken so far. Scraping can be a complex task in itself, as many websites are designed to resist scraping and bot activity.

NBA Scraping contains data scraped from the www.nba.com website. The National Basketball Association (NBA) is a prestigious American league for top professional basketball players. Obtaining this data was useful not just for trying out packages such as BeautifulSoup but also for training a model that could predict the Most Valuable Player (MVP) in future NBA seasons.

nba_2

In another application, training a model on previous team data can help predict a team's performance and standings in future seasons. This has a lucrative application in sports analysis and betting markets.

Web scraping is also a project on scraping basketball data, this time from the basketball-reference website.

web_scrape
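As a rough illustration of the kind of parsing such a project involves, the sketch below turns an HTML stats table into a list of records. It is not the code from this repository: it uses Python's standard-library `HTMLParser` in place of BeautifulSoup so that it runs without extra packages, and the names `TableParser`, `table_to_records` and `SAMPLE_HTML`, along with the sample rows, are made up for the example.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample markup; a real page would have to be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>PTS</th></tr>
  <tr><td>Player A</td><td>27.1</td></tr>
  <tr><td>Player B</td><td>24.8</td></tr>
</table>
"""

records = table_to_records(SAMPLE_HTML)
print(records)
```

Every value comes back as a string, so numeric columns such as `PTS` would still need converting before being used to train a model.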

Premier League Data follows in the same vein as the NBA data. The Premier League is the top English football league; it is made up of 20 teams and is one of the most watched sports leagues in the world.

pre-lg scrape

Data about this league is valuable not just for predicting which teams could finish at the top in a season but also for its wide application in the sports betting industry.

pre-lg-2

Tweeter Scraper is a bot system that can be used to scrape information and data from Twitter, now known as X. Scraping the X platform became tougher under its new owner, so I scraped as much information as the current system would allow.

tweet scrape

Scraping X is very important in today's Artificial Intelligence frenzy, as the platform holds a vast amount of information and discussion that cannot be obtained from any other social media platform. X is also among the platforms whose APIs are not sustainable for individuals and small businesses because of their pricing.

Walmart Products Scraped is a notebook bot system I used to scrape the Walmart website for different products and product information. This exercise ran up against a very complex system that Walmart developed to protect its sites against bot activity.

walmart

I was able to get around the system and extract the products and information I needed. Walmart, as one of the largest global e-commerce platforms, contains a great deal of information about a wide range of products. This product data can be used not just to learn more about the products but also to train models around them. A more comprehensive algorithm for scraping Walmart provides an alternative route to better results than the notebook alone offered.

walmart-2
