Skip to content

IamJYC/sinaWeibo

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

#sinaWeibo

Crawl Weibo user and their tweets, and store in Mysql. Maybe Sina has modified the dom setting or ajax of their website, so my code may not work. But it's still can be a helpful demo

Usage

  • Build Mysql database and tables, execute sina_weibo.sql. Insert a line in table user, the id is short id (length=10) of Weibo. This is the root of breadth-first search.
  • Fill your cookies in cookies.txt.
  • Modify userCrawler.py, including Mysql config and MY_UID.
  • Runing userCrawler.py for 3~4 times. Each time is one level deeper of breadth-first search.

The weiboCrawler.py can be used in almost the same way as userCrawler.py. It will crawl tweets of users in table user.

Acknowledge

Based on work of https://github.com/chyanju/wCrawler, Thanks Yanju Chen.

About

Sina Weibo Crawler

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%