Hot Hub Scraper

Hot Hub Scraper is a Node.js application that scrapes hot topics from Weibo and stores them in a PostgreSQL database.

UI

Installation

Clone the repository:

```
git clone https://github.com/yourusername/hot-hub-scraper.git
cd hot-hub-scraper
```

Install dependencies:
```
npm ci
```
Install Playwright browsers:
```
npx playwright install --with-deps
```

Usage

Create a .env file based on the .env.sample file:
```
cp .env.sample .env
```
Update the .env file with your PostgreSQL database URL and Weibo URL.
Run the scraper:
```
npm start
```
Seed the database with historical data:
```
npm run seed
```
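
The `.env` step above can be sanity-checked before the scraper starts. The following is a minimal sketch, not the project's actual code: the variable names `DATABASE_URL` and `WEIBO_URL` and the helper `readConfig` are assumptions made for illustration, so check `.env.sample` for the real keys. It only verifies that both values are present and parse as URLs.

```javascript
// Hypothetical helper: validate the two settings described in the Usage section.
// DATABASE_URL and WEIBO_URL are assumed names; the real keys are in .env.sample.
function readConfig(env = process.env) {
  const config = {};
  for (const name of ["DATABASE_URL", "WEIBO_URL"]) {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing ${name}; set it in your .env file`);
    }
    // new URL() throws a TypeError if the value is not a well-formed URL.
    config[name] = new URL(value);
  }
  // PostgreSQL connection strings use the postgres: or postgresql: scheme.
  const scheme = config.DATABASE_URL.protocol;
  if (scheme !== "postgres:" && scheme !== "postgresql:") {
    throw new Error(`DATABASE_URL should be a PostgreSQL URL, got scheme ${scheme}`);
  }
  return config;
}
```

Calling `readConfig()` with no arguments reads `process.env`, so it assumes the `.env` file has already been loaded by whatever mechanism the project uses.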

Database Schema

The database schema is defined in the scripts/wb_hot.sql file:

```
CREATE TABLE IF NOT EXISTS wb_hot (
    id SERIAL PRIMARY KEY,
    rank INT NOT NULL,
    title VARCHAR(255) NOT NULL,
    hot INT NOT NULL,
    tag VARCHAR(10),
    icon VARCHAR(10),
    created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
```
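
To illustrate how scraped rows map onto this schema, here is a minimal sketch of a helper that builds one parameterized multi-row `INSERT` for `wb_hot`. The function name `buildInsert` and the shape of the row objects are assumptions for illustration and may not match what `index.js` or `db.js` actually do. The `id` and `created_at` columns are omitted so the database fills in their defaults.

```javascript
// Hypothetical helper: turn scraped rows into one parameterized INSERT for wb_hot.
// Column names and limits come from the schema above.
const COLUMNS = ["rank", "title", "hot", "tag", "icon"];
const MAX_LENGTH = { title: 255, tag: 10, icon: 10 };

function buildInsert(rows) {
  if (rows.length === 0) {
    throw new Error("buildInsert needs at least one row");
  }
  const values = [];
  const tuples = rows.map((row, i) => {
    // rank and hot are INT NOT NULL; title is VARCHAR(255) NOT NULL.
    if (!Number.isInteger(row.rank) || !Number.isInteger(row.hot)) {
      throw new Error(`Row ${i}: rank and hot must be integers`);
    }
    if (typeof row.title !== "string" || row.title.length === 0) {
      throw new Error(`Row ${i}: title is required`);
    }
    for (const [col, max] of Object.entries(MAX_LENGTH)) {
      // VARCHAR(n) limits characters, so count code points, not UTF-16 units.
      if (row[col] != null && [...row[col]].length > max) {
        throw new Error(`Row ${i}: ${col} is longer than ${max} characters`);
      }
    }
    const placeholders = COLUMNS.map((col, j) => {
      values.push(row[col] ?? null); // nullable tag and icon default to NULL
      return `$${i * COLUMNS.length + j + 1}`;
    });
    return `(${placeholders.join(", ")})`;
  });
  const text =
    `INSERT INTO wb_hot (${COLUMNS.join(", ")}) VALUES ${tuples.join(", ")}`;
  return { text, values };
}
```

The returned `{ text, values }` object uses numbered `$1, $2, ...` placeholders, which is the form PostgreSQL expects for parameterized queries. If the project uses node-postgres (an assumption, since the source does not name the driver), the object could be passed directly to its `query` method.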

Contributing

Contributions are welcome! Please open an issue or submit a pull request.
