Hot Hub Scraper is a Node.js application that scrapes hot topics from Weibo and stores them in a PostgreSQL database.
-
Clone the repository:
git clone https://github.com/yourusername/hot-hub-scraper.git
cd hot-hub-scraper
-
Install dependencies:
npm ci
-
Install Playwright browsers:
npx playwright install --with-deps
-
Create a .env file based on the .env.sample file:
cp .env.sample .env
-
Update the .env file with your PostgreSQL database URL and Weibo URL.
-
Run the scraper:
npm start
-
Seed the database with historical data:
npm run seed
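Before running the scraper or the seed script, it can help to confirm that the .env step was completed. The sketch below is not part of the project; it assumes the two settings are named `DATABASE_URL` and `WEIBO_URL` (hypothetical names, check .env.sample for the real ones) and only verifies that each is present and parses as a URL.

```javascript
// Hypothetical variable names: check .env.sample for the real ones.
const REQUIRED_VARS = ['DATABASE_URL', 'WEIBO_URL'];

/**
 * Returns a list of human-readable problems found in `env`.
 * An empty list means every required variable is set and parses as a URL.
 */
function findConfigProblems(env) {
  const problems = [];
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is not set`);
      continue;
    }
    try {
      new URL(value); // throws TypeError on malformed input
    } catch {
      problems.push(`${name} is not a valid URL`);
    }
  }
  return problems;
}

// Example: a complete config passes, a partial one is reported.
console.log(findConfigProblems({
  DATABASE_URL: 'postgresql://user:pass@localhost:5432/hothub',
  WEIBO_URL: 'https://example.com/hot',
}));
console.log(findConfigProblems({ DATABASE_URL: 'not a url' }));
```

In a real run, `process.env` would be passed in place of the literal objects, once the .env file has been loaded by whatever mechanism the project uses.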
The database schema is defined in the scripts/wb_hot.sql file:
CREATE TABLE IF NOT EXISTS wb_hot (
  id SERIAL PRIMARY KEY,
  rank INT NOT NULL,
  title VARCHAR(255) NOT NULL,
  hot INT NOT NULL,
  tag VARCHAR(10),
  icon VARCHAR(10),
  created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
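To illustrate how a scraped topic maps onto this table, here is a minimal sketch that builds a parameterized INSERT statement. The `buildInsert` helper and the shape of the `topic` object are assumptions made for illustration, not the project's actual code. The checks mirror the column constraints above; `id` and `created_at` are left to PostgreSQL.

```javascript
// Column order mirrors the wb_hot table; id and created_at are filled in by PostgreSQL.
const COLUMNS = ['rank', 'title', 'hot', 'tag', 'icon'];

// Count Unicode code points, which is closer to how VARCHAR(n) counts characters
// than String.prototype.length (UTF-16 code units).
const charCount = (s) => Array.from(s).length;

/**
 * Builds a parameterized INSERT for one scraped topic (hypothetical helper).
 * Returns a { text, values } object.
 */
function buildInsert(topic) {
  if (!Number.isInteger(topic.rank) || !Number.isInteger(topic.hot)) {
    throw new TypeError('rank and hot must be integers');
  }
  if (typeof topic.title !== 'string' || charCount(topic.title) > 255) {
    throw new RangeError('title must be a string of at most 255 characters');
  }
  for (const key of ['tag', 'icon']) {
    if (topic[key] != null && charCount(String(topic[key])) > 10) {
      throw new RangeError(`${key} must be at most 10 characters`);
    }
  }
  const columnList = COLUMNS.map((c) => `"${c}"`).join(', ');
  const placeholders = COLUMNS.map((_, i) => `$${i + 1}`).join(', ');
  return {
    text: `INSERT INTO wb_hot (${columnList}) VALUES (${placeholders})`,
    // Missing optional fields (tag, icon) become SQL NULL.
    values: COLUMNS.map((c) => topic[c] ?? null),
  };
}

// Example with made-up data.
console.log(buildInsert({ rank: 1, title: '示例话题', hot: 123456, tag: '热' }));
```

If the project uses node-postgres (an assumption; the README does not name the driver), the returned object can be passed directly to `pool.query(...)`, which accepts a query config with `text` and `values`. Using placeholders instead of string concatenation keeps scraped titles from being interpreted as SQL.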
Contributions are welcome! Please open an issue or submit a pull request.