A simple scraper that captures text from posts on the Hong Kong forum LIHKG. It is a good starting point for anyone interested in web scraping. It is written in Python and uses the [Selenium] library.
-
Install the [selenium](https://pypi.org/project/selenium/) library, e.g. with `pip install selenium`.
-
Clone the repository:
git clone https://github.com/papatekken/simple-LIHKG-scraper-with-python LIHKG-scraper
-
From the root directory of `LIHKG-scraper`, run the following command to start the application. When the run finishes, a new text file containing the captured data is created.
The program expects the post ID as its argument, e.g. for post ID 1996060:
python hkg.py 1996060
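
For readers new to Selenium, the flow described above (take a post ID, load the thread in a browser, save the text to a file) could look roughly like the sketch below. This is not the code in `hkg.py`. The function names, the output file name, and the thread URL pattern `https://lihkg.com/thread/<post_id>/page/<n>` are all assumptions made for illustration, and the sketch assumes Chrome is installed.

```python
import os


def build_thread_url(post_id: str, page: int = 1) -> str:
    """Return the thread URL for a post ID (the URL pattern is an assumption)."""
    if not post_id.isdigit():
        raise ValueError(f"post ID must be numeric, got {post_id!r}")
    return f"https://lihkg.com/thread/{post_id}/page/{page}"


def fetch_page_text(url: str) -> str:
    """Open the URL with Selenium and return the visible text of the page."""
    # Imported here so the other helpers work even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes a local Chrome installation
    try:
        driver.get(url)
        # A real scraper would likely wait for the posts to render and
        # select individual post elements instead of the whole body.
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        driver.quit()  # always close the browser, even on errors


def save_text(post_id: str, text: str, out_dir: str = ".") -> str:
    """Write the captured text to <out_dir>/<post_id>.txt (hypothetical name)."""
    path = os.path.join(out_dir, f"{post_id}.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


def scrape(post_id: str) -> str:
    """Fetch the first page of a thread and save it; returns the file path."""
    return save_text(post_id, fetch_page_text(build_thread_url(post_id)))
```

Calling `scrape("1996060")` would open a browser window, load the first page of the thread, and write the visible text to `1996060.txt` in the current directory. Keeping the URL building and file writing separate from the browser step makes those parts easy to check without launching a browser.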
Created by @papatekken - feel free to contact me!