
NodeJS backend that uses a Dockerized Headless Chrome to scrape light novels, and then converts them.


Squirrels/light-novel-converter-api


Light Novel Converter Backend

A RESTful API that fetches the data of light novels and converts them to the EPUB format. Given the URL of a story, it retrieves the metadata and chapter list, downloads the chapters, packs them and converts the result to .epub.

It connects to a Headless Chrome instance running inside a Docker container, which makes scraping easier and requires less configuration.
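As a rough illustration of that connection, the sketch below attaches to an already running Chrome over WebSocket. It assumes the puppeteer-core package, which this README does not name, and the helper names scraperConnectOptions and withBrowser are hypothetical. The address comes from the SCRAPER_ADDRESS variable described under Initial Configuration.

```javascript
// Sketch only: puppeteer-core is an assumption, not a confirmed dependency of this project.

// Builds the options for attaching to a remote Chrome over WebSocket.
function scraperConnectOptions(env = process.env) {
  return { browserWSEndpoint: env.SCRAPER_ADDRESS || 'ws://localhost:3000' };
}

// Hypothetical helper: runs `task` against the remote browser, then detaches.
async function withBrowser(task, env = process.env) {
  const puppeteer = require('puppeteer-core'); // loaded lazily, only when scraping
  const browser = await puppeteer.connect(scraperConnectOptions(env));
  try {
    return await task(browser);
  } finally {
    // disconnect() detaches from the browser without closing the container's Chrome.
    await browser.disconnect();
  }
}
```

Attaching to a remote browser, rather than launching one locally, keeps Chrome and its system dependencies inside the container.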

Currently Supported Sites

  • NovelPlanet
    • Getting metadata
    • Getting chapter list
    • Downloading chapters
    • Combine all chapters into one file
    • Convert compilation of chapters into .mobi
    • Download result

Requirements

Initial Configuration

Create a .env file following the same format as .env.sample (you can just rename and edit it). It defines three variables:

MASTER_KEY=masterKey # The key required when creating users
JWT_SECRET=jwtSecret # The secret used to sign JWT tokens
SCRAPER_ADDRESS=ws://localhost:3000 # The address of the Docker container running Headless Chrome

Then install the dependencies:
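A missing or mistyped variable is easier to diagnose at startup than in the middle of a request. The following is a minimal sketch of such a check. The helpers missingEnvVars and loadConfig are hypothetical and not part of this repository; only the three variable names come from .env.sample.

```javascript
// Hypothetical startup check for the three variables listed in .env.sample.
const REQUIRED_VARS = ['MASTER_KEY', 'JWT_SECRET', 'SCRAPER_ADDRESS'];

// Returns the names of required variables that are undefined or empty.
function missingEnvVars(env) {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Validates the environment and returns a plain config object.
function loadConfig(env = process.env) {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // The scraper address is expected to be a WebSocket URL.
  const { protocol } = new URL(env.SCRAPER_ADDRESS);
  if (protocol !== 'ws:' && protocol !== 'wss:') {
    throw new Error(`SCRAPER_ADDRESS must start with ws:// or wss://, got ${protocol}`);
  }
  return {
    masterKey: env.MASTER_KEY,
    jwtSecret: env.JWT_SECRET,
    scraperAddress: env.SCRAPER_ADDRESS,
  };
}
```

Calling loadConfig() once before the server starts listening would surface configuration mistakes immediately.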

$ npm install

Running

First, you'll need to run MongoDB in another terminal instance.

$ mongod

Then, run the server in development mode.

$ npm run dev
Express server listening on http://0.0.0.0:9000, in development mode

Next, you'll need to create a user. Note that creating and authenticating users requires the master key defined in the .env file.

Create a user (sign up):

curl -X POST http://0.0.0.0:9000/users -i -d "email=test@example.com&password=123456&access_token=MASTER_KEY_HERE"

It will return something like:

HTTP/1.1 201 Created
...
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9",
  "user": {
    "id": "57d8160eabfa186c7887a8d3",
    "name": "test",
    "picture": "https://gravatar.com/avatar/55502f40dc8b7c769880b10874abc9d0?d=identicon",
    "email": "test@example.com",
    "createdAt":"2016-09-13T15:06:54.633Z"
  }
}
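The same sign-up call can be made from Node.js. This is a sketch under a few assumptions: Node 18 or later for the global fetch, the development address shown above, and the hypothetical helper names formPost and signUp, which are not part of this repository.

```javascript
// Sketch only: assumes the development server address printed by `npm run dev`.
const API_BASE = 'http://0.0.0.0:9000';

// Builds the same form-encoded POST request that `curl -d` sends.
function formPost(path, fields) {
  return {
    url: API_BASE + path,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams(fields).toString(),
    },
  };
}

// Hypothetical helper: creates a user and resolves to the returned JWT.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function signUp(email, password, masterKey, fetchImpl = globalThis.fetch) {
  const { url, init } = formPost('/users', { email, password, access_token: masterKey });
  const res = await fetchImpl(url, init);
  if (res.status !== 201) {
    throw new Error(`Sign-up failed with HTTP ${res.status}`);
  }
  const { token } = await res.json();
  return token;
}
```

URLSearchParams percent-encodes the field values, so characters such as @ in the email address are sent safely.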

Now you can use the eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9 token (real tokens are usually longer than this) to call user-protected APIs.

curl -X POST http://0.0.0.0:9000/stories -i -d "url=www.sampleurl.com/story1&access_token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"

It will take a few seconds (it's scraping the required story) and return something like:

HTTP/1.1 201 Created
...
{
    "id": "5bbfd286b769d9b9db71c360",
    "title": "Example Story",
    "user": {
        "id": "5bbfcd0f10d752b902355b4c",
        "name": "test",
        "picture": "https://gravatar.com/avatar/43b05f394d5611c54a1a9e8e20baee21?d=identicon",
        "email": "test@example.com",
        "createdAt": "2016-09-13T15:06:54.633Z"
    },
    "url": "www.sampleurl.com/story1",
    "metadata": [
        {
            "Genre": [
                "Genre1",
                "Genre2"
            ],
            "Date released": "2018",
            "Views": "100",
            "Author": [
                "Author1"
            ],
            "Status": [
                "Ongoing"
            ],
            "Translator": [
                "Translator1"
            ]
        }
    ],
    "chapters": [
        {
            "title": "Chapter 2",
            "url": "/chapters/2",
            "upload_date": "11/10/2018"
        },
        {
            "title": "Chapter 1",
            "url": "/chapters/1",
            "upload_date": "11/10/2018"
        }
    ]
}
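In the sample response above, the newest chapter comes first. Assuming the API always returns chapters in that order (the sample suggests it, but nothing here confirms it), a client could put them in reading order before packing them. The helpers inReadingOrder and summarize below are hypothetical and only illustrate how the response shape can be consumed.

```javascript
// Hypothetical helper: returns the chapters oldest-first.
// Assumes the response lists the newest chapter first, as in the sample above.
function inReadingOrder(story) {
  // Copy before reversing so the original response object is not mutated.
  return [...story.chapters].reverse();
}

// Hypothetical helper: builds a one-line description from the response.
function summarize(story) {
  const meta = (story.metadata && story.metadata[0]) || {};
  const authors = (meta.Author || []).join(', ') || 'unknown author';
  return `${story.title} by ${authors} (${story.chapters.length} chapters)`;
}

// Trimmed copy of the sample response shown above.
const sampleStory = {
  title: 'Example Story',
  metadata: [{ Author: ['Author1'], Status: ['Ongoing'] }],
  chapters: [
    { title: 'Chapter 2', url: '/chapters/2', upload_date: '11/10/2018' },
    { title: 'Chapter 1', url: '/chapters/1', upload_date: '11/10/2018' },
  ],
};

console.log(summarize(sampleStory));
console.log(inReadingOrder(sampleStory).map((chapter) => chapter.title));
```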

Available Commands

npm test # test using Jest
npm run coverage # test and open the coverage report in the browser
npm run lint # lint using ESLint
npm run dev # run the API in development mode
npm run prod # run the API in production mode
npm run docs # generate API docs

Thanks To

  • A RESTful API generated by generator-rest.
  • Machas, for constantly mentioning that this was useless.
