
LevelDB read performance degradation #273

Closed
madbence opened this issue Apr 11, 2016 · 15 comments

@madbence

I'm not sure whether my issue is related to leveldown, but right now I'm out of ideas. I'm experiencing heavy performance degradation after a huge number of writes and deletes. The db seems to be huge:

$ du -sh db/
3.3G    db/
$ ls -l db | wc -l
1633

But the db is totally empty, and it still takes 30s to finish scanning.

var db = require('levelup')('./db');
var s = Date.now();
var keys = 0;
db.createKeyStream()
  .on('data', function (key) { keys++; })
  .on('error', function (e) { console.log(e); })
  .on('end', function () { console.log(keys, Date.now() - s); });
// 0 30970

I've tried to run leveldown.repair, but the numbers are roughly the same:

$ du -sh db/
2.6G    db/
$ ls -l cache | wc -l
1331
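
For reference, the repair call looked roughly like this (just a sketch; it assumes leveldown's static repair(location, callback) helper and that the db is not open at the time):

// Sketch: run repair while the db is closed.
var leveldown = require('leveldown');

leveldown.repair('./db', function (err) {
  if (err) return console.error('repair failed:', err);
  console.log('repair finished');
});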

I've tried to inspect the db:

> console.log(db.getProperty('leveldb.stats'))
                               Compactions
Level  Files Size(MB) Time(sec) Read(MB) Write(MB)
--------------------------------------------------
  0        3        6         0        0         0
  1        5        9         0        0         0
  2       48       99         0        0         0
  3      485      998         0        0         0
  4      603     1219         0        0         0

Is this behavior something I should expect from LevelDB?

@juliangruber
Member

If it takes 30s to finish scanning, your db isn't totally empty. If you estimate the size in bytes of the data actually stored, what do you come up with?

@madbence
Author

The keys begin with numbers, so I guess this range should cover the whole domain of my keys:

> db.db.approximateSize('', 'z', function () { console.log(arguments); })
{ '0': null, '1': 2445270120 }

madbence changed the title from "LevelDB size issue" to "LevelDB performance & size issue" on Apr 11, 2016
@juliangruber
Member

That's 2.4GB, but keep in mind that .approximateSize estimates the actual file system size, not the size of the data you're storing.

@madbence
Author

How should I inspect the data then? db.createKeyStream finished without emitting any data events (that's why I said the db is empty)...

@juliangruber
Member

Ah, I misread your example. Hm, yeah, the db is big then, but my guess is that compaction will eventually get rid of the unused files.

You described performance, not file size, as your actual problem. Can you share some benchmark results?

@madbence
Author

I've created an example to demonstrate this behavior. The numbers are amplified; in reality it takes a few days for the db to grow, but the same behavior is observable with this example.

It writes 1M ~2KB records, then reads them back, deletes them, sleeps for a few seconds, reads the (now empty) db again, then goes back to step one.

'use strict';
const db = require('levelup')('./db');
const uuid = require('uuid');

const val = 'a'.repeat(2000);

// Write n records sequentially, each ~2KB, keyed by timestamp + uuid.
function write(n) {
  const s = Date.now();
  function put(m) {
    if (!m) {
      console.log('Appending %d keys took %dms', n, Date.now() - s);
      return Promise.resolve();
    }
    return new Promise((resolve, reject) => {
      db.put(Date.now() + '|' + uuid.v4(), val, err => {
        if (err) return reject(err);
        resolve();
      });
    }).then(() => {
      return put(m - 1);
    });
  }
  return put(n);
}

// Read every key, delete them all in one batch, then report the
// approximate on-disk size of the db.
function proc() {
  const s = Date.now();
  let count = 0;
  const keys = [];
  return new Promise((resolve, reject) => {
    db.createReadStream()
      .on('error', reject)
      .on('data', v => {
        keys.push(v.key);
        count++;
      })
      .on('end', () => {
        db.batch(keys.map(key => ({
          type: 'del',
          key: key,
        })), err => {
          if (err) return reject(err);
          db.db.approximateSize('', 'z', (err, size) => {
            if (err) return reject(err);
            console.log('Processing %d keys took %dms (db size after proc: %dMB)', count, Date.now() - s, size / 1024 / 1024);
            resolve();
          });
        });
      });
  });
}

function sleep(s) {
  return () => new Promise(resolve => setTimeout(resolve, s * 1000));
}

// write -> read & delete -> wait -> read the (now empty) db -> repeat
function loop() {
  return write(1000000).then(proc).then(sleep(10)).then(proc).then(loop);
}

loop().catch(err => console.error(err.stack));

Output:

Appending 1000000 keys took 42928ms
Processing 1000000 keys took 6933ms (db size after proc: 170.63547897338867MB)
Processing 0 keys took 1069ms (db size after proc: 170.63547897338867MB)
Appending 1000000 keys took 45045ms
Processing 1000000 keys took 8181ms (db size after proc: 362.9449768066406MB)
Processing 0 keys took 1965ms (db size after proc: 359.4268550872803MB)
Appending 1000000 keys took 43966ms
Processing 1000000 keys took 9278ms (db size after proc: 534.1607818603516MB)
Processing 0 keys took 3257ms (db size after proc: 532.405481338501MB)
Appending 1000000 keys took 44695ms
Processing 1000000 keys took 10480ms (db size after proc: 705.4141635894775MB)
Processing 0 keys took 4359ms (db size after proc: 705.4141635894775MB)
Appending 1000000 keys took 44828ms
Processing 1000000 keys took 11154ms (db size after proc: 897.7040395736694MB)
Processing 0 keys took 5130ms (db size after proc: 894.2279348373413MB)
Appending 1000000 keys took 49001ms
Processing 1000000 keys took 12562ms (db size after proc: 1068.9167070388794MB)
Processing 0 keys took 6484ms (db size after proc: 1067.1594734191895MB)
Appending 1000000 keys took 44204ms
Processing 1000000 keys took 13295ms (db size after proc: 1240.0974378585815MB)
Processing 0 keys took 7217ms (db size after proc: 1240.0974378585815MB)
Appending 1000000 keys took 45875ms
Processing 1000000 keys took 14359ms (db size after proc: 1432.3662147521973MB)
Processing 0 keys took 8395ms (db size after proc: 1430.6111516952515MB)
Appending 1000000 keys took 44548ms
Processing 1000000 keys took 15444ms (db size after proc: 1596.5478172302246MB)
Processing 0 keys took 9457ms (db size after proc: 1596.5478172302246MB)
Appending 1000000 keys took 45693ms
Processing 1000000 keys took 16630ms (db size after proc: 1789.492473602295MB)
Processing 0 keys took 10404ms (db size after proc: 1786.1179008483887MB)
Appending 1000000 keys took 44565ms
Processing 1000000 keys took 17631ms (db size after proc: 1957.572431564331MB)
Processing 0 keys took 11515ms (db size after proc: 1955.8863286972046MB)
Appending 1000000 keys took 46789ms
Processing 1000000 keys took 19591ms (db size after proc: 2132.453568458557MB)
Processing 0 keys took 13021ms (db size after proc: 2132.453568458557MB)
Appending 1000000 keys took 45233ms
Processing 1000000 keys took 20303ms (db size after proc: 2324.7067670822144MB)
Processing 0 keys took 13954ms (db size after proc: 2321.191372871399MB)
Appending 1000000 keys took 46269ms
Processing 1000000 keys took 20796ms (db size after proc: 2494.139804840088MB)
Processing 0 keys took 14489ms (db size after proc: 2492.381628036499MB)
Appending 1000000 keys took 45920ms
Processing 1000000 keys took 22416ms (db size after proc: 2667.079620361328MB)
Processing 0 keys took 15948ms (db size after proc: 2667.079620361328MB)
Appending 1000000 keys took 46993ms
Processing 1000000 keys took 22979ms (db size after proc: 2859.3231239318848MB)
Processing 0 keys took 16565ms (db size after proc: 2855.8166103363037MB)
Appending 1000000 keys took 44903ms
Processing 1000000 keys took 25757ms (db size after proc: 3030.5305461883545MB)
Processing 0 keys took 17527ms (db size after proc: 3028.7762775421143MB)

As you can see, even though the db is empty, it takes up more and more space (that alone wouldn't be an issue for me) and read performance gets really bad.

@madbence
Author

google/leveldb#164 and google/leveldb#83 might be related

madbence changed the title from "LevelDB performance & size issue" to "LevelDB read performance degradation" on Apr 11, 2016
@madbence
Author

ping @juliangruber, any ideas?

@juliangruber
Member

@dominictarr and @maxogden

@madbence
Author

Ping @dominictarr and @maxogden again, sorry for the disturbance... btw, the code above shows the same behavior with any record count (e.g. 1K records instead of 1M); the db still grows constantly.

@bwzhang2011

@madbence, did you test it with the latest version (1.19)?

@dominictarr
Contributor

Interesting. I guess this is happening because of compactions. If you really need to delete everything, what about literally deleting the entire database and starting a new one? Creating a database and then deleting everything in it seems like an architectural issue, to be honest.

Sorry I missed this before.
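
Something like this, just as a sketch (it assumes leveldown's static destroy(location, callback) helper and that the db gets closed first):

// Sketch: drop the whole database and start over with a fresh one.
var leveldown = require('leveldown');

db.close(function (err) {
  if (err) return console.error(err);
  leveldown.destroy('./db', function (err) {
    if (err) return console.error(err);
    // re-open a fresh, empty database at the same location
    db = require('levelup')('./db');
  });
});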

@madbence
Author

I ran the "benchmark" again about two weeks ago, and it seems the problem has been fixed. I'm going to check again with an older LevelDB version and then close this issue.

@dominictarr I'm using LevelDB for log rotation (upload logs to an external service, then delete them from the db after a certain amount of time has elapsed or a certain number of rows has accumulated). The log is rotated per service rather than globally, so the whole db is never deleted at once; a rough sketch of the idea is below. I admit LevelDB might be overkill for this (I could do the same with a simple file per service).
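
Roughly what the per-service rotation looks like (a simplified sketch; rotateService and the key layout are illustrative, not my actual code):

// Keys are prefixed with the service name, so one service's log can be
// dropped as a key range without touching the rest of the db.
function rotateService(db, service, done) {
  const ops = [];
  db.createKeyStream({ gte: service + '|', lt: service + '|\xff' })
    .on('error', done)
    .on('data', key => ops.push({ type: 'del', key: key }))
    .on('end', () => db.batch(ops, done));
}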

@lilyannehall

@madbence you might consider manually triggering compaction when you delete a significant number of entries, and see if performance improves at all. This was added in the last release:

https://github.com/Level/leveldown#leveldown_compactRange
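
In your repro that would look roughly like this (a sketch; it assumes the underlying leveldown instance is reachable via db.db, the same way you already call approximateSize):

// Sketch: after the batch delete, compact the whole key range so the
// tombstones and dead table files get cleaned up.
db.db.compactRange('', 'z', function (err) {
  if (err) return console.error('compaction failed:', err);
  db.db.approximateSize('', 'z', function (err, size) {
    if (err) return console.error(err);
    console.log('db size after compaction: %dMB', size / 1024 / 1024);
  });
});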

@madbence
Author

@bookchin thanks, I didn't know about this method.

FYI, the issue seems to be fixed; we've had no problems with db size in the past months.
