node: The number of files in Chain_xxx folder keeps growing rapidly when the plugin RpcNep5Tracker was installed. #419

nicolegys · 2019-07-30T04:10:57Z

When I restart neo-cli, the chain folder shrinks to about 12 GB and the number of files drops to about 7k. For example:

But after that, the number and size of the files keep growing rapidly.
Here is a screenshot taken about 24 hours after my last restart.

As time goes on, it keeps growing and growing.
It seems that as long as the disk is large enough, the number of files will continue to increase.

Here is a screenshot of the LOG file; there seem to be no deletions after compaction.

Then we found that when the RpcNep5Tracker plugin is not installed, this issue does not appear.
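
A minimal sketch for watching this from the shell, in case it helps others reproduce the numbers. The function name `chain_stats` is hypothetical, and the `Delete type=` pattern is an assumption about how LevelDB's LOG records the removal of obsolete files; adjust it to match what your LOG actually contains.

```shell
#!/bin/bash
# Hypothetical helper: summarize a LevelDB folder such as Chain_xxx.
# Prints the total file count, the number of .ldb table files, and the
# number of LOG lines that look like obsolete-file deletions.
chain_stats() {
    local dir="$1"
    local files ldb deletes
    # Count regular files directly inside the folder.
    files=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    # Count only the .ldb table files.
    ldb=$(find "$dir" -maxdepth 1 -type f -name '*.ldb' | wc -l | tr -d ' ')
    # ASSUMPTION: deletions show up in LOG as lines containing "Delete type=".
    deletes=$(grep -c 'Delete type=' "$dir/LOG" 2>/dev/null || true)
    echo "files=$files ldb=$ldb log_deletes=${deletes:-0}"
}

# Usage (path is a placeholder):
# chain_stats /path/to/Chain_xxx
```

Running it right after a restart and again some hours later should show whether the `.ldb` count climbs while `log_deletes` stays flat.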

shargon · 2019-07-30T07:56:07Z

This is expected: we index more information, and it is stored there.

nicolegys · 2019-07-30T08:25:06Z

> This is expected: we index more information, and it is stored there.

But the size keeps growing; someone saw 158 GB today.

The disk will fill up someday because its size is limited. For example, my disk is only 100 GB.
I think we should fix this issue; otherwise we will have to keep restarting indefinitely. TwT

Qiao-Jin · 2019-07-30T09:40:56Z

> This is expected: we index more information, and it is stored there.

Based on observation and experiments, the direct cause is probably that unreleased snapshots prevent leveldb compaction from deleting outdated ldb files.

After looking into the leveldb log file on the specific problematic client, I observed that compactions failed to delete any outdated ldb files. I guessed that something, such as snapshots, was preventing file deletion. My experiments and their results are as follows:

I built a local leveldb environment and kept inserting notes while creating snapshots without releasing them. During this period I tried compaction, but it failed to delete outdated files, as I expected. The number of ldb files kept rising even when I was only inserting duplicate notes.
The specific problem in this issue never occurs after removing ALL usage of func leveldb_create_snapshot in the code.

So the cause of this issue is most likely a conflict between snapshots and compaction.

I also observed that there are multiple places in the code where snapshots are created but never released. We are testing to see whether the problem recurs after correcting them.

erikzhang · 2019-07-31T08:35:09Z

@Qiao-Jin So is it a plugin bug?

Qiao-Jin · 2019-07-31T08:42:42Z

> @Qiao-Jin So is it a plugin bug?

The problem might be indirectly caused by something in the plugin, such as exceptions, but the direct cause should be that some db snapshots failed to be released. I'm looking for such snapshots in the code.

superboyiii · 2019-07-31T08:46:16Z

> @Qiao-Jin So is it a plugin bug?

For now, RpcNep5Tracker exposes this bug. neo-cli works well without RpcNep5Tracker.

erikzhang · 2019-07-31T10:45:43Z

So you believe the bug is in RpcNep5Tracker? I simply checked the RpcNep5Tracker code and found no problems.

superboyiii · 2019-07-31T10:53:43Z

> So you believe the bug is in RpcNep5Tracker? I simply checked the RpcNep5Tracker code and found no problems.

Yes, although there seems to be no obvious relationship between this plugin and this issue. But I have run tests many times over three days. You could try syncing two neo-cli instances on two different servers, one with RpcNep5Tracker and one without, syncing to the latest height and then waiting four or five hours. You will find completely different results for available disk space.
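
The comparison suggested above can be made less manual by sampling each node's chain folder periodically. A rough sketch; `sample_chain_size` and the log format are my own choices for illustration, not anything neo-cli provides.

```shell
#!/bin/bash
# Hypothetical helper: append one sample line to a log file in the form
#   <epoch_seconds> <size_in_KB> <file_count>
# Keep the log file outside the sampled folder so it does not count itself.
sample_chain_size() {
    local dir="$1" out="$2"
    local size_kb count
    # Total size of the folder in kilobytes.
    size_kb=$(du -sk "$dir" | awk '{ print $1 }')
    # Number of regular files in the folder.
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    echo "$(date +%s) $size_kb $count" >> "$out"
}

# Usage (paths are placeholders), e.g. once an hour on both servers:
# sample_chain_size /path/to/Chain_xxx /tmp/chain_size.log
```

Comparing the two resulting logs after a few hours should make any difference between the node with the plugin and the node without it easy to see.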

Qiao-Jin · 2019-08-02T02:13:45Z

We ran the version with ALL usage of func leveldb_create_snapshot removed from the code for a whole day, and this problem never occurred.

HayesData · 2019-08-03T09:36:48Z

FWIW I'm seeing the exact same issue occur on our node pool as well.

This is a horribly hacky workaround, but we need the RpcNep5Tracker plugin on our nodes.

Until somebody resolves the issue, this is the (again, admittedly horribly hacky) way I've worked around the problem.

It is simply fired by a cron job every 15 minutes.

#!/bin/bash

# Restart neo-cli when the root partition is at least THRESHOLD percent full.
THRESHOLD=90
# Column 6 of `df -hT` is Use%; the sed call strips the trailing "%".
PERCENT_USED=$(df -hT / | grep / | awk '{ print $6 }' | sed 's/.$//')

if (( PERCENT_USED >= THRESHOLD )); then
        echo "$(/bin/date) - TIME TO RESTART NEO, PRIMARY PARTITION ${PERCENT_USED}% FULL"
        /usr/sbin/service neo stop
        /bin/sleep 1
        /usr/sbin/service neo start
        echo "$(/bin/date) - NEO HAS BEEN RESTARTED"
fi

Note: I should also mention that this syntax assumes the neo-cli rpcserver is maintained as a systemd daemon.
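
A slightly more testable variant of the same idea, as a sketch only. It assumes `df -P` (the POSIX output format, where the fifth column is the capacity percentage); the function names and the script path in the crontab comment are hypothetical.

```shell
#!/bin/bash
# Read `df -P` output on stdin and print the capacity of the first
# filesystem line as a bare number (the trailing "%" is removed).
parse_percent_used() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is at or above the threshold.
needs_restart() {
    local used="$1" threshold="$2"
    [ "$used" -ge "$threshold" ]
}

# Example wiring, commented out so the block has no side effects.
# The service name `neo` is the one used in the script above.
# used=$(df -P / | parse_percent_used)
# if needs_restart "$used" 90; then
#     /usr/sbin/service neo stop
#     /bin/sleep 1
#     /usr/sbin/service neo start
# fi
#
# Hypothetical crontab entry to run such a script every 15 minutes:
# */15 * * * * /path/to/check_neo_disk.sh
```

Splitting the parsing from the decision means both pieces can be checked with canned input before they are trusted to restart a live node.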

Qiao-Jin · 2019-08-09T02:34:05Z

Some similar bugs reported by leveldb users:
Level/leveldown#273
google/leveldb#164

shargon · 2023-12-05T13:42:46Z

This is old; if the problem remains, please re-open.

superboyiii mentioned this issue Jul 30, 2019

High cost of CPU on neo-cli might cause be killed by linux system neo-project/neo#954

Closed

nicolegys mentioned this issue Dec 5, 2019

Use leveldb v1.22 for neo-cli release. #472

Closed

shargon changed the title ~~The number of files in Chain_xxx folder keeps growing rapidly when the plugin RpcNep5Tracker was installed.~~ node: The number of files in Chain_xxx folder keeps growing rapidly when the plugin RpcNep5Tracker was installed. Dec 5, 2023

shargon added the node label Dec 5, 2023

shargon closed this as completed Dec 5, 2023
