InfluxDB memory leakage #605
This issue is to gather information about the memory leakage in InfluxDB and check if there are some immediate steps that can be taken to reduce its size.

@vishh Can you provide more info on how big the leak is?

Comments
The recommendation is an 8GB memory limit for InfluxDB. I usually see InfluxDB …

I think the underlying problem is not a leak. If we want good performance …

One more thing to consider is the amount of data being written to InfluxDB.

On Thu, Sep 24, 2015 at 9:57 AM, Marcin Wielgus notifications@github.com wrote:
> Personally I would double-check whether we are not writing too much data to InfluxDB, or doing it in a suboptimal way. From a brief & tired look at how the InfluxDB sink is implemented, I would suspect that we are writing samples with the default 5 sec resolution. What about changing it to 30 sec (not perfect, but better than switching off InfluxDB)? Another idea is to play with how batches are constructed and with the sink frequency - if the underlying storage is key-value like, it might be beneficial to write data less often than every 10s, so that more data is written under a single key in one batch and the key lookups/disk seeks happen less often. In InfluxDB 0.8 there seem to be pluggable backends which differ deeply in terms of write performance: https://influxdb.com/blog/2014/06/20/leveldb_vs_rocksdb_vs_hyperleveldb_vs_lmdb_performance.html
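A minimal sketch of that batching idea in Go (all names here are hypothetical, not heapster's actual sink API): buffer incoming samples and flush them on a fixed interval, so each write to the store carries one large batch instead of many individual points.

```go
package main

import (
	"log"
	"sync"
	"time"
)

// Point is a simplified metric sample (hypothetical; a real sink
// would use the InfluxDB client's series types).
type Point struct {
	Series string
	Value  float64
	Time   time.Time
}

// BatchingSink buffers points and flushes them at a fixed interval,
// so many samples land in a single write per series.
type BatchingSink struct {
	mu    sync.Mutex
	buf   []Point
	flush func([]Point) error // e.g. a call into the InfluxDB client
}

func NewBatchingSink(interval time.Duration, flush func([]Point) error) *BatchingSink {
	s := &BatchingSink{flush: flush}
	go func() {
		for range time.Tick(interval) {
			s.Flush()
		}
	}()
	return s
}

// Add records a sample without touching the network.
func (s *BatchingSink) Add(p Point) {
	s.mu.Lock()
	s.buf = append(s.buf, p)
	s.mu.Unlock()
}

// Flush writes everything buffered so far as one batch.
func (s *BatchingSink) Flush() {
	s.mu.Lock()
	batch := s.buf
	s.buf = nil
	s.mu.Unlock()
	if len(batch) == 0 {
		return
	}
	if err := s.flush(batch); err != nil {
		log.Printf("flush of %d points failed: %v", len(batch), err)
	}
}

func main() {
	sink := NewBatchingSink(30*time.Second, func(ps []Point) error {
		log.Printf("writing %d points in one batch", len(ps))
		return nil // a real sink would call the InfluxDB write API here
	})
	sink.Add(Point{Series: "cpu/usage", Value: 0.42, Time: time.Now()})
	time.Sleep(31 * time.Second) // keep the demo alive for one flush
}
```

Writing every 30 seconds rather than every 5 trades some data freshness for roughly 6x fewer write operations against the key-value backend.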
We can increase the sink duration. It is currently set to 10 seconds by default. …

InfluxDB v0.8 is now being deprecated. So I don't see value in …

A meta comment that I want to re-state is that I don't see value in making …

As of now, we do not endorse InfluxDB as the recommended backend. Are you …

On Thu, Sep 24, 2015 at 4:50 PM, Marcin Wielgus notifications@github.com wrote:
> I personally don't care whether we use InfluxDB or some other time series database like GCM. I just want to have some kind of permanent storage (cloud-provider permitting) for metrics. And I'm convinced that we should not write another one, as there are lots of other, more important challenges in K8s. If InfluxDB is not the best choice for big customers - let's change it in 1.2 and, for now, provide an extra flag in kube-up to run heapster without any storage. And if there are performance issues with InfluxDB 0.8 and we cannot upgrade to 0.9 (for K8s 1.1), let's try some simple workarounds first (increasing the resolution to 30 sec will decrease the load 6x), so it is usable for 80-90% of customers with small/moderate clusters before dropping it completely.
After a discussion with @piosz we agreed that 30 sec resolution would be even more handy for him. We also took a look at the metrics that are exported - there are 16, but InitialResources needs only 6 of them (network may also be handy, so let's count 8). So if we added a flag to control which metrics are propagated to InfluxDB, we could reduce the load even more. Assuming the K8s 1.0 scalability target of 100 nodes with 30 pods each, we have: (100 nodes * 30 pods/node * 8 metrics/pod) / 30 sec resolution = 800 data points per second.
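A rough sketch of that whitelist idea (the flag name and the default metric list are made up for illustration; heapster's real flags differ), including the back-of-the-envelope load calculation from the comment above:

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// storedMetrics is a hypothetical flag listing the metrics the sink
// should propagate to InfluxDB; everything else is dropped.
var storedMetrics = flag.String("stored_metrics",
	"cpu/usage,cpu/limit,memory/usage,memory/working_set,"+
		"memory/limit,network/rx,network/tx,uptime",
	"comma-separated metric names to write to InfluxDB")

// allowed turns the flag value into a lookup set.
func allowed() map[string]bool {
	set := map[string]bool{}
	for _, m := range strings.Split(*storedMetrics, ",") {
		set[strings.TrimSpace(m)] = true
	}
	return set
}

func main() {
	flag.Parse()
	keep := allowed()

	// Back-of-the-envelope load from the comment above:
	// 100 nodes * 30 pods/node * 8 metrics/pod, written once per 30 s.
	points := 100 * 30 * len(keep)
	fmt.Printf("%d points per cycle, %.0f points/sec at 30s resolution\n",
		points, float64(points)/30)

	// A sink would consult the set for every incoming sample:
	for _, name := range []string{"cpu/usage", "filesystem/usage"} {
		fmt.Printf("metric %q kept: %v\n", name, keep[name])
	}
}
```

With the 8 metrics above this prints 24000 points per cycle, i.e. 800 points/sec, matching the estimate; trimming the whitelist to the 6 metrics InitialResources needs would cut the load proportionally.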
@mwielgus: We store all 16 of those metrics today in InfluxDB, and possibly more soon (disk io, load stats, tcp stats, etc.). In any case, I think we are on the same page when it comes to the level of support for InfluxDB. Ensuring that the default setup doesn't overwhelm InfluxDB makes total sense 👍

@piosz: Thanks for posting the PR!
closing in favor of kubernetes/kubernetes#27630