Support High Cardinality Tags and Series #7151
Comments
Proposal added for TSI (Time-Series Index) file format: #7174 |
Problem statement/requirements docs: #7151 |
We are getting hit by this pretty hard and I am wondering if there is any way we can prevent influx from consuming all RAM and then getting killed. Is there some setting we can tweak to help with this? I'd be happy with lower performance if it meant that the service stayed up. |
For reference: |
@sorrison 1.1 has a number of memory improvements related to queries, but high memory usage in queries or writes is usually due to schema design issues. Two common problems are querying across too many shards (e.g. shard duration is too low) as well as writing high cardinality tag values and querying too many series at once. There are a few limits you can enable to prevent high cardinality data from being written or being queried. In 1.0, there is
In 1.1, there is a
For queries, there are a few others:
If you are having performance issues, please log a new issue using the instructions for a bug report. In order to help, we need all the information requested in the instructions. |
Thanks @jwilder. I am currently developing a driver for Gnocchi (part of OpenStack) https://github.com/openstack/gnocchi and am dealing with a large amount of data. Basically I have lots of metrics going into influx. Originally I put each metric into its own measurement, but I wanted to do 3 levels of downsampling, so I didn't want to have 3 continuous queries per measurement (we have on the order of 100,000s of metrics). I thought having more tag values would be better than having more continuous queries? Sorry for putting this all in this bug. Is there a better place to discuss these kinds of things? IRC? I just installed the 1.1 RC and it is working well so far, although at the moment it takes about a week for it to die and need restarting. (We are running on a host with 24 cores and 96G RAM) |
@sorrison I tried doing something similar earlier this year with influx. In the end I grouped related metrics together into separate measurements. I also moved away from continuous queries and now build the downsampled data at the same time as I write the raw data, which seems to work really well. I am still looking forward to the tag index being cached to disk, though, as at the moment I am storing the data across three separate InfluxDB instances. |
@ivanscattergood Thanks in advance. |
Hi, I use a java client to collect the data and I aggregate it within that code. I save one summary of data every minute and then a summary every hour. Currently this allows me to visualise 7 million unique series from 3 months down to 1 minute. We are expecting to treble the amount of data we visualise over the next 3 months. Ivan |
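The client-side aggregation described in this comment might look roughly like the following. This is only a sketch under stated assumptions, not the commenter's actual code: the thread mentions a Java client, while this illustration uses Python, and the names `Summary` and `ClientSideDownsampler` are hypothetical. It keeps one running summary per series per minute and per hour in client memory, so nothing has to be re-queried from the database to build the downsampled values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Summary:
    """Running summary for one series in one time bucket (hypothetical type)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self) -> float:
        return self.total / self.count


class ClientSideDownsampler:
    """Keeps per-minute and per-hour summaries in client memory (hypothetical helper)."""

    def __init__(self, bucket_seconds=(60, 3600)):
        self.bucket_seconds = bucket_seconds
        # {bucket width: {(series key, bucket start): Summary}}
        self.buckets = {w: defaultdict(Summary) for w in bucket_seconds}

    def add(self, series_key: str, timestamp: int, value: float) -> None:
        """Fold one raw point (Unix seconds) into every bucket width."""
        for width in self.bucket_seconds:
            start = timestamp - (timestamp % width)
            self.buckets[width][(series_key, start)].add(value)

    def flush(self, width: int, now: int):
        """Return and drop the summaries whose bucket closed at or before `now`."""
        closed = {k: s for k, s in self.buckets[width].items() if k[1] + width <= now}
        for key in closed:
            del self.buckets[width][key]
        return closed
```

A caller would write each flushed summary to the database as one downsampled point; how that write is performed depends on the client library in use and is left out here.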
Hi Ivan, Thanks in advance. |
Hi, I cache the data in the Java client rather than re-querying it. I was using an earlier version of InfluxDB (version 0.9) when I made that change, and I did it to work around the DB crashing. |
I see, so no queries to retrieve the data. Thanks. |
Yes still using influxdb |
This appears to be a problem for things such as Heapster (kubernetes-retired/heapster#605) & Kubernetes (kubernetes/kubernetes#27630) metrics, which appear to use a lot of tags. Based on the pod memory usage pattern for InfluxDB when running in a Kubernetes cluster with Heapster populating data into InfluxDB, memory use appears to grow with the amount of activity in the cluster. (More metrics are stored, and ephemeral pods being started and stopped create more tags, so memory use keeps growing until it hits the OOM limit.) At this point Kubernetes shows: |
@trinitronx that's one of the key use cases this is designed to support |
Do you know when this will be available in nightly builds? |
@ivanscattergood there's been significant work on this so hopefully soon. No set date though. |
This feature would really help with handling clickstream data :) |
Storage and query level support is available in nightly and will be present for opt-in in 1.3.0. There is additional work required to support I'm removing this issue from the 1.3.0 milestone and leaving it open for 1.4 / future work where we will finish up the remaining bits and enable TSI by default. More information on the current state is available on the blog: https://www.influxdata.com/path-1-billion-time-series-influxdb-high-cardinality-indexing-ready-testing/ |
TSI shipped in 1.5. It is not currently enabled by default. |
Feature Request
The database should be able to support higher levels of cardinality for tags and series. Currently, the full tag set is loaded into an in-memory index for fast query planning. When tags with a large number of values are written, the in-memory index can consume more memory than is available on the host.
Proposal:
The database should not require loading the full tag set into an in-memory index. Higher cardinality series and tags should be able to be stored and queried and not be limited by the amount of RAM on the host.
Current behavior:
Currently, high cardinality data causes the process memory usage to grow quickly, increasing the chances of an OOM. It also slows startup times, as the index needs to scan all the stored data to re-create the in-memory index.
Users also frequently write high cardinality tag data by mistake, causing the server to crash. When the server is in this state, removing the problem data is also very difficult.
Desired behavior:
Storing high-cardinality data should not cause the process to OOM or adversely affect startup times. Query performance should not be adversely affected by higher cardinality data either.
Use case:
It is sometimes more natural and convenient to be able to store higher cardinality data. For example, some tag data is ephemeral in nature (Docker container IDs), but can contribute to high cardinality data issues over time.
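To illustrate how an ephemeral tag inflates series cardinality, here is a minimal sketch. It assumes a series is identified by its measurement name plus its full tag set; the helper names `series_key` and `count_series`, the tag names, and the workload (3 hosts, one new container per host per day for 30 days) are all hypothetical and chosen only for illustration.

```python
def series_key(measurement: str, tags: dict) -> str:
    """Build a key from the measurement and its tag pairs, sorted by tag name."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{measurement},{tag_part}" if tag_part else measurement


def count_series(points) -> int:
    """Count distinct series among (measurement, tags) pairs."""
    return len({series_key(m, t) for m, t in points})


# Hypothetical workload: one point per host per day, tagged by host only.
stable = [("cpu", {"host": f"host{h}"}) for h in range(3) for _ in range(30)]

# Same workload, but each host gets a fresh container ID every day.
churning = [
    ("cpu", {"host": f"host{h}", "container_id": f"c{h}-{d}"})
    for h in range(3)
    for d in range(30)
]

print(count_series(stable))    # 3 series, regardless of how many days pass
print(count_series(churning))  # 90 series, and growing with every restart
```

With the host tag alone the series count stays flat, while the container ID tag adds a new series on every restart, so an index that holds every series in memory keeps growing even though the number of live containers does not.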