enhancement(loki sink): Partition HTTP requests to loki by stream & re-enable concurrency #8615
Conversation
// Since all of the events are in the same partition and there is concurrency,
// then if ordering inside partitions isn't upheld, events with large lines will
// take longer to flush than events with small lines, so Loki will receive the
// small ones before the large ones, hence out-of-order events.
test_out_of_order_events(OutOfOrderAction::Drop, batch_size, events.clone(), events).await;
Hmm, this test is fundamentally racy, isn't it? Is there any way to provoke out of order events without racing?
It is.

> Is there any way to provoke out of order events without racing

Not really. There are two other ways to provoke them:
- Fail requests and force them to retry.
  - This isn't quite testing what it needs to, since the retry logic can correct the ordering.
  - A mock Loki server which we can configure to fail requests would be needed.
- Stall the start of the server so that requests pile up.
  - This is yet again racy.

What we can do is increase the number of events, which raises the chance of an out-of-order event happening; see the sketch below.
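A minimal sketch of that idea (not the PR's actual test code; helper names here are illustrative): alternating very large and very small lines makes it more likely that, with concurrent in-flight requests, a small-line batch finishes before an earlier large-line one.

```rust
// Sketch only: build a mixed set of lines so that, under concurrency, batches of
// small lines tend to flush faster than batches of large lines, making an
// out-of-order arrival at Loki more likely the more events we generate.
fn build_mixed_size_lines(count: usize) -> Vec<String> {
    (0..count)
        .map(|i| {
            if i % 2 == 0 {
                // "Large" line: the padding makes this request body slower to send.
                format!("event {:05} {}", i, "x".repeat(64 * 1024))
            } else {
                // "Small" line.
                format!("event {:05}", i)
            }
        })
        .collect()
}
```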
@ktff It'd be nice to merge this by Monday if we can to get it into the next release. I see it approved but just want to make sure you didn't want to make any other tweaks. Nice work on re-enabling concurrency here; it'll make a lot of people happy.
@ktff Hi, partitioning of requests by stream is not suitable for the aggregator role, because an aggregator sees a large number of label combinations, which eventually makes sending to Loki slow. In my case, events end up stacked in the buffer.
@wgb1990 By buffer I assume you mean the batch. Events shouldn't be in a batch for more than
If, on the other hand, you mean that throughput is low because you have a large number of streams but only a few events per stream per second, then yes, the older solution would be better for that use case. That is doable, so in that case please open a feature issue describing your use case. I imagine it could be done with a simple enum option or something more dynamic; a rough illustration follows.
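Purely as a hypothetical sketch of what such an option could look like (this type is not part of this PR or of Vector's config), a config enum could let high-cardinality aggregator deployments fall back to coarser partitioning:

```rust
use serde::Deserialize;

/// Hypothetical illustration only: a knob of this shape could select the
/// partitioning granularity for Loki requests. Requires serde's `derive` feature.
#[derive(Clone, Copy, Debug, Deserialize)]
#[serde(rename_all = "snake_case")]
enum RequestPartitioning {
    /// One request queue per tenant (the previous behaviour).
    Tenant,
    /// One request queue per (tenant, label set) pair (this PR's behaviour).
    Stream,
}
```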
Closes #6041
Closes #6932
This PR adds partitioning of requests by stream, on top of the existing partitioning by tenant. To enable this, the inner struct `PartitionBatchSink` is also extended with the ability to order requests per partition. This ordering guarantee could perhaps be useful as a configurable option on some other sinks. As a side effect, in order to test this, concurrency in the sink is also re-enabled.
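A rough sketch of the idea (field and type names here are illustrative, not the PR's exact code): the batch partition key is derived from the tenant and the full label set, so each Loki stream gets its own batches and its own ordered queue of in-flight requests.

```rust
use std::collections::BTreeMap;

// Sketch only: events hashing to the same PartitionKey are batched together and
// their requests are kept in order, while different keys can be sent concurrently.
#[derive(Clone, PartialEq, Eq, Hash)]
struct PartitionKey {
    tenant_id: Option<String>,
    // BTreeMap keeps labels sorted, so identical label sets map to the same partition.
    labels: BTreeMap<String, String>,
}
```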