Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new sink): Add possibility to use nats jetstream in nats sink #20834

Merged

Conversation

whatcouldbepizza
Copy link
Contributor

@whatcouldbepizza whatcouldbepizza commented Jul 10, 2024

Add new option jetstream to nats sink configuration that allows using jetstream to sink into nats

@whatcouldbepizza whatcouldbepizza requested a review from a team as a code owner July 10, 2024 09:46
@bits-bot
Copy link

bits-bot commented Jul 10, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the domain: sinks Anything related to the Vector's sinks label Jul 10, 2024
@whatcouldbepizza whatcouldbepizza changed the title nats jetstream option feat(new sink): Add possibility to use nats jetstream in nats sink Jul 10, 2024
Copy link
Contributor

@Jarema Jarema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from NATS maintainer perspective.
We can also make async JetStream publishing in the future possible here.

src/sinks/nats/config.rs Outdated Show resolved Hide resolved
@jszwedko
Copy link
Member

Thanks @whatcouldbepizza ! We'll review this shortly.

Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks. Just a couple of minor nits with the code.

I'm having difficulty testing the code, possibly just down to my misunderstanding.

Running NATS in Docker:

docker run -p 4222:4222 -p 8222:8222 -p 6222:6222 --name nats-server -ti nats:latest

I'm pulling messages from NATS with one instance of Vector configured with:

sources:
  nats:
    type: nats
    url: "nats://localhost:4222"
    subject: nork
    connection_name: zork

sinks:
  console:
    type: console
    inputs:
      - nats
    target: stdout
    encoding:
      codec: json

And I'm pushing messages to NATS with another instance of Vector running this PR branch.

sources:
  in:
    type: stdin

sinks:
  nats:
    type: nats
    inputs:
      - in
    subject: nork
    url: "nats://localhost:4222"
    jetstream: true
    encoding:
      codec: json

I type a message into stdin for the NATS sink Vector instance and I get the following:

zork
2024-07-11T12:40:26.908904Z  WARN sink{component_kind="sink" component_id=nats component_type=nats}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=NATS Server Error: NATS Publish Error: timed out: didn't receive ack in time internal_log_rate_limit=true
2024-07-11T12:40:32.642235Z  WARN sink{component_kind="sink" component_id=nats component_type=nats}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
2024-07-11T12:40:38.539561Z  WARN sink{component_kind="sink" component_id=nats component_type=nats}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] has been suppressed 1 times.
2024-07-11T12:40:38.539593Z  WARN sink{component_kind="sink" component_id=nats component_type=nats}:request{request_id=1}: vector::sinks::util::retries: Retrying after error. error=NATS Server Error: NATS Publish Error: timed out: didn't receive ack in time internal_log_rate_limit=true
2024-07-11T12:40:44.042736Z  WARN sink{component_kind="sink" component_id=nats component_type=nats}:request{request_id=1}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.

The Vector instance running the NATS source receives this message repeatedly since the NATS sink keeps retrying to send:

{"message":"{\"host\":\"stephenwakely\",\"message\":\"zork\",\"source_type\":\"stdin\",\"timestamp\":\"2024-07-11T12:40:21.905962302Z\"}","source_type":"nats","subject":"nork","timestamp":"2024-07-11T12:40:21.908185811Z"}
{"message":"{\"host\":\"stephenwakely\",\"message\":\"zork\",\"source_type\":\"stdin\",\"timestamp\":\"2024-07-11T12:40:21.905962302Z\"}","source_type":"nats","subject":"nork","timestamp":"2024-07-11T12:40:27.641924291Z"}
{"message":"{\"host\":\"stephenwakely\",\"message\":\"zork\",\"source_type\":\"stdin\",\"timestamp\":\"2024-07-11T12:40:21.905962302Z\"}","source_type":"nats","subject":"nork","timestamp":"2024-07-11T12:40:33.539870420Z"}
{"message":"{\"host\":\"stephenwakely\",\"message\":\"zork\",\"source_type\":\"stdin\",\"timestamp\":\"2024-07-11T12:40:21.905962302Z\"}","source_type":"nats","subject":"nork","timestamp":"2024-07-11T12:40:39.042825638Z"}
{"message":"{\"host\":\"stephenwakely\",\"message\":\"zork\",\"source_type\":\"stdin\",\"timestamp\":\"2024-07-11T12:40:21.905962302Z\"}","source_type":"nats","subject":"nork","timestamp":"2024-07-11T12:40:44.557572405Z"}

Do you know where I am going wrong? Or is there an error in this code somewhere?

src/sinks/nats/config.rs Outdated Show resolved Hide resolved
src/sinks/nats/config.rs Outdated Show resolved Hide resolved
@whatcouldbepizza whatcouldbepizza requested review from a team as code owners July 11, 2024 15:03
@github-actions github-actions bot added the domain: external docs Anything related to Vector's external, public documentation label Jul 11, 2024
@whatcouldbepizza
Copy link
Contributor Author

This may be code problem, I'll test it and come back a bit later

Copy link

@aliciascott aliciascott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved for docs

Copy link

@aliciascott aliciascott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @whatcouldbepizza I approved this but suggested to remove the """" from the description strings :)

@@ -383,6 +383,15 @@ base: components: sinks: nats: configuration: {
}
}
}
jetstream: {
description: """

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: """
description:

actually these should be removed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file has been generated from the doc comments added in the code, so it won't be possible to remove these quotes. They are a part of cue and don't make it all the way through to the final docs on the vector.dev website..

Send messages using [Jetstream][jetstream].

[jetstream]: https://docs.nats.io/nats-concepts/jetstream
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""

"""
required: false
type: bool: default: false
}
request: {
description: """

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: """
description:

@whatcouldbepizza
Copy link
Contributor Author

whatcouldbepizza commented Jul 12, 2024

@StephenWakely unfortunately, I can't reproduce the problem, for me sinking works just fine both with and without jetstream option. I'll try to dive into NATS, maybe it should be configured some way to ack async publish

I can see that jetstream should be configured for nats, did you enable it?

@whatcouldbepizza
Copy link
Contributor Author

Hi @aliciascott
As @StephenWakely had mentioned I did not change this file manually, it was generated by doc script. I can make suggested changes, but after next doc generating it will be overwritten, I think it will be pain to support :)

@StephenWakely
Copy link
Contributor

@StephenWakely unfortunately, I can't reproduce the problem, for me sinking works just fine both with and without jetstream option. I'll try to dive into NATS, maybe it should be configured some way to ack async publish

I can see that jetstream should be configured for nats, did you enable it?

Curious. So I hadn't enabled jetstream. So I try again, this time with it enabled:

〉docker run --rm -p 4222:4222 -p 8222:8222 -p 6222:6222 --name nats-server -ti nats:latest --jetstream                                                         nats-jetstream-option
[1] 2024/07/12 10:30:51.558624 [INF] Starting nats-server
[1] 2024/07/12 10:30:51.558695 [INF]   Version:  2.10.17
[1] 2024/07/12 10:30:51.558697 [INF]   Git:      [b91de03]
[1] 2024/07/12 10:30:51.558698 [INF]   Name:     NDUJEGWDTBUA43AXYO7FQELMWSVNUCGBOP4A7NXVD2R6Z7FZ7XIEIQSA
[1] 2024/07/12 10:30:51.558702 [INF]   Node:     Y0rIweGa
[1] 2024/07/12 10:30:51.558703 [INF]   ID:       NDUJEGWDTBUA43AXYO7FQELMWSVNUCGBOP4A7NXVD2R6Z7FZ7XIEIQSA
[1] 2024/07/12 10:30:51.558912 [INF] Starting JetStream
[1] 2024/07/12 10:30:51.559112 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[1] 2024/07/12 10:30:51.559116 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[1] 2024/07/12 10:30:51.559117 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[1] 2024/07/12 10:30:51.559118 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[1] 2024/07/12 10:30:51.559120 [INF]
[1] 2024/07/12 10:30:51.559121 [INF]          https://docs.nats.io/jetstream
[1] 2024/07/12 10:30:51.559123 [INF]
[1] 2024/07/12 10:30:51.559123 [INF] ---------------- JETSTREAM ----------------
[1] 2024/07/12 10:30:51.559126 [INF]   Max Memory:      23.26 GB
[1] 2024/07/12 10:30:51.559128 [INF]   Max Storage:     8.00 GB
[1] 2024/07/12 10:30:51.559129 [INF]   Store Directory: "/tmp/nats/jetstream"
[1] 2024/07/12 10:30:51.559131 [INF] -------------------------------------------
[1] 2024/07/12 10:30:51.559406 [INF] Listening for client connections on 0.0.0.0:4222
[1] 2024/07/12 10:30:51.559483 [INF] Server is ready

But I still get the same error. Running NATS in Docker on Ubuntu 22.04.4 LTS.

@StephenWakely
Copy link
Contributor

I still get the error when running Nats outside of docker. Tried with Vector in debug and release mode.

@whatcouldbepizza
Copy link
Contributor Author

Did you create a stream in NATS? Can I see its nats stream info <stream_name> please?

@StephenWakely
Copy link
Contributor

StephenWakely commented Jul 12, 2024

Did you create a stream in NATS? Can I see its nats stream info <stream_name> please?

Ah ok. That was it. It now works great. Would it be possible to add something like this to the doc comments for jetstream, to avoid future confusion:

/// If set, the `subject` must belong to an existing JetStream stream.

It's interesting how it managed to send the message through even though an error was returned.

@whatcouldbepizza
Copy link
Contributor Author

I think that's the way NATS works: it something like accepts messages in publish, and the actual save result can be retrieved via ack.await

P. S. Pushed changes to field description and generated docs

@StephenWakely
Copy link
Contributor

Is there a way we could verify if the stream exists in the healthcheck?

@Jarema
Copy link
Contributor

Jarema commented Jul 16, 2024

There is actually a way to filter stream list by subjects. I just need to expose it in Rust client. Will do that and make it part of a next release. That should solve the issue.

@StephenWakely
Copy link
Contributor

There is actually a way to filter stream list by subjects. I just need to expose it in Rust client. Will do that and make it part of a next release. That should solve the issue.

Oh awesome. That would be amazing!

@whatcouldbepizza can you remove the healthcheck for now and create an issue to remind us to update it when the next version of nats_async is released?

@whatcouldbepizza
Copy link
Contributor Author

Removed checking if stream exists in healthcheck and created the issue (for some reason I can't put any label on it, could you please decorate it with all necessary stuff like enhancement label and anything else if needed?)

@StephenWakely
Copy link
Contributor

@whatcouldbepizza There are some clippy errors, mentioned here. Also can you add a changelog entry to detail the change? Details can be found here.

@whatcouldbepizza
Copy link
Contributor Author

Hi @StephenWakely! Sorry, was on a short vacation :) I've added jetstream option to tests and changelog file, can you run tests again please?

Comment on lines 3 to 18
### Example

#### Config

```
sinks:
nats:
type: nats
inputs:
- in
subject: nork
url: "nats://localhost:4222"
jetstream: true
encoding:
codec: json
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changelog should be fairly succinct, so we don't need the example here. We do need the authors included for external contributors.

Suggested change
### Example
#### Config
```
sinks:
nats:
type: nats
inputs:
- in
subject: nork
url: "nats://localhost:4222"
jetstream: true
encoding:
codec: json
```
authors: whatcouldbepizza

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can you make sure it ends with a newline please.

@whatcouldbepizza
Copy link
Contributor Author

Thanks for the comments, fixed the changelog

Copy link
Contributor

@StephenWakely StephenWakely left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! Thank you.

@StephenWakely StephenWakely added this pull request to the merge queue Jul 23, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 23, 2024
Copy link

Regression Detector Results

Run ID: 039e15d8-5fdf-40b4-b5f3-096e6b93e3d8 Metrics dashboard

Baseline: 3b6a738
Comparison: f59718a

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf experiment goal Δ mean % Δ mean % CI links
file_to_blackhole egress throughput +1.91 [-5.21, +9.03]

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI links
syslog_regex_logs2metric_ddmetrics ingress throughput +2.05 [+1.89, +2.20]
file_to_blackhole egress throughput +1.91 [-5.21, +9.03]
http_elasticsearch ingress throughput +1.81 [+1.63, +1.99]
datadog_agent_remap_datadog_logs_acks ingress throughput +1.80 [+1.63, +1.98]
syslog_log2metric_humio_metrics ingress throughput +1.75 [+1.65, +1.86]
splunk_hec_route_s3 ingress throughput +1.70 [+1.40, +2.01]
socket_to_socket_blackhole ingress throughput +1.38 [+1.30, +1.46]
otlp_http_to_blackhole ingress throughput +1.27 [+1.09, +1.45]
syslog_log2metric_tag_cardinality_limit_blackhole ingress throughput +1.23 [+1.14, +1.33]
datadog_agent_remap_blackhole ingress throughput +1.23 [+1.14, +1.32]
datadog_agent_remap_blackhole_acks ingress throughput +0.38 [+0.28, +0.48]
datadog_agent_remap_datadog_logs ingress throughput +0.37 [+0.16, +0.57]
http_to_http_noack ingress throughput +0.03 [-0.10, +0.16]
splunk_hec_indexer_ack_blackhole ingress throughput +0.02 [-0.06, +0.11]
splunk_hec_to_splunk_hec_logs_noack ingress throughput +0.01 [-0.10, +0.11]
splunk_hec_to_splunk_hec_logs_acks ingress throughput +0.00 [-0.10, +0.11]
http_to_s3 ingress throughput -0.06 [-0.33, +0.20]
http_to_http_json ingress throughput -0.08 [-0.14, -0.01]
syslog_splunk_hec_logs ingress throughput -0.19 [-0.28, -0.11]
otlp_grpc_to_blackhole ingress throughput -0.24 [-0.37, -0.11]
fluent_elasticsearch ingress throughput -1.09 [-1.64, -0.54]
http_to_http_acks ingress throughput -1.28 [-2.61, +0.05]
syslog_humio_logs ingress throughput -1.36 [-1.45, -1.26]
syslog_log2metric_splunk_hec_metrics ingress throughput -1.55 [-1.65, -1.45]
http_text_to_http_json ingress throughput -2.17 [-2.30, -2.04]
syslog_loki ingress throughput -2.68 [-2.75, -2.60]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

@StephenWakely
Copy link
Contributor

Unfortunately the nats integration tests are now failing. It looks like not all the messages are getting through.. Can you take a look?

Provided you have Docker installed, to run the integration tests on your machine you can run cargo vdev int test nats.

thread 'sinks::nats::integration_tests::nats_tls_valid' panicked at src/sinks/nats/integration_tests.rs:58:5:
assertion `left == right` failed
  left: 8
 right: 10
stack backtrace:
   0: rust_begin_unwind
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
   2: core::panicking::assert_failed_inner
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:409:17
   3: core::panicking::assert_failed
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:364:5
   4: vector::sinks::nats::integration_tests::publish_and_check::{{closure}}
             at ./src/sinks/nats/integration_tests.rs:58:5
   5: vector::sinks::nats::integration_tests::nats_tls_valid::{{closure}}
             at ./src/sinks/nats/integration_tests.rs:311:37
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/future/future.rs:123:9
   7: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/future/future.rs:123:9
   8: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:659:57
   9: tokio::runtime::coop::with_budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:107:5
  10: tokio::runtime::coop::budget
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:73:5
  11: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:659:25
  12: tokio::runtime::scheduler::current_thread::Context::enter
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:404:19
  13: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:658:36
  14: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:737:68
  15: tokio::runtime::context::scoped::Scoped<T>::set
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/scoped.rs:40:9
  16: tokio::runtime::context::set_scheduler::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context.rs:180:26
  17: std::thread::local::LocalKey<T>::try_with
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/thread/local.rs:286:12
  18: std::thread::local::LocalKey<T>::with
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/thread/local.rs:262:9
  19: tokio::runtime::context::set_scheduler
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context.rs:180:9
  20: tokio::runtime::scheduler::current_thread::CoreGuard::enter
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:737:27
  21: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:646:19
  22: tokio::runtime::scheduler::current_thread::CurrentThread::block_on::{{closure}}
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:175:28
  23: tokio::runtime::context::runtime::enter_runtime
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/runtime.rs:65:16
  24: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/current_thread/mod.rs:167:9
  25: tokio::runtime::runtime::Runtime::block_on
             at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/runtime.rs:347:47
  26: vector::sinks::nats::integration_tests::nats_tls_valid
             at ./src/sinks/nats/integration_tests.rs:312:5
  27: vector::sinks::nats::integration_tests::nats_tls_valid::{{closure}}
             at ./src/sinks/nats/integration_tests.rs:286:26
  28: core::ops::function::FnOnce::call_once
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
  29: core::ops::function::FnOnce::call_once
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test sinks::nats::integration_tests::nats_tls_valid ... FAILED

@Jarema
Copy link
Contributor

Jarema commented Jul 23, 2024

I haven't look at the test itself, but as we removed flush after each call (which is really not needed), it is possible to loose message if last core nats publish is immediately followed up by dropping the client.
In that case, one flush, just at the end of lifetime of the client, is recommended.

@whatcouldbepizza
Copy link
Contributor Author

Flushing messages after driver run(), I think this will do the job

@StephenWakely
Copy link
Contributor

Ah, apologies, I missed that in my earlier review. We do unfortunately need to flush each message after each call. This is because of our E2E acknowledgement guarantees. If someone turns that on we need to be able to guarantee to the connected source that the message has been successfully delivered.

If the message is sent without being flushed we risk passing a success back to the source, but then the flush fails and so the message isn't delivered.

Tiny bit of discussion here.
#18272 is an issue tracking this for this sink.

Could you put the flush back after each message please?

@whatcouldbepizza
Copy link
Contributor Author

Just put back flush after each message

@StephenWakely StephenWakely added this pull request to the merge queue Jul 24, 2024
Copy link

Regression Detector Results

Run ID: d020c965-97d9-4591-8a83-a6114183dadc Metrics dashboard

Baseline: 787e1ec
Comparison: 56bd0dd

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

perf experiment goal Δ mean % Δ mean % CI links
file_to_blackhole egress throughput -20.84 [-26.96, -14.72]

Fine details of change detection per experiment

perf experiment goal Δ mean % Δ mean % CI links
datadog_agent_remap_datadog_logs ingress throughput +1.74 [+1.49, +1.99]
datadog_agent_remap_datadog_logs_acks ingress throughput +1.54 [+1.40, +1.67]
splunk_hec_route_s3 ingress throughput +1.28 [+0.99, +1.57]
otlp_grpc_to_blackhole ingress throughput +1.05 [+0.92, +1.18]
syslog_splunk_hec_logs ingress throughput +1.00 [+0.92, +1.07]
http_to_http_acks ingress throughput +0.52 [-0.80, +1.84]
otlp_http_to_blackhole ingress throughput +0.31 [+0.10, +0.52]
fluent_elasticsearch ingress throughput +0.11 [-0.40, +0.61]
splunk_hec_to_splunk_hec_logs_noack ingress throughput +0.01 [-0.09, +0.11]
splunk_hec_indexer_ack_blackhole ingress throughput +0.01 [-0.07, +0.09]
splunk_hec_to_splunk_hec_logs_acks ingress throughput +0.00 [-0.11, +0.12]
http_to_http_noack ingress throughput -0.02 [-0.14, +0.11]
socket_to_socket_blackhole ingress throughput -0.02 [-0.08, +0.04]
http_to_http_json ingress throughput -0.13 [-0.20, -0.06]
syslog_regex_logs2metric_ddmetrics ingress throughput -0.26 [-0.41, -0.12]
http_elasticsearch ingress throughput -0.32 [-0.49, -0.16]
http_to_s3 ingress throughput -0.40 [-0.67, -0.14]
datadog_agent_remap_blackhole_acks ingress throughput -0.62 [-0.72, -0.52]
http_text_to_http_json ingress throughput -0.67 [-0.80, -0.54]
syslog_log2metric_humio_metrics ingress throughput -0.99 [-1.15, -0.82]
syslog_log2metric_splunk_hec_metrics ingress throughput -0.99 [-1.12, -0.87]
syslog_loki ingress throughput -1.24 [-1.32, -1.16]
syslog_log2metric_tag_cardinality_limit_blackhole ingress throughput -1.41 [-1.50, -1.32]
syslog_humio_logs ingress throughput -1.83 [-1.92, -1.74]
datadog_agent_remap_blackhole ingress throughput -2.43 [-2.53, -2.34]
file_to_blackhole egress throughput -20.84 [-26.96, -14.72]

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".

Merged via the queue into vectordotdev:master with commit 56bd0dd Jul 24, 2024
49 checks passed
ym pushed a commit to ym/vector that referenced this pull request Aug 18, 2024
…ectordotdev#20834)

* use nats jetstream as an option

* no need to manually flush messages

* fix jetstream option annotation + prettify code

* generate component docs

* add more precise field description + generate docs

* check nats stream existence in healthcheck

* do not check stream existance

* add new field to struct in tests

* add changelog

* add author to changelog

* remove example config from changelog

* flush core messages

* flush after each message
AndrooTheChen pushed a commit to discord/vector that referenced this pull request Sep 23, 2024
…ectordotdev#20834)

* use nats jetstream as an option

* no need to manually flush messages

* fix jetstream option annotation + prettify code

* generate component docs

* add more precise field description + generate docs

* check nats stream existence in healthcheck

* do not check stream existance

* add new field to struct in tests

* add changelog

* add author to changelog

* remove example config from changelog

* flush core messages

* flush after each message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sinks Anything related to the Vector's sinks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants