Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make TestFluentdCommand tests stable on Windows #4096

Merged
merged 3 commits into from
Mar 14, 2023

Conversation

daipom
Copy link
Contributor

@daipom daipom commented Mar 13, 2023

Which issue(s) this PR fixes:
Fixes #4095

What this PR does / why we need it:
MakeTestFluentdCommand tests stable on Windows.
(Please see #4095 for details.)

Docs Changes:
Not needed.

Release Note:
Not needed.

daipom added 2 commits March 14, 2023 01:54
The surviving processes would make the tests unstable on Windows.

Signed-off-by: Daijiro Fukuda <fukuda@clear-code.com>
Signed-off-by: Daijiro Fukuda <fukuda@clear-code.com>
@daipom daipom changed the title MakeTestFluentdCommand tests stable on Windows Make TestFluentdCommand tests stable on Windows Mar 13, 2023
@daipom
Copy link
Contributor Author

daipom commented Mar 13, 2023

Still fails... 😢
But I hope this improves the stability.

@ashie
Copy link
Member

ashie commented Mar 14, 2023

Thanks, it seems netter than before, but still sometimes went wrong.
It seems that it has still another issue.

https://github.com/fluent/fluentd/actions/runs/4407552329/jobs/7721373310

TestFluentdCommand: 
  config dump: 
    test: all secret parameters in worker section is sealed:	.: (5.693639)
  configuration to load plugin file with syntax error: 
    test: failed to start:				.: (5.598084)
  configuration to load plugin which raises unrecoverable error in #start: 
    test: failed to start:				F
===============================================================================
Failure: test: failed to start(TestFluentdCommand::configuration to load plugin which raises unrecoverable error in #start):
  fluentd failed to start, without specified regular expressions:
  2023-03-13 17:27:59 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
  2023-03-13 17:27:59 +0000 [info]: parsing config file is succeeded path="D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/0cf12fe51588e91b8151/crashing_plugin.conf"
  2023-03-13 17:27:59 +0000 [info]: gem 'fluentd' version '1.15.3'
  2023-03-13 17:27:59 +0000 [info]: using configuration file: <ROOT>
    <source>
      @type crashing
      @id dummy
      @label @dummydata
      tag dummy
      dummy {"message":"yay!"}
    </source>
    <label @dummydata>
      <match dummy>
        @type null
        @id blackhole
      </match>
    </label>
  </ROOT>
  2023-03-13 17:27:59 +0000 [info]: starting fluentd-1.15.3 pid=4176 ruby="3.1.3"
  2023-03-13 17:27:59 +0000 [info]: spawn command to main:  cmdline=["C:/hostedtoolcache/windows/Ruby/3.1.3/x64/bin/ruby.exe", "-rC:/hostedtoolcache/windows/Ruby/3.1.3/x64/lib/ruby/gems/3.1.0/gems/bundler-2.4.8/lib/bundler/setup", "-Eascii-8bit:ascii-8bit", "D:/a/fluentd/fluentd/bin/fluentd", "-c", "D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/0cf12fe51588e91b8151/crashing_plugin.conf", "-p", "D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/0cf12fe51588e91b8151/plugin", "--under-supervisor"]
  2023-03-13 17:28:04 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
  2023-03-13 17:28:04 +0000 [info]: adding match in @dummydata pattern="dummy" type="null"
  .
  <false> is not true.
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:260:in `assert_fluentd_fails_to_start'
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:599:in `block (2 levels) in <class:TestFluentdCommand>'
===============================================================================
: (10.033051)
  configuration with wrong plugin type: 
    test: failed to start:				.: (5.462728)
  configured to route log events to plugins: 
    test: by <label @FLUENT_LOG> section:		.: (9.923831)
    test: by <label> section with warning for missing log levels:	.: (10.019644)
    test: by top level <match fluent.*> section:	F
===============================================================================
Failure: test: by top level <match fluent.*> section(TestFluentdCommand::configured to route log events to plugins):
  execution timeout.
  <"2023-03-13 17:28:35 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil\r\n" +
  "2023-03-13 17:28:35 +0000 [info]: parsing config file is succeeded path=\"D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/6566f5f0db6b250dacec/logevent_1.conf\"\r\n" +
  "2023-03-13 17:28:35 +0000 [info]: gem 'fluentd' version '1.15.3'\r\n" +
  "2023-03-13 17:28:35 +0000 [warn]: [dummy] both of Plugin @id and path for <storage> are not specified. Using on-memory store.\r\n" +
  "2023-03-13 17:28:35 +0000 [warn]: [dummy] both of Plugin @id and path for <storage> are not specified. Using on-memory store.\r\n" +
  "2023-03-13 17:28:35 +0000 [warn]: define <match fluent.**> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead\r\n" +
  "2023-03-13 17:28:35 +0000 [info]: using configuration file: <ROOT>\r\n" +
  "  <source>\r\n" +
  "    @type dummy\r\n" +
  "    @id dummy\r\n" +
  "    tag \"dummy\"\r\n" +
  "    dummy {\"message\":\"yay!\"}\r\n" +
  "  </source>\r\n" +
  "  <match dummy>\r\n" +
  "    @type null\r\n" +
  "    @id blackhole\r\n" +
  "  </match>\r\n" +
  "  <match fluent.**>\r\n" +
  "    @type stdout\r\n" +
  "  </match>\r\n" +
  "</ROOT>\r\n" +
  "2023-03-13 17:28:35 +0000 [info]: starting fluentd-1.15.3 pid=488 ruby=\"3.1.3\"\r\n" +
  "2023-03-13 17:28:35 +0000 [info]: spawn command to main:  cmdline=[\"C:/hostedtoolcache/windows/Ruby/3.1.3/x64/bin/ruby.exe\", \"-rC:/hostedtoolcache/windows/Ruby/3.1.3/x64/lib/ruby/gems/3.1.0/gems/bundler-2.4.8/lib/bundler/setup\", \"-Eascii-8bit:ascii-8bit\", \"D:/a/fluentd/fluentd/bin/fluentd\", \"-c\", \"D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/6566f5f0db6b250dacec/logevent_1.conf\", \"--under-supervisor\"]\r\n" +
  "2023-03-13 17:28:39 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil\r\n" +
  "2023-03-13 17:28:39 +0000 [info]: adding match pattern=\"dummy\" type=\"null\"\r\n">
  was expected to include:
  <["fluentd worker is now running",
   "fluent.info: {\"worker\":0,\"message\":\"fluentd worker is now running worker=0\"}",
   "define <match fluent.**> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead"]>
  and not include:
  <["[warn]: some tags for log events are not defined in top level (to be ignored) tags=[\"fluent.trace\", \"fluent.debug\"]"]>
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:209:in `assert_log_matches'
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:397:in `block (2 levels) in <class:TestFluentdCommand>'
     394: </match>
     395: CONF
     396:       conf_path = create_conf_file('logevent_1.conf', conf)
  => 397:       assert_log_matches(
     398:         create_cmdline(conf_path),
     399:         "fluentd worker is now running",
     400:         'fluent.info: {"worker":0,"message":"fluentd worker is now running worker=0"}',
===============================================================================
: (10.054901)
    test: by top level <match> section with warning for missing log levels (and warnings for each log event records):	F
===============================================================================
Failure: test: by top level <match> section with warning for missing log levels (and warnings for each log event records)(TestFluentdCommand::configured to route log events to plugins):
  execution timeout.
  <"2023-03-13 17:28:45 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil\r\n" +
  "2023-03-13 17:28:45 +0000 [info]: parsing config file is succeeded path=\"D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/19990316b67dd118e6c3/logevent_2.conf\"\r\n" +
  "2023-03-13 17:28:45 +0000 [info]: gem 'fluentd' version '1.15.3'\r\n" +
  "2023-03-13 17:28:45 +0000 [warn]: [dummy] both of Plugin @id and path for <storage> are not specified. Using on-memory store.\r\n" +
  "2023-03-13 17:28:45 +0000 [warn]: [dummy] both of Plugin @id and path for <storage> are not specified. Using on-memory store.\r\n" +
  "2023-03-13 17:28:45 +0000 [warn]: define <match fluent.warn>, <match fluent.error>, <match fluent.fatal> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead\r\n" +
  "2023-03-13 17:28:45 +0000 [warn]: match for some tags of log events are not defined in top level (to be ignored) tags=[\"fluent.trace\", \"fluent.debug\", \"fluent.info\"]\r\n" +
  "2023-03-13 17:28:45 +0000 [info]: using configuration file: <ROOT>\r\n" +
  "  <source>\r\n" +
  "    @type dummy\r\n" +
  "    @id dummy\r\n" +
  "    tag \"dummy\"\r\n" +
  "    dummy {\"message\":\"yay!\"}\r\n" +
  "  </source>\r\n" +
  "  <match dummy>\r\n" +
  "    @type null\r\n" +
  "    @id blackhole\r\n" +
  "  </match>\r\n" +
  "  <match fluent.warn fluent.error fluent.fatal>\r\n" +
  "    @type stdout\r\n" +
  "  </match>\r\n" +
  "</ROOT>\r\n" +
  "2023-03-13 17:28:45 +0000 [info]: starting fluentd-1.15.3 pid=5940 ruby=\"3.1.3\"\r\n" +
  "2023-03-13 17:28:45 +0000 [info]: spawn command to main:  cmdline=[\"C:/hostedtoolcache/windows/Ruby/3.1.3/x64/bin/ruby.exe\", \"-rC:/hostedtoolcache/windows/Ruby/3.1.3/x64/lib/ruby/gems/3.1.0/gems/bundler-2.4.8/lib/bundler/setup\", \"-Eascii-8bit:ascii-8bit\", \"D:/a/fluentd/fluentd/bin/fluentd\", \"-c\", \"D:/a/fluentd/fluentd/test/command/../tmp/commandfluentd/19990316b67dd118e6c3/logevent_2.conf\", \"--under-supervisor\"]\r\n" +
  "2023-03-13 17:28:49 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil\r\n" +
  "2023-03-13 17:28:49 +0000 [info]: adding match pattern=\"dummy\" type=\"null\"\r\n">
  was expected to include:
  <["fluentd worker is now running",
   "[warn]: #0 match for some tags of log events are not defined in top level (to be ignored) tags=[\"fluent.trace\", \"fluent.debug\", \"fluent.info\"]",
   "define <match fluent.warn>, <match fluent.error>, <match fluent.fatal> to capture fluentd logs in top level is deprecated. Use <label @FLUENT_LOG> instead",
   "[warn]: #0 no patterns matched tag=\"fluent.info\""]>
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:209:in `assert_log_matches'
D:/a/fluentd/fluentd/test/command/test_fluentd.rb:413:in `block (2 levels) in <class:TestFluentdCommand>'
     410: </match>
     411: CONF
     412:       conf_path = create_conf_file('logevent_2.conf', conf)
  => 413:       assert_log_matches(
     414:         create_cmdline(conf_path),
     415:         "fluentd worker is now running",
     416:         '[warn]: #0 match for some tags of log events are not defined in top level (to be ignored) tags=["fluent.trace", "fluent.debug", "fluent.info"]',

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

It seems that it has still another issue.

It looks like the same phenomenon.
Somehow it could not take the logs of the worker processes, so it didn't match and failed.
Sometimes it seems to be impossible to take the logs of some processes.

I made this fix as I thought it was caused by other surviving processes that were not killed,
but I'm not sure if there are other causes, or if there are still some surviving processes.

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Last night, I can't reproduce it on my local in this branch, but now it reproduces on my local.
I can dig a little deeper.

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Somehow it could not take the logs of the worker processes, so it didn't match and failed.

Sorry, this log is from the worker process, so it could take worker logs too,,,

"2023-03-13 17:28:39 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil\r\n

Hmm, now I think this is caused by short timeout interval 10.

On Windows, sometimes 10s is too short and can't take some logs to assert.

Signed-off-by: Daijiro Fukuda <fukuda@clear-code.com>
@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Trying timeout: 20s ...

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Now looks stable!

@ashie
Copy link
Member

ashie commented Mar 14, 2023

I'll rerun CI a few times to confirm.

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

The failing test is not related to this.

https://github.com/fluent/fluentd/actions/runs/4411233957/jobs/7730096945

test_send_keepalive_packet_can_be_enabled:		E
===============================================================================
Error: test_send_keepalive_packet_can_be_enabled(SyslogInputTest): IOError: stream closed in another thread
C:/hostedtoolcache/windows/Ruby/2.7.7/x64/lib/ruby/gems/2.7.0/gems/cool.io-1.7.1-x64-mingw32/lib/cool.io/listener.rb:65:in `accept'
C:/hostedtoolcache/windows/Ruby/2.7.7/x64/lib/ruby/gems/2.7.0/gems/cool.io-1.7.1-x64-mingw32/lib/cool.io/listener.rb:65:in `on_readable'
C:/hostedtoolcache/windows/Ruby/2.7.7/x64/lib/ruby/gems/2.7.0/gems/cool.io-1.7.1-x64-mingw32/lib/cool.io/loop.rb:88:in `run_once'
C:/hostedtoolcache/windows/Ruby/2.7.7/x64/lib/ruby/gems/2.7.0/gems/cool.io-1.7.1-x64-mingw32/lib/cool.io/loop.rb:88:in `run'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Not related to this again.

https://github.com/fluent/fluentd/actions/runs/4411233957/jobs/7731373094

TestFluentCat: 
  json: 
    test_cat_json:					.: (4.912042)
  msgpack: 
    test_cat_secondary_file:				.: (6.441610)
  send specific event time: 
    test_with_event_time:				.: (3.893332)
    test_without_event_time:				F
===============================================================================
Failure: test_without_event_time(TestFluentCat::send specific event time):
  <1678769158.613754> -/+ <3.0> was expected to include
  <1678769162.4269545>.
  
  Relation:
  <<1678769158.613754>-<3.0>[1678769155.613754] <= <1678769158.613754>+<3.0>[1678769161.613754] < <1678769162.4269545>>
D:/a/fluentd/fluentd/test/command/test_cat.rb:111:in `test_without_event_time'
     108:         end
     109:       end
     110:       event = d.events.first
  => 111:       assert_in_delta(event_time.to_f, event[1].to_f, 3.0) # expect command to be finished in 3 seconds
     112:       assert_equal([1, "tag", true, @record],
     113:                    [d.events.size, event.first, event_time.to_f < event[1].to_f, event.last])
     114:     end

https://github.com/fluent/fluentd/actions/runs/4411233957/jobs/7731373003

  keepalive: 
    test: Do not create connection per send_data:	E
===============================================================================
Error: test: Do not create connection per send_data(ForwardOutputTest::keepalive): Errno::EACCES: Permission denied - bind(2) for 127.0.0.1:63697
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:390:in `server_create_udp_socket'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:174:in `server_create'
D:/a/fluentd/fluentd/lib/fluent/plugin/in_forward.rb:183:in `start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:114:in `instance_start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:77:in `run'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base_owner.rb:130:in `run'
D:/a/fluentd/fluentd/test/plugin/test_out_forward.rb:1301:in `block (2 levels) in <class:ForwardOutputTest>'
===============================================================================
: (0.597100)
    test: create timer of purging obsolete sockets:	.: (0.295891)
    with require_ack_response: 
      test: Create connection per send_data:		E
===============================================================================
Error: test: Create connection per send_data(ForwardOutputTest::keepalive::with require_ack_response): Errno::EACCES: Permission denied - bind(2) for 127.0.0.1:63699
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:390:in `server_create_udp_socket'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:174:in `server_create'
D:/a/fluentd/fluentd/lib/fluent/plugin/in_forward.rb:183:in `start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:114:in `instance_start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:77:in `run'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base_owner.rb:130:in `run'
D:/a/fluentd/fluentd/test/plugin/test_out_forward.rb:1343:in `block (3 levels) in <class:ForwardOutputTest>'
===============================================================================
: (0.566815)
  verify_connection_at_startup: 
    test: nodes are not available:			.: (2.435514)
    test: nodes_shared_key_match:			E
===============================================================================
Error: test: nodes_shared_key_match(ForwardOutputTest::verify_connection_at_startup): Errno::EACCES: Permission denied - bind(2) for 127.0.0.1:63702
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:390:in `server_create_udp_socket'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:174:in `server_create'
D:/a/fluentd/fluentd/lib/fluent/plugin/in_forward.rb:183:in `start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:114:in `instance_start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:77:in `run'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base_owner.rb:130:in `run'
D:/a/fluentd/fluentd/test/plugin/test_out_forward.rb:1225:in `block (2 levels) in <class:ForwardOutputTest>'
===============================================================================
: (0.270366)
    test: nodes_shared_key_miss_match:			E
===============================================================================
Error: test: nodes_shared_key_miss_match(ForwardOutputTest::verify_connection_at_startup): Errno::EACCES: Permission denied - bind(2) for 127.0.0.1:63703
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:390:in `server_create_udp_socket'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:174:in `server_create'
D:/a/fluentd/fluentd/lib/fluent/plugin/in_forward.rb:183:in `start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:114:in `instance_start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:77:in `run'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base_owner.rb:130:in `run'
D:/a/fluentd/fluentd/test/plugin/test_out_forward.rb:1156:in `block (2 levels) in <class:ForwardOutputTest>'
===============================================================================
: (0.290285)
    test: nodes_shared_key_miss_match with TLS:		E
===============================================================================
Error: test: nodes_shared_key_miss_match with TLS(ForwardOutputTest::verify_connection_at_startup): Errno::EACCES: Permission denied - bind(2) for 127.0.0.1:63704
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
C:/hostedtoolcache/windows/Ruby/3.2.1/x64/lib/ruby/3.2.0/socket.rb:183:in `bind'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:390:in `server_create_udp_socket'
D:/a/fluentd/fluentd/lib/fluent/plugin_helper/server.rb:174:in `server_create'
D:/a/fluentd/fluentd/lib/fluent/plugin/in_forward.rb:183:in `start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:114:in `instance_start'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base.rb:77:in `run'
D:/a/fluentd/fluentd/lib/fluent/test/driver/base_owner.rb:130:in `run'
D:/a/fluentd/fluentd/test/plugin/test_out_forward.rb:1191:in `block (2 levels) in <class:ForwardOutputTest>'
===============================================================================

@ashie
Copy link
Member

ashie commented Mar 14, 2023

For TestFluentdCommand now they seems stable enough.

@ashie ashie merged commit 54d7ae4 into fluent:master Mar 14, 2023
@ashie
Copy link
Member

ashie commented Mar 14, 2023

Thanks!

@daipom daipom deleted the stable-test-fluentd-command-on-windows branch March 14, 2023 07:03
@daipom
Copy link
Contributor Author

daipom commented Mar 14, 2023

Thanks for your review!

@ashie ashie added this to the v1.16.0 milestone Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test: TestFluentdCommand often fails on Windows
2 participants