Provide mechanism for streaming logs from modules #92

emonty · 2018-01-04T20:13:26Z

Proposal: Provide mechanism for streaming logs from modules

Author: Monty Taylor <@emonty> IRC: mordred

Date: 2018/01/04

Status: New
Proposal type: core design
Targeted Release: 2.6
Estimated time to implement: not too terribly long - agreeing on the design is the hard part

Motivation

Long running tasks can be a black hole of despair for a user, and waiting until one is complete to be able to see what happened can often be frustrating.

We have a hacked-in version of output streaming for command/shell modules in our ansible-playbook invocations in Zuul v3:

https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/ansible/library/command.py
https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/ansible/callback/zuul_stream.py#L118-L167

but it would be better if such a facility were available to all Ansible users, and if module authors had a way to do something similar.

Assuming we can agree on an overall approach, we can implement the majority of this as an update to the existing zuul streaming code and then forklift it over when it's working well. (most of the code is in standalone classes with minimal amounts of updates to the guts of existing Ansible objects)

Problems

What problems exist that this proposal will solve?

Output from long-running tasks is not available until the task is complete
Modules that use third-party libraries that have python stdlib logging based logging can't wire up those loggers in any way that causes output to return to ansible-playbook.
Some people hate python stdlib logging. Some people make extensive use of it. Both sets of people need to be happy.

Solution proposal

tl;dr - Emit structured log messages over a socket from AnsibleModule back to a port on the calling machine that is optionally forwarded to the remote machine over the Connection. When the messages are received on the calling machine, fire a Callback method to handle them.

Add main ansible.cfg option "realtime_streaming" that defaults to None
Add main ansible.cfg option "realtime_streaming_local_port" that defaults to None
Add main ansible.cfg option "realtime_streaming_remote_port" that defaults to None
Add main ansible.cfg option "realtime_streaming_host" that defaults to None

There is a 2x2 supportability matrix:

Port on calling machine is or is not reachable over the network from the remote machine
Connection plugin supports forwarding arbitrary ports (like ssh) or does not (like network_cli)

Detecting whether the local port is reachable from the remote machine without forwarding is not consistently and efficiently possible, so defaulting realtime_streaming to off is the safest thing. As a follow-on it could be possible to automatically enable streaming for Connection plugins that support it. That is why the default for realtime_streaming is None rather than False: None can later mean "on, forwarding the port if the connection supports it, otherwise off", while False is an explicit disable and True is an explicit enable.
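A minimal sketch of the follow-on tri-state semantics described above. The helper name `streaming_enabled` is hypothetical and not part of Ansible; it only illustrates how None, True and False could be resolved against the proposed has_port_forwarding flag.

```python
from typing import Optional


def streaming_enabled(setting: Optional[bool], has_port_forwarding: bool) -> bool:
    """Resolve the tri-state realtime_streaming option (hypothetical helper)."""
    if setting is None:
        # Unset: stream only when the connection plugin can forward the port.
        return has_port_forwarding
    # True and False are explicit user choices and always win.
    return setting
```

With this reading, an unset option behaves like "auto", while a user who writes False never gets streaming even over a connection that could forward the port.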

Add flag to ConnectionBase "has_port_forwarding" defaulting to False
In ConnectionBase, something like the following:

  # Needs `import platform` at module level
  if C.REALTIME_STREAMING:
    # Get the LogStreamListener singleton instance
    listener = get_logstream_listener()
    listener.ensure_started()
    local_port = listener.get_local_port()
    if C.REALTIME_STREAMING_REMOTE_PORT:
      # We're going to do port forwarding
      self._forward_streaming_ports(local_port, C.REALTIME_STREAMING_REMOTE_PORT)
      self._play_context._log_streaming_port = C.REALTIME_STREAMING_REMOTE_PORT
      self._play_context._log_streaming_host = 'localhost'
    else:
      # We're going to have the remote host connect directly,
      # falling back to this machine's hostname if no host is configured
      self._play_context._log_streaming_port = local_port
      self._play_context._log_streaming_host = C.REALTIME_STREAMING_HOST or platform.node()

In the Task executor, pass _play_context._log_streaming_port and _play_context._log_streaming_host to the remote module invocation via _ansible private variables
In AnsibleModule, if _ansible_remote_stream_host and port are set, instantiate a StreamingLogHandler. If they are not set, instantiate a NullHandler.
In AnsibleModule, add methods to connect log stream to existing library logging

  def connect_logging_stream(self, name, level):
    """Configures a named python stdlib logger to emit messages across to the remote socket."""

  def connect_twiggy_stream(self, name, level, filters=None):
    """Configures a named twiggy logger to emit messages across to the remote socket."""

In AnsibleModule, add a `stream_message` method (or something with a better name) to allow for sending a single message back over the wire easily from module code.
In modules/command.py, watch stderr/stdout as it happens and call self.stream_message for each line. (Perhaps we want C.REALTIME_STREAMING to cause stdout and stderr to be combined, since it'll ultimately only be one stream, but leave the current split behavior for non-streaming cases so that people keep the differentiation.)
Write a StreamingLogHandler class, either as a subclass of logging.handlers.SocketHandler or from scratch, that sends each message and its data as JSON. Subclassing would be easiest, but since we want to support richer structured data from twiggy as well, we need to assess whether the existing SocketHandler is robust enough and gets all of the data to the other side.
Add a Callback method that can be called for each log message received
Write LogStreamListener class to listen on the port for log messages, unpack them and call the Callback method with the expanded data.
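A minimal from-scratch sketch of the handler and listener pair described in the steps above, using only the Python standard library. The class names come from the proposal; the wire format (one JSON object per line), the payload field names, and the `demo` function are assumptions made for illustration. The stdlib logging.handlers.SocketHandler sends pickled records rather than JSON, so this sketch subclasses logging.Handler instead.

```python
import json
import logging
import socket
import socketserver
import threading


class StreamingLogHandler(logging.Handler):
    """Send each log record to host:port as one line of JSON (assumed format)."""

    def __init__(self, host, port):
        super().__init__()
        self._sock = socket.create_connection((host, port))

    def emit(self, record):
        payload = {
            "name": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
            "created": record.created,
        }
        try:
            # json.dumps never emits raw newlines, so "\n" is a safe delimiter.
            self._sock.sendall(json.dumps(payload).encode("utf-8") + b"\n")
        except OSError:
            self.handleError(record)

    def close(self):
        self._sock.close()
        super().close()


class _LineHandler(socketserver.StreamRequestHandler):
    """Decode one JSON message per line and hand it to the server's callback."""

    def handle(self):
        for line in self.rfile:
            self.server.callback(json.loads(line))


class LogStreamListener(socketserver.ThreadingTCPServer):
    """Listen for log messages and call `callback(dict)` for each one."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, callback):
        super().__init__(address, _LineHandler)
        self.callback = callback


def demo():
    """Round-trip a single log message over localhost and return what arrived."""
    received = []
    done = threading.Event()

    def on_message(message):
        received.append(message)
        done.set()

    # Port 0 lets the OS pick a free port, as get_local_port() would report.
    listener = LogStreamListener(("127.0.0.1", 0), on_message)
    threading.Thread(target=listener.serve_forever, daemon=True).start()
    port = listener.server_address[1]

    logger = logging.getLogger("demo.module")
    logger.setLevel(logging.INFO)
    handler = StreamingLogHandler("127.0.0.1", port)
    logger.addHandler(handler)
    logger.info("step %d of %d", 1, 3)

    done.wait(timeout=5)
    logger.removeHandler(handler)
    handler.close()
    listener.shutdown()
    listener.server_close()
    return received
```

In a real implementation the callback would be the new Callback plugin method, and the handler would be attached by something like connect_logging_stream; neither exists in Ansible today, and authentication and encryption of the socket are deliberately left out of this sketch.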

Testing

Oh dear god yes. There will definitely need to be testing.

bcoca · 2018-01-05T00:47:20Z

an early attempt to do this https://github.com/bcoca/ansible/tree/update_json

alikins · 2018-01-23T16:41:17Z

The logging socket/connection here sounds like a specific case of more generally supporting additional connection channels between controller/remote. ie, the current 2 channels (stdin/stdout) plus one or more additional channels for the streaming stdout and/or stderr and/or logging.

Making bits that use connection plugins poll/select additional channels provided by the connection plugin seems doable, especially if the connection plugin can make those channels available as fd's.

Probably not too difficult with something like paramiko, but I don't know of an obvious way to do it with the 'ssh' connection plugin. Perhaps using its port or unix socket forwarding?
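A rough sketch of the poll/select idea, assuming a connection plugin could expose its extra channels as socket-like objects. The function name `drain_channels` and the channel names are hypothetical; the standard-library selectors module does the multiplexing.

```python
import selectors
import socket


def drain_channels(channels, on_data, timeout=1.0):
    """Read from several named channels until every peer has closed its end.

    `channels` maps a channel name to a connected socket-like object.
    `on_data(name, chunk)` is called for each chunk of bytes received.
    """
    selector = selectors.DefaultSelector()
    for name, sock in channels.items():
        selector.register(sock, selectors.EVENT_READ, data=name)
    remaining = len(channels)
    while remaining:
        events = selector.select(timeout)
        if not events:
            break  # nothing arrived within the timeout; give up
        for key, _mask in events:
            chunk = key.fileobj.recv(4096)
            if chunk:
                on_data(key.data, chunk)
            else:
                # An empty read means the peer closed this channel.
                selector.unregister(key.fileobj)
                remaining -= 1
    selector.close()
```

The same loop would serve the existing result channel and any additional log or stdout/stderr channels, which is the generalisation suggested in this comment.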

alikins · 2018-01-23T17:53:28Z

Vaguely related pr: ansible/ansible#20930 - module logging ('log_records' in module return json)

ansible/ansible#20930 is for supporting a way for modules to use stdlib logging and to return log records in the json results. It also includes an example callback that shows how to then send those log records to a regular python logging handler.

However, #20930 does not add any streaming support, and streaming is the main point of this proposal.

i.e. of the 3 'Problems' mentioned in #92, my #20930 PR only addresses 'Modules that use third-party libraries that have python stdlib logging based logging can't wire up those loggers in any way that causes output to return to ansible-playbook.'

dagwieers · 2018-02-06T08:43:21Z

There is a relation with multiple stream support for Windows as Powershell has 5 output streams and preferably we may want to capture some of those streams on the master as well. (I couldn't find a reference to prior discussions about this)

alikins · 2018-02-09T17:17:31Z

@emonty The links in the description are 404 now. Are these the correct current urls?

https://github.com/openstack-infra/zuul/blob/master/zuul/ansible/library/command.py
https://github.com/openstack-infra/zuul/blob/master/zuul/ansible/callback/zuul_stream.py

bcoca · 2018-02-15T14:58:37Z

I refreshed my https://github.com/bcoca/ansible/tree/update_json branch. Almost there: got the basics working, just need to hook into callbacks and make sure it's always unbuffered.

gundalow · 2018-10-26T11:08:21Z

@bcoca Is this going into Ansible 2.8?
I see you made updates into https://github.com/bcoca/ansible/tree/update_json during September.

bcoca · 2018-10-26T15:39:17Z

idk yet, i have other priorities right now. it was 'done' at the beginning of 2.7 but i hit a huge performance hit when forcing all python to be unbuffered. other users have sent patches to it, but i have had no time to check/confirm whether they fix the issue.

berlic · 2019-01-30T19:49:13Z

In case somebody wants to debug their own modules with "online" logs, here's my workaround: https://stackoverflow.com/a/54448411/2795592

flatline-studios · 2019-02-05T20:20:23Z

Is this likely to make it into 2.8? This would be awesome for checking make and other long running operations.

pkleanthous-zz · 2019-06-20T10:48:04Z

Hey guys,

Any news on that?
what's the best workaround solution?

lucasbasquerotto · 2019-06-28T02:02:12Z

In the meantime, for anyone wanting a solution for this, I achieved a result that was satisfactory for me and created a repository with a minimal demonstration of what it does and the steps to reproduce it:

https://github.com/lucasbasquerotto/ansible-live-output-demo

xeor · 2019-08-02T21:17:18Z

This obviously didn't make 2.8, but I can't see it on the roadmap for 2.9 either (https://github.com/ansible/ansible/projects/34). @lucasbasquerotto made something that works in the meantime, but if the real fix is just a couple of months away, I'm just going to wait instead :)

ajacocks · 2019-11-06T19:22:07Z

I'm quite interested to see this, as well, as I run a lot of long-running tasks, and I get complaints that my playbooks are hanging.

harlowja · 2019-11-09T00:56:14Z

Same, also I'd like this (for the same reasons + debugging long running commands).

vrubiolo · 2019-11-11T18:15:56Z

I fully second this too, I have been using Ansible to migrate existing shell-based infrastructure scripts and the first step is to run them via command/shell. This takes forever and is pretty frustrating when it fails after a long time (not speaking about the ton of output I have to go through after that, esp if the script does not do a stellar job at error checking). Having the ability to stream stdout/stderr would greatly help with that.

This can also be an asset when introducing Ansible to shell-based workflows and how easy it is to run the existing scripts with Ansible. Without streaming output, this is a far less appealing proposal...

allx · 2019-12-07T07:02:03Z

I have created a solution that starts a simple socket server on the controller node to receive realtime logs from the managed node. It does not consider security or performance. I have used it to trigger and monitor long-running scripts, and it works fine for that purpose.

remote_logging

bobmacks · 2019-12-30T00:22:11Z

+1

bpar476 · 2020-01-03T04:48:55Z

+1 I'm using ansible to manage the backup of a database and a file system and ansible hides the progress that is logged to stdout.

wilalalee · 2020-02-03T03:28:25Z

It's necessary. my custom module reports realtime status, but i can't see info at ansible stdout

paeolo · 2020-04-01T21:41:11Z

+1. I think it's very valuable for anyone trying to use ansible as a CI/CD. Sometimes one wants to see the logs during the build process.

AssafKatz3 · 2022-02-09T06:24:13Z

Hi,
This proposal has been open for four years, with issues from 2014 (at least) pointing to it. Is there any progress on it? Is it just waiting for a programmer, or does it need something else?
Thanks

bcoca · 2022-02-09T15:09:13Z

See my PR/branch linked above, I hit a problem with python buffering and have not had time to revisit it.

rgadwagner · 2022-04-07T18:43:03Z

I...guess I don't understand. And this may be a massive oversimplification but...

When I "async"/poll 0 a command that prints output, there appears to already be a STDOUT captured with that command (I assume STDERR also). When I then use "async_status" the output from the async'd job prints upon job completion. That seems to me that there's a stdout/stderr for each of these executions and it's at some point accessible to whatever is running async/poll > 0 or async/poll = 0/async_status.

In a real language, each "poll" iteration you'd simply flush stdout and debug/print the resulting buffer for status. The problem with ansible is that currently a command run async/poll > 0 or async_status sets up a loop that we can't interact with. While "async/poll > 0" might be an issue to deal with since it's not really its own module, I don't see why "async_status" couldn't be set up with a "show_stdout: yes" and "show_stderr: yes" argument that simply turns on a debug or flush/debug of the buffers each time the loop iterates the check.

I get the concept of the buffering problem, but ANY output, right now, would be better than what we have (which is absolutely nothing till a process completes). Even if the buffer display is behind the actual output it's still better than having absolutely nothing till the end. Users can control their own behavior by forcing their own commands (such as those run shell or command) to flush buffers periodically so that up to date status is acquired by Ansible.

bcoca · 2022-04-08T17:35:05Z

@rgadwagner stdout/stderr in the case of async is just a proxy for the normal module response, it is not 'stdout/stderr' from commands run by the module (if any, many modules do not run system commands). Async in that case just proxies the 'final result' while having an intermediate poll asking 'are you done yet?' which is not even a 'progress' but a binary state (finished/not finished).

This could be part of a system to report 'current feedback' but modules would still need to be modified to actually provide that feedback.
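As a rough illustration of what such a module change could look like for command-style modules, the sketch below reads a child process's output line by line and hands each line to a callback as it arrives. The function name `run_and_stream` and the callback are hypothetical; in the proposal the callback would be something like the suggested stream_message method.

```python
import subprocess
import sys


def run_and_stream(argv, on_line):
    """Run argv, calling on_line(text) for each output line as it arrives.

    Returns (returncode, lines) so the caller still gets the full result.
    """
    lines = []
    with subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # combine both streams into one
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    ) as proc:
        for line in proc.stdout:
            line = line.rstrip("\n")
            lines.append(line)
            on_line(line)
    # Leaving the with-block waits for the child and sets returncode.
    return proc.returncode, lines
```

Lines only arrive promptly if the child process flushes its own output, so a command that buffers heavily will still appear to stall even with this loop in place.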

jbguerraz · 2022-05-03T16:02:51Z

@bcoca what's the performance issue with the unbuffered python ? I was wondering if we couldn't keep it buffered but flush the buffer at regular interval instead if somehow that would help with performance (?). I believe that would be an acceptable tradeoff (get an update each N millisecond or seconds is much better than at the command completion)

bcoca · 2022-05-03T16:08:27Z

basically the issue was that starting python in an unbuffered state made everything take a lot longer, fact gathering was really painfully slow.

Yes, setting up a flush timer would be a workaround, but it needs to affect everything. Another option I was exploring was limiting the 'unbuffered' state to stdout instead of all python file handles.

something like this:

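import fcntl, os, sys

# Request synchronous writes on stdout only (Linux may ignore O_SYNC via F_SETFL)
flags = fcntl.fcntl(sys.stdout.fileno(), fcntl.F_GETFL)
flags |= os.O_SYNC
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, flags)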
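For comparison, the flush-timer workaround discussed above could be sketched as a background thread that flushes a stream at a fixed interval. The function name `start_flush_timer` and the default interval are assumptions for illustration, not part of any existing branch.

```python
import sys
import threading


def start_flush_timer(stream=None, interval=0.5):
    """Flush `stream` every `interval` seconds from a daemon thread.

    Returns a threading.Event; set it to stop the timer.
    """
    stream = sys.stdout if stream is None else stream
    stop = threading.Event()

    def _loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            stream.flush()

    threading.Thread(target=_loop, daemon=True).start()
    return stop
```

This keeps normal buffering for throughput while bounding how stale the output can get. One caveat: the Python io documentation states that TextIOWrapper objects are not thread-safe, so flushing sys.stdout from a second thread would need care in real module code.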

willzhang · 2023-01-17T03:00:48Z

need it in 2023

ehrenmann1977 · 2023-02-13T08:51:17Z

I use screen -d -m for a detached screen, then I run the script. Ansible returns that it is done although the script is still running. If I want to trace the progress, I log in to the server and attach to the screen to see the output, or tail -f the output log file. Here is my workaround.

    - name: Install Screen
      become: yes
      package:
        name: screen
        state: present

    - name: Enable logging for a screen session
      command: |
        screen -L -dmS mysession -Logfile /tmp/install_fusionpbx.txt

    - name: Run pre-install script
      command: sh -c "screen -S mysession -X stuff '/tmp/pre-install.sh\n'"

    - name: Install Fusion PBX using screen
      command: sh -c "screen -S mysession -X stuff 'cd /usr/src/fusionpbx-install.sh/debian && ./install.sh\n'"
      register: fusion_pbx_install

[update: i updated the code and reduced to 3 tasks only, this is tested and is working]

huyz · 2023-02-13T09:21:40Z

Tip: if you've forgotten to run something in tmux or screen, it's not too late. Check out: https://github.com/nelhage/reptyr

micisse · 2023-02-13T14:23:22Z

@ehrenmann1977 cool but it's a bit of a shame to get to that point (6 tasks just for that)... Everyone proposes techniques here and there. Imagine if we need the final result of the (long) script before going to the next task (used other techniques again?). The best would be to have a default solution included in ansible and beneficial to all. A lot of hype around this feature🤞. +1

retpolanne · 2023-06-08T11:52:43Z

Just wanted to chime in with a use case for this:

I have a fido2 device that I want to automate doing cryptenroll. However, it asks for user presence at some specific moments otherwise the command times out. Streaming these logs would be great for this automation.

+1

pavetok · 2023-06-30T19:48:53Z

This feature can greatly improve developer experience when Ansible is used as a universal build tool (like make).

erlangparasu · 2023-09-10T22:52:29Z

I'm quite interested to see this, as well, as I run a lot of long-running tasks, and I get complaints that my playbooks are hanging.

Agree

Harliff · 2023-10-17T15:00:30Z

+1

Detavern · 2023-11-27T07:54:38Z

+1

maxpain · 2024-03-03T12:43:41Z

Any updates on this?

man0s · 2024-06-07T11:48:24Z

+1

mfld · 2024-08-22T07:23:31Z

+1

cerdman · 2024-08-23T20:03:28Z

+1

Nklya · 2024-08-24T12:59:39Z

if this feature is something you are interested in

Rather than wasting everyone's time with +1, if someone really wants it implemented, they can create a PR with this feature or push RedHat via support if they're paid customers. It's open source.

JakkuSakura · 2024-08-24T13:27:30Z

I no longer use Ansible. Rather, I use pyinfra which has this feature built-in and has nicer syntax. No need to mess with YAML DSL anymore

lujinke · 2024-08-30T01:31:41Z

Any updates here? Still open in Aug. 2024.

bcoca mentioned this issue Feb 15, 2018

It would be nice for copy to optionally show a progress bar during upload ansible/ansible#20579

Closed

webknjaz mentioned this issue Jun 26, 2018

Progress Bar for Uploads and Downloads ansible/ansible#41977

Closed

sivel mentioned this issue Aug 30, 2018

Optionally show progress during a synchonise operation ansible/ansible#44899

Closed

bcoca mentioned this issue Aug 31, 2018

update_json for module intermediate comms ansible/ansible#13620

Draft

jborean93 mentioned this issue Oct 26, 2018

show feedback when using "ec2_asg" and doing a rolling restart with "replace_all_instances" ansible/ansible#47555

Closed

acozine mentioned this issue Nov 26, 2018

update_json to stream logs from modules ansible/ansible#49159

Closed

dagwieers mentioned this issue Jan 29, 2019

Expose realtime output from 'shell' ansible/ansible#3887

Closed

bcoca mentioned this issue Jun 24, 2019

Revisiting realtime stdout/stderr ansible/ansible#17350

Closed

rsmekala mentioned this issue Jan 3, 2020

juniper_junos_software: NSSU doesn't log progress Juniper/ansible-junos-stdlib#432

Open

bcoca mentioned this issue Oct 5, 2021

Run command outside ansible and come back if finished succesfully ansible/ansible#75880

Closed

felixfontein mentioned this issue Oct 28, 2021

[Feature request] wait_for_txt - show failed lookups ansible-collections/community.dns#72

Open

lmm-git mentioned this issue Jul 5, 2022

Synchronize is using a lot of memory during sync of many files ansible-collections/ansible.posix#377

Open

igsilya mentioned this issue Sep 23, 2022

Switch from ovn-nbctl, ovn-sbctl, and ovs-vsctl to using ovsdbapp ovn-org/ovn-heater#136

Closed
