Actions with large outputs fail with OSError: [Errno 7] Argument list too long #1043

kot0dama · 2023-10-12T06:23:22Z

We use an internal ops charm allowing the operator to run an action that would display the contents of a file.
If the output of the file exceeds a certain limit the action fails with:

Traceback (most recent call last):                                                                                                                                                                                                                           
  File "/var/lib/juju/agents/unit-test-1/charm/./src/charm.py", line 545, in <module>                                                                                                                                                            
    main(TestCharm)                                                                                                                                                                                                                                
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/main.py", line 441, in main                                                                                                                                                              
    _emit_charm_event(charm, dispatcher.event_name)
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/main.py", line 149, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/framework.py", line 354, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/framework.py", line 830, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/framework.py", line 919, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-test-1/charm/./src/charm.py", line 250, in _on_show_application_logs
    event.set_results({"result": logs})
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/charm.py", line 217, in set_results
    self.framework.model._backend.action_set(results)   # pyright: reportPrivateUsage=false
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/model.py", line 2859, in action_set
    self._run('action-set', *[f"{k}={v}" for k, v in flat_results.items()])
  File "/var/lib/juju/agents/unit-test-1/charm/venv/ops/model.py", line 2616, in _run
    result = run(args, **kwargs)
  File "/usr/lib/python3.10/subprocess.py", line 503, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.10/subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.10/subprocess.py", line 1863, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
OSError: [Errno 7] Argument list too long: '/var/lib/juju/tools/unit-test-1/action-set'

The limit seems to be per-character around 131060 chars, but this would need confirmation.

As I understand it might be better to limit the output to a reasonable length, this should be handled and not result in an error for the operator. Also, if there is such a limitation, it should probably be documented.

Thank you,
Loïc

The text was updated successfully, but these errors were encountered:

tonyandrewmeyer · 2023-10-12T20:08:38Z

The limit seems to be per-character around 131060 chars, but this would need confirmation.

It's the OS's limit for command line arguments. For example:

$ grep ARG_MAX /usr/include/linux/limits.h
#define ARG_MAX       131072	/* # bytes of args + environ for exec() */

(This is different than getconf ARG_MAX sometimes for reasons I don't fully understand). The result= takes up a few more of those bytes.

As I understand it might be better to limit the output to a reasonable length, this should be handled and not result in an error for the operator.

I don't think we would want to arbitrarily truncate results. We could raise something more specific, but I believe it's better to have the operation fail if it can't be executed as requested.

Also, if there is such a limitation, it should probably be documented.

+1

For your use case, can you perhaps have the action provide a juju scp command that the user could run that would allow retrieving the file?

benhoyt · 2023-10-12T20:27:58Z

Yeah, I agree with @tonyandrewmeyer. I think the exception message is relatively clear, but we could document this. I think it'd make sense to document it in the set_results docstring, and perhaps here or here?

kot0dama · 2023-10-13T08:09:42Z

The limit seems to be per-character around 131060 chars, but this would need confirmation.

It's the OS's limit for command line arguments. For example:
$ grep ARG_MAX /usr/include/linux/limits.h
#define ARG_MAX       131072	/* # bytes of args + environ for exec() */
(This is different than getconf ARG_MAX sometimes for reasons I don't fully understand). The result= takes up a few more of those bytes.

Ah I see! Is there a reason for the result of the juju action to be passed as an argument of an exec() call? Can't we pass on the result of an action through a named pipe or something else that would allow working around this limitation? This might also increase security as we won't be running exec() with random texts.

As I understand it might be better to limit the output to a reasonable length, this should be handled and not result in an error for the operator.

I don't think we would want to arbitrarily truncate results. We could raise something more specific, but I believe it's better to have the operation fail if it can't be executed as requested.

Oh I didn't mean truncate the results arbitrarily, but present to the user running the juju action a clear error.

At the moment, running an action that results in this stacktrace yields some unfriendly error:

$ juju run-action test/leader my-action --wait
test-1:
  UnitId: test/1
  id: "90"
  message: exit status 1
  results:
    ReturnCode: 1
  status: failed
  timing:
    completed: 2023-10-13 08:03:55 +0000 UTC
    enqueued: 2023-10-13 08:03:53 +0000 UTC
    started: 2023-10-13 08:03:53 +0000 UTC

But then maybe this needs filing another bug against juju itself? I think we might need the return code to be specific to a size limit error though for juju command line tool to be able to present a user-friendly message to the operator.

Also, if there is such a limitation, it should probably be documented.

+1

For your use case, can you perhaps have the action provide a juju scp command that the user could run that would allow retrieving the file?

I actually updated the action to tail the file instead of returning it altogether, but that's also an idea, thank you!
Although, the juju action itself wouldn't know that exception happened as it occurs after it has returned the output, so it cannot catch it and suggest an scp command instead.

jameinel · 2023-10-14T14:12:33Z

Actions are not intended to share arbitrary length file contents. You need to look more into streaming content in some other fashion. Note that even if we didnt use args to a command, we still cache the arg result in the database (you can tun juju commands to get the results of actions in the future.) Actions are not "ephemeral content" the way that `ssh command` would be. There has been some discussion about streaming content and actions, but it is out of scope at this time. John =:->

…

On Fri, Oct 13, 2023, 04:09 kot0dama ***@***.***> wrote: The limit seems to be per-character around 131060 chars, but this would need confirmation. It's the OS's limit for command line arguments. For example: $ grep ARG_MAX /usr/include/linux/limits.h #define ARG_MAX 131072 /* # bytes of args + environ for exec() */ (This is different than getconf ARG_MAX sometimes for reasons I don't fully understand). The result= takes up a few more of those bytes. Ah I see! Is there a reason for the result of the juju action to be passed as an argument of a command line? Can't we pass on the result of a action through a named pipe or something else that would allow working around this limitation? This might also increase security as we won't be running commands with random texts. As I understand it might be better to limit the output to a reasonable length, this should be handled and not result in an error for the operator. I don't think we would want to arbitrarily truncate results. We could raise something more specific, but I believe it's better to have the operation fail if it can't be executed as requested. Oh I didn't mean truncate the results arbitrarily, but present to the user running the juju action a clear error. At the moment, running an action that results in this stacktrace yields some unfriendly error: $ juju run-action test/leader my-action --wait test-1: UnitId: test/1 id: "90" message: exit status 1 results: ReturnCode: 1 status: failed timing: completed: 2023-10-13 08:03:55 +0000 UTC enqueued: 2023-10-13 08:03:53 +0000 UTC started: 2023-10-13 08:03:53 +0000 UTC But then maybe this needs filing another bug against juju itself? I think we might need the return code to be specific to a size limit error though for juju command line tool to be able to present a user-friendly message to the operator. Also, if there is such a limitation, it should probably be documented. +1 For your use case, can you perhaps have the action provide a juju scp command that the user could run that would allow retrieving the file? I actually updated the action to tail the file instead of returning it altogether, but that's also an idea, thank you! — Reply to this email directly, view it on GitHub <#1043 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABRQ7OV7UBA6BG472MU5V3X7DZNDANCNFSM6AAAAAA55CVAAE> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

jameinel · 2023-10-14T14:14:11Z

Given all the other limitations and abuse, I would be happy to hard cap the action-set command at something much smaller (say 16kb) and be clearer about why it is failing. (I think we have a hard cap of 16MB for a mongo document, but you're already abusing the system as designed in the 100kb range). John =:->

…

On Sat, Oct 14, 2023, 10:12 John Meinel ***@***.***> wrote: Actions are not intended to share arbitrary length file contents. You need to look more into streaming content in some other fashion. Note that even if we didnt use args to a command, we still cache the arg result in the database (you can tun juju commands to get the results of actions in the future.) Actions are not "ephemeral content" the way that `ssh command` would be. There has been some discussion about streaming content and actions, but it is out of scope at this time. John =:-> On Fri, Oct 13, 2023, 04:09 kot0dama ***@***.***> wrote: > The limit seems to be per-character around 131060 chars, but this would > need confirmation. > > It's the OS's limit for command line arguments. For example: > > $ grep ARG_MAX /usr/include/linux/limits.h > #define ARG_MAX 131072 /* # bytes of args + environ for exec() */ > > (This is different than getconf ARG_MAX sometimes for reasons I don't > fully understand). The result= takes up a few more of those bytes. > > Ah I see! Is there a reason for the result of the juju action to be > passed as an argument of a command line? Can't we pass on the result of a > action through a named pipe or something else that would allow working > around this limitation? This might also increase security as we won't be > running commands with random texts. > > As I understand it might be better to limit the output to a reasonable > length, this should be handled and not result in an error for the operator. > > I don't think we would want to arbitrarily truncate results. We could > raise something more specific, but I believe it's better to have the > operation fail if it can't be executed as requested. > > Oh I didn't mean truncate the results arbitrarily, but present to the > user running the juju action a clear error. > > At the moment, running an action that results in this stacktrace yields > some unfriendly error: > > $ juju run-action test/leader my-action --wait > test-1: > UnitId: test/1 > id: "90" > message: exit status 1 > results: > ReturnCode: 1 > status: failed > timing: > completed: 2023-10-13 08:03:55 +0000 UTC > enqueued: 2023-10-13 08:03:53 +0000 UTC > started: 2023-10-13 08:03:53 +0000 UTC > > But then maybe this needs filing another bug against juju itself? I think > we might need the return code to be specific to a size limit error though > for juju command line tool to be able to present a user-friendly message to > the operator. > > Also, if there is such a limitation, it should probably be documented. > > +1 > > For your use case, can you perhaps have the action provide a juju scp > command that the user could run that would allow retrieving the file? > > I actually updated the action to tail the file instead of returning it > altogether, but that's also an idea, thank you! > > — > Reply to this email directly, view it on GitHub > <#1043 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AABRQ7OV7UBA6BG472MU5V3X7DZNDANCNFSM6AAAAAA55CVAAE> > . > You are receiving this because you are subscribed to this thread.Message > ID: ***@***.***> >

benhoyt · 2023-10-15T22:24:40Z

I've opened #1047 to tweak the set_results docstring, and Tony and I have added notes about max size to those linked docs. Per above discussion, I don't think there's anything further to do here, so I'm going to close this. If file transfer is needed, it's best to use another technique like juju scp.

kot0dama · 2023-10-16T03:33:30Z

Thank you for your replies, this is much clearer now.

It would be interesting to add details in the SDK documentation maybe, or somewhere you find it useful, that actions output is stored in the database. This can be important for the charm writers comprehension of the system, and also to avoid storing potentially sensitive data.

Thanks again,
Loïc

### More extensively document the exceptions that Charmers should be aware of and potentially catching If we raise an exception when checking an argument's type, and the type hinting covers this, I have not included it in the `Raises` list. I feel that the docs (and a type checker) cover this already, and the `Raises` section should be more focused on exceptions that the Charmer should consider handling. #908 goes back-and-forth on documenting very common exceptions (like `APIError` and `CommunicationError` in nearly all pebble `Client` methods), and I have as well. ~I'm currently +0 on documenting them, and it seems like it's simpler to add them all in the PR and them remove them than decide to add them after the initial PR, so they are included at the moment. I could easily be convinced that they should be removed and a central doc note added (but I'm not sure where that would be best).~ These have been removed and are centralised in the `Container` and `pebble.Client` docstrings. Exceptions that are possible but should be rare and indicate a broken system (such as Juju or Pebble returning invalid data or being unresponsive) are not included. It feels like this tips *too* far in the "document everything" direction, and I don't feel like it would be worth Charmers trying to handle these exceptional (no pun intended) circumstances. Where it was unclear whether it was worth documenting a potential exception - such as not providing any commands in `exec`, which seems more likely to be a coding error than something the Charm should handle, but I could see happening with quite dynamic code - I have generally taken the more defensive approach of including the potential raise. I wrote a terrible, horrible, no good, very bad charm that causes a large number of these errors, and examined the pebble & Juju code to find information about what might cause issues. However, since I'm still only lightly familiar with Juju/ops, I could have missed some cases. ### Minor tweaks to exception chaining * If we are raising an exception in an exception handler but the original exception is just an implementation detail, `raise x from None` * If we are converting an exception to a more specific/localised one and the original exception is relevant, `raise x from e` * If we are filtering an exception and then re-raising, `raise` or `raise e` (preferably the former, but leaving the latter if already in place) ### Miscellanous One minor type hinting improvement: if an exception is always raised, use `typing.NoReturn` as the return type. Drive-bys: * Remove one `type: ignore` that is no longer required. * Add `ops.` to a couple of examples in the docstrings that were missed in #921 Fixes #908, #1043

benhoyt added docs Improvements or additions to documentation small item labels Oct 12, 2023

tonyandrewmeyer mentioned this issue Oct 12, 2023

docs: document more of the exceptions that may be raised #1044

Merged

tonyandrewmeyer self-assigned this Oct 12, 2023

benhoyt closed this as not planned Won't fix, can't repro, duplicate, stale Oct 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Actions with large outputs fail with OSError: [Errno 7] Argument list too long #1043

Actions with large outputs fail with OSError: [Errno 7] Argument list too long #1043

kot0dama commented Oct 12, 2023 •

edited

Loading

tonyandrewmeyer commented Oct 12, 2023 •

edited

Loading

benhoyt commented Oct 12, 2023

kot0dama commented Oct 13, 2023 •

edited

Loading

jameinel commented Oct 14, 2023 via email

jameinel commented Oct 14, 2023 via email

benhoyt commented Oct 15, 2023

kot0dama commented Oct 16, 2023

Actions with large outputs fail with OSError: [Errno 7] Argument list too long #1043

Actions with large outputs fail with OSError: [Errno 7] Argument list too long #1043

Comments

kot0dama commented Oct 12, 2023 • edited Loading

tonyandrewmeyer commented Oct 12, 2023 • edited Loading

benhoyt commented Oct 12, 2023

kot0dama commented Oct 13, 2023 • edited Loading

jameinel commented Oct 14, 2023 via email

jameinel commented Oct 14, 2023 via email

benhoyt commented Oct 15, 2023

kot0dama commented Oct 16, 2023

kot0dama commented Oct 12, 2023 •

edited

Loading

tonyandrewmeyer commented Oct 12, 2023 •

edited

Loading

kot0dama commented Oct 13, 2023 •

edited

Loading