COLLECTION VERSION
# /usr/local/lib/python3.9/dist-packages/ansible_collections
Collection    Version
------------- -------
ansible.posix 1.4.0
OS / ENVIRONMENT
Linux, but should affect all OS
STEPS TO REPRODUCE
- name: Synchronization of OS image
  ansible.posix.synchronize:
    src: /imager/image/
    dest: "{{ imager_mount_dir_new_image }}/"
    archive: yes
    checksum: yes
    verify_host: yes
    delay_updates: no
    rsync_timeout: 60
    rsync_opts:
      # use rsync `--delete-during` instead of the default `delete`, which results in `--delete-after` and imposes higher RAM usage
      - '--delete-during'
      # use --ignore-times to calculate the checksum for all files even if size and time are equal
      - '--ignore-times'
EXPECTED RESULTS
No significant RAM increase
ACTUAL RESULTS
RAM usage rises to a few hundred megabytes for ~240k files.
All of that RAM is freed once the task finishes without an error or, on error, once the log output (more than 50 MB in some cases) has been printed.
rsync is called synchronously, so its whole output is written to the variable out or err. As the output might get really large for many files, it might be better to stream it to a temporary file or to process it line by line (also streaming).
Furthermore, the processing of the output in the lines that follow the call might be quite inefficient, depending on the number of empty lines.
One possible solution would be streaming the run_command output, but unfortunately this is not yet merged / incorporated into ansible, see ansible/proposals#92
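A minimal sketch of the line-by-line idea, assuming plain `subprocess` from the standard library rather than Ansible's `run_command`. The function name `summarize_stream` and the `marker` string are hypothetical and only serve the illustration; stderr is discarded to keep the sketch short.

```python
import subprocess
import sys


def summarize_stream(cmd, marker="<<CHANGED>>"):
    """Run cmd and scan its stdout line by line without storing it.

    `marker` is a hypothetical string that flags a changed file.
    Returns (return_code, changed, non_empty_line_count).
    """
    changed = False
    non_empty = 0
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # sketch only: stderr is discarded here
        text=True,
    )
    with proc.stdout:
        for line in proc.stdout:  # only one line is held in memory at a time
            line = line.rstrip("\n")
            if not line:
                continue  # empty lines are skipped as they arrive
            non_empty += 1
            if marker in line:
                changed = True
    return proc.wait(), changed, non_empty


if __name__ == "__main__":
    # Stand-in for rsync so the sketch runs anywhere.
    fake = [sys.executable, "-c",
            "print('<<CHANGED>>a'); print(); print('b')"]
    print(summarize_stream(fake))
```

With this approach memory use depends on the longest line rather than on the total size of the output, at the cost of no longer having the full output available afterwards.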
Just did a few more tests regarding the memory consumption and the real culprit seems to be in the handling of Ansible itself:
The log output generated with my test sample is about 30MB (variable out written to a text file).
All tests were run in a container and numbers were gathered with GNU time.
| Run | Maximum memory consumption in kB |
| --- | --- |
| Original | 467700 |
| Omitting all output to Ansible (just do all the processing, but do not pass anything back to Ansible, see option 3) | 165776 |
| Streaming all rsync output to tail and just evaluating the last 50 lines, see option 2 | 113164 |
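For anyone who wants to reproduce such numbers without GNU time, a comparable figure can be read from Python. This is a rough sketch, assuming Linux (where `ru_maxrss` is reported in kilobytes); the helper name `peak_child_memory_kb` and the stand-in workload are hypothetical, and the values will not match GNU time exactly.

```python
import resource
import subprocess
import sys


def peak_child_memory_kb(cmd):
    """Run cmd to completion and return the peak RSS of waited-for children.

    On Linux, ru_maxrss is reported in kilobytes. The value is the maximum
    over all children this process has waited for so far, so use a fresh
    process for each measurement.
    """
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Stand-in workload that allocates roughly 50 MB.
    cmd = [sys.executable, "-c", "data = b'x' * (50 * 1024 * 1024)"]
    print(peak_child_memory_kb(cmd), "kB")
```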
To solve this issue I came up with three possible options (each combined with a new flag for omitting the list of changed files):
1. Implement a proper streaming run_command call in Ansible. This is probably the cleanest option, but also the most work-intensive one. With this solution neither Ansible nor the module should use a significant amount of memory. See "Provide mechanism for streaming logs from modules" (ansible/proposals#92).
2. Use tail to process only a sample of the changed files. As far as I have seen, this is sufficient for all features except returning the full list of changed files. The downside is that run_command must be called with use_unsafe_shell=True. As the name implies, this might introduce security issues, a risk I would rather not take.
3. Process everything as it is right now, but do not pass the output back to Ansible. This feels like a hacky solution, but it reduces the required memory significantly.
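The tail idea can also be sketched without a shell by keeping a bounded buffer in Python. This assumes plain `subprocess` and `collections.deque` from the standard library, not Ansible's `run_command`. The helper name `run_keep_tail` is hypothetical, and whether the module could adopt something like this is an open question.

```python
import subprocess
import sys
from collections import deque


def run_keep_tail(cmd, keep=50):
    """Run cmd without a shell and keep only its last `keep` stdout lines.

    Returns (return_code, total_line_count, last_lines).
    """
    tail = deque(maxlen=keep)  # older lines are dropped automatically
    total = 0
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # sketch only: stderr is discarded here
        text=True,
    )
    with proc.stdout:
        for line in proc.stdout:
            tail.append(line.rstrip("\n"))
            total += 1
    return proc.wait(), total, list(tail)


if __name__ == "__main__":
    # Stand-in for rsync: print 200 numbered lines.
    fake = [sys.executable, "-c", "for i in range(200): print(i)"]
    rc, total, last = run_keep_tail(fake)
    print(rc, total, len(last), last[0], last[-1])
```

Because the command is passed as a list, no shell is involved, so the concern about use_unsafe_shell=True would not apply to this variant. Like option 2, it cannot return the full list of changed files.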
SUMMARY
Synchronize is using a lot of memory during sync (some may also call it leaking)
ISSUE TYPE
COMPONENT NAME
Synchronize
Technical Details
I suspect the RAM leakage comes from the code at the following locations, where the output ends up in the variable out or err:
- ansible.posix/plugins/modules/synchronize.py, line 582 in 6da0cbb
- ansible.posix/plugins/modules/synchronize.py, line 609 in 6da0cbb

As the output might get really large for many files, it might be better to stream it to a temporary file or to process it line by line (also streaming).
Furthermore, the processing of the output in the following lines might be quite inefficient depending on the number of empty lines:
- ansible.posix/plugins/modules/synchronize.py, lines 622 to 623 in 6da0cbb
Just adding a quiet switch to rsync does not help directly, as the changed status of the job would then be lost.