keep_jobs integer intervals are too large #55295

Open · squidpickles opened this issue Nov 13, 2019 · 6 comments

Labels: Confirmed (Salt engineer has confirmed bug/feature - often including an MCVE), Pending-Discussion (The issue or pull request needs more discussion before it can be closed or merged)
Milestone: Blocked

@squidpickles (Contributor)

Description of Issue

For a large installation (>3000 minions) running frequent operations, the job cache grows quite large. In our case, we don't need a job cache beyond about 5 minutes. It would be helpful to be able to specify keep_jobs as a fraction of an hour, to keep the cache small. (We keep ours in RAM to reduce disk IO.)

Setup

1 master, 3000 minions

Steps to Reproduce Issue

Set keep_jobs: 1 and run test.ping every 60 seconds; the job cache grows to 10 GB over 24 hours.
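
For reference, a minimal sketch of the master settings involved. keep_jobs: 1 is what the reproduction uses; the commented-out fractional value only illustrates the behaviour this issue is asking for (it is not accepted by 2019.2), and cachedir is shown because the description mentions keeping the cache in RAM.

```yaml
# /etc/salt/master -- sketch of the options discussed in this issue

# Current behaviour: keep_jobs is a number of hours, so 1 (one hour)
# is the smallest retention window the job cache can have.
keep_jobs: 1

# Requested behaviour (not valid in 2019.2): accept a fraction of an
# hour, e.g. roughly 5 minutes.
# keep_jobs: 0.0833

# The job cache lives under cachedir; pointing this at a RAM-backed
# filesystem is how the reporter avoids the disk IO.
cachedir: /var/cache/salt/master
```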

Versions Report

Salt Version:
           Salt: 2019.2.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: 0.26.0
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.26.2
         Python: 3.6.8 (default, Oct  7 2019, 12:59:55)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 4.15.0-66-generic
         system: Linux
        version: Ubuntu 18.04 bionic
@xeacott (Contributor) commented Nov 14, 2019

Thanks for submitting this ticket as well as getting a PR together. Pinging @saltstack/team-core about this one so we can review the PR and address this. 😄

@xeacott added the fixed-pls-verify (fix is linked, bug author to confirm fix) and Pending-Discussion (The issue or pull request needs more discussion before it can be closed or merged) labels on Nov 14, 2019
@xeacott added this to the Approved milestone on Nov 14, 2019
@dwoz (Contributor) commented Nov 14, 2019

@squidpickles I'm wondering if there is not a more elegant way of accomplishing what you are trying to do. What is the motivation for running test.ping every 60 seconds?

@squidpickles (Contributor, Author)

We've noticed a number of hosts lose Salt connectivity despite being reachable via SSH, and this occasionally predicts localized network outages. Our systems team wrote a check for the monitoring system that uses test.ping to verify which hosts are responsive.
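
For context, a periodic master-side ping like the one described above could be wired up roughly as follows. This is a hypothetical sketch: it assumes the master scheduler is used with the salt.execute runner, and none of the names or values here come from the reporter's actual monitoring setup.

```yaml
# /etc/salt/master -- hypothetical 60-second connectivity check
# (illustrative sketch only, not the reporter's monitoring config)
schedule:
  connectivity_check:
    function: salt.execute   # runner that publishes a job to the targeted minions
    seconds: 60
    args:
      - '*'                  # target every minion
      - test.ping            # each run adds ~3000 returns to the job cache
```

Every run leaves a return from each minion in the job cache, which is why a one-hour floor on keep_jobs translates into gigabytes of cached results.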

stale bot commented Jan 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale bot added the stale label on Jan 7, 2020
@sagetherage added the Confirmed (Salt engineer has confirmed bug/feature - often including an MCVE) label on Jan 9, 2020
stale bot commented Jan 9, 2020

Thank you for updating this issue. It is no longer marked as stale.

stale bot removed the stale label on Jan 9, 2020
@sagetherage (Contributor)

@dwoz can you take a look at this one, please?

@sagetherage modified the milestones: Approved → Blocked, on Mar 26, 2020
@sagetherage removed the fixed-pls-verify (fix is linked, bug author to confirm fix) label on Mar 26, 2020

Projects: none yet
Development: no branches or pull requests
4 participants