Pods are running but registry is unresponsive at some point after installation #144

Open
SalaryTheft opened this issue Mar 26, 2024 · 2 comments

SalaryTheft commented Mar 26, 2024

All the containers are running, but the registry server becomes unresponsive at some point after installation
(no response to curl https://localhost:8443).

I have to restart the pods, or even reboot the host, to get it working again.

All the pods are running:

[root@bastion ~]# podman ps -a
CONTAINER ID  IMAGE                                                    COMMAND         CREATED       STATUS       PORTS                   NAMES
db266da38b9c  registry.access.redhat.com/ubi8/pause:8.7-6              infinity        13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  5e70ee01733b-infra
767d8f665354  registry.redhat.io/rhel8/redis-6:1-92.1669834635         run-redis       13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-redis
73b03983db2f  registry.redhat.io/rhel8/postgresql-10:1-203.1669834630  run-postgresql  13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-postgres
41c21e84bb3e  registry.redhat.io/quay/quay-rhel8:v3.8.14               registry        13 hours ago  Up 13 hours  0.0.0.0:8443->8443/tcp  quay-app

New log entries keep coming in, so the containers are running fine... I guess?

[root@bastion ~]# podman logs --tail=10 -f quay-app
exportactionlogsworker stdout | 2024-03-26 00:28:00,067 [52] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:00 UTC)" (scheduled at 2024-03-26 00:28:00.067443+00:00)
exportactionlogsworker stdout | 2024-03-26 00:28:00,071 [52] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:00 UTC)" executed successfully
notificationworker stdout | 2024-03-26 00:28:04,724 [63] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:14 UTC)" (scheduled at 2024-03-26 00:28:04.724010+00:00)
notificationworker stdout | 2024-03-26 00:28:04,727 [63] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:14 UTC)" executed successfully
repositorygcworker stdout | 2024-03-26 00:28:11,768 [75] [INFO] [apscheduler.executors.default] Running job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:11 UTC)" (scheduled at 2024-03-26 00:28:11.767795+00:00)
repositorygcworker stdout | 2024-03-26 00:28:11,769 [75] [INFO] [apscheduler.executors.default] Job "QueueWorker.run_watchdog (trigger: interval[0:01:00], next run at: 2024-03-26 00:29:11 UTC)" executed successfully
gcworker stdout | 2024-03-26 00:28:12,861 [53] [INFO] [apscheduler.executors.default] Running job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2024-03-26 00:28:42 UTC)" (scheduled at 2024-03-26 00:28:12.860612+00:00)
gcworker stdout | 2024-03-26 00:28:12,868 [53] [INFO] [apscheduler.executors.default] Job "GarbageCollectionWorker._garbage_collection_repos (trigger: interval[0:00:30], next run at: 2024-03-26 00:28:42 UTC)" executed successfully
notificationworker stdout | 2024-03-26 00:28:14,724 [63] [INFO] [apscheduler.executors.default] Running job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:24 UTC)" (scheduled at 2024-03-26 00:28:14.724010+00:00)
notificationworker stdout | 2024-03-26 00:28:14,731 [63] [INFO] [apscheduler.executors.default] Job "QueueWorker.poll_queue (trigger: interval[0:00:10], next run at: 2024-03-26 00:28:24 UTC)" executed successfully
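All ten lines above come from background workers (exportactionlogsworker, notificationworker, repositorygcworker, gcworker), so on their own they do not show whether the HTTPS front end is still serving requests. As a rough sketch, assuming every line follows the "component stream | timestamp [pid] [LEVEL]" layout seen here, a longer log capture could be tallied per component to see which processes are still writing. The function name count_by_component is made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in this thread, for example:
#   gcworker stdout | 2024-03-26 00:28:12,861 [53] [INFO] [apscheduler...] Running job ...
LOG_LINE = re.compile(
    r"^(?P<component>\S+) (?P<stream>stdout|stderr) \| "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<pid>\d+)\] \[(?P<level>[A-Z]+)\] "
)


def count_by_component(lines):
    """Count log lines per component; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("component")] += 1
    return counts


# Three lines copied (and shortened) from the log excerpt in this issue.
sample_lines = [
    'notificationworker stdout | 2024-03-26 00:28:04,724 [63] [INFO] [apscheduler.executors.default] Running job',
    'gcworker stdout | 2024-03-26 00:28:12,861 [53] [INFO] [apscheduler.executors.default] Running job',
    'notificationworker stdout | 2024-03-26 00:28:14,724 [63] [INFO] [apscheduler.executors.default] Running job',
]
sample_counts = count_by_component(sample_lines)
print(sample_counts.most_common())
```

A component that was logging before the hang and is missing afterwards would be a reasonable place to look next; the names of the request-serving processes are not shown in this thread, so they are left out here.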

Nothing looks strange in the quay-app container details.

[root@bastion ~]# podman inspect quay-app
[
     {
          "Id": "41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389",
          "Created": "2024-03-25T07:50:17.451450987-04:00",
          "Path": "dumb-init",
          "Args": [
               "--",
               "/quay-registry/quay-entrypoint.sh",
               "registry"
          ],
          "State": {
               "OciVersion": "1.1.0-rc.3",
               "Status": "running",
               "Running": true,
               "Paused": false,
               "Restarting": false,
               "OOMKilled": false,
               "Dead": false,
               "Pid": 7577,
               "ConmonPid": 7575,
               "ExitCode": 0,
               "Error": "",
               "StartedAt": "2024-03-25T07:50:17.61683645-04:00",
               "FinishedAt": "0001-01-01T00:00:00Z",
               "Health": {
                    "Status": "",
                    "FailingStreak": 0,
                    "Log": null
               },
               "CgroupPath": "/machine.slice/machine-libpod_pod_5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc.slice/libpod-41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389.scope",
               "CheckpointedAt": "0001-01-01T00:00:00Z",
               "RestoredAt": "0001-01-01T00:00:00Z"
          },
          "Image": "93b30dda302e3554fcfea484da1fc7b981dc4ac173b195def4ab79b86dfaf616",
          "ImageDigest": "sha256:19e0709632a860dc93e54e9d79b8da9b02334122775932eaefaccf4783524ef4",
          "ImageName": "registry.redhat.io/quay/quay-rhel8:v3.8.14",
          "Rootfs": "",
          "Pod": "5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc",
          "ResolvConfPath": "/run/containers/storage/overlay-containers/db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec/userdata/resolv.conf",
          "HostnamePath": "/run/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/hostname",
          "HostsPath": "/run/containers/storage/overlay-containers/db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec/userdata/hosts",
          "StaticDir": "/var/lib/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata",
          "OCIConfigPath": "/var/lib/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/config.json",
          "OCIRuntime": "crun",
          "ConmonPidFile": "/run/quay-app.service-pid",
          "PidFile": "/run/containers/storage/overlay-containers/41c21e84bb3e90a2ae46b480d9ca00e1a924a27e2c20157f09d21d29c9b4a389/userdata/pidfile",
          "Name": "quay-app",
          "RestartCount": 0,
          "Driver": "overlay",
          "MountLabel": "system_u:object_r:container_file_t:s0:c273,c984",
          "ProcessLabel": "system_u:system_r:container_t:s0:c273,c984",
          "AppArmorProfile": "",
          "EffectiveCaps": null,
          "BoundingCaps": [
               "CAP_CHOWN",
               "CAP_DAC_OVERRIDE",
               "CAP_FOWNER",
               "CAP_FSETID",
               "CAP_KILL",
               "CAP_NET_BIND_SERVICE",
               "CAP_SETFCAP",
               "CAP_SETGID",
               "CAP_SETPCAP",
               "CAP_SETUID",
               "CAP_SYS_CHROOT"
          ],
          "ExecIDs": [],
          "GraphDriver": {
               "Name": "overlay",
               "Data": {
                    "LowerDir": "/var/lib/containers/storage/overlay/19dbf084110759a3d249cd4ec487e83f55eca64deafc5d51d04787a3716fadb8/diff",
                    "MergedDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/merged",
                    "UpperDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/diff",
                    "WorkDir": "/var/lib/containers/storage/overlay/fc1f2d2a88e454e8c41e3aa22e5d91e18001506f13821dd60eee47a918b1bc50/work"
               }
          },
          "Mounts": [
               {
                    "Type": "volume",
                    "Name": "f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d",
                    "Source": "/var/lib/containers/storage/volumes/f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d/_data",
                    "Destination": "/tmp",
                    "Driver": "local",
                    "Mode": "",
                    "Options": [
                         "nodev",
                         "exec",
                         "nosuid",
                         "rbind"
                    ],
                    "RW": true,
                    "Propagation": "rprivate"
               },
               {
                    "Type": "volume",
                    "Name": "63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce",
                    "Source": "/var/lib/containers/storage/volumes/63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce/_data",
                    "Destination": "/var/log",
                    "Driver": "local",
                    "Mode": "",
                    "Options": [
                         "nodev",
                         "exec",
                         "nosuid",
                         "rbind"
                    ],
                    "RW": true,
                    "Propagation": "rprivate"
               },
               {
                    "Type": "volume",
                    "Name": "097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc",
                    "Source": "/var/lib/containers/storage/volumes/097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc/_data",
                    "Destination": "/conf/stack",
                    "Driver": "local",
                    "Mode": "",
                    "Options": [
                         "nodev",
                         "exec",
                         "nosuid",
                         "rbind"
                    ],
                    "RW": true,
                    "Propagation": "rprivate"
               },
               {
                    "Type": "bind",
                    "Source": "/opt/quay/config/quay-config",
                    "Destination": "/quay-registry/conf/stack",
                    "Driver": "",
                    "Mode": "",
                    "Options": [
                         "rbind"
                    ],
                    "RW": true,
                    "Propagation": "rprivate"
               },
               {
                    "Type": "bind",
                    "Source": "/opt/quay/data",
                    "Destination": "/datastorage",
                    "Driver": "",
                    "Mode": "",
                    "Options": [
                         "rbind"
                    ],
                    "RW": true,
                    "Propagation": "rprivate"
               }
          ],
          "Dependencies": [
               "db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec"
          ],
          "NetworkSettings": {
               "EndpointID": "",
               "Gateway": "10.88.0.1",
               "IPAddress": "10.88.0.2",
               "IPPrefixLen": 16,
               "IPv6Gateway": "",
               "GlobalIPv6Address": "",
               "GlobalIPv6PrefixLen": 0,
               "MacAddress": "a6:9c:af:e1:1b:a7",
               "Bridge": "",
               "SandboxID": "",
               "HairpinMode": false,
               "LinkLocalIPv6Address": "",
               "LinkLocalIPv6PrefixLen": 0,
               "Ports": {
                    "8443/tcp": [
                         {
                              "HostIp": "",
                              "HostPort": "8443"
                         }
                    ]
               },
               "SandboxKey": "/run/netns/netns-67bc251f-bac0-1817-c280-f49b54fda5bc",
               "Networks": {
                    "podman": {
                         "EndpointID": "",
                         "Gateway": "10.88.0.1",
                         "IPAddress": "10.88.0.2",
                         "IPPrefixLen": 16,
                         "IPv6Gateway": "",
                         "GlobalIPv6Address": "",
                         "GlobalIPv6PrefixLen": 0,
                         "MacAddress": "a6:9c:af:e1:1b:a7",
                         "NetworkID": "podman",
                         "DriverOpts": null,
                         "IPAMConfig": null,
                         "Links": null,
                         "Aliases": [
                              "db266da38b9c",
                              "quay-pod"
                         ]
                    }
               }
          },
          "Namespace": "",
          "IsInfra": false,
          "IsService": false,
          "KubeExitCodePropagation": "invalid",
          "lockNumber": 37,
          "Config": {
               "Hostname": "quay-pod",
               "Domainname": "",
               "User": "1001",
               "AttachStdin": false,
               "AttachStdout": false,
               "AttachStderr": false,
               "Tty": false,
               "OpenStdin": false,
               "StdinOnce": false,
               "Env": [
                    "LANG=C.UTF-8",
                    "QUAYDIR=/quay-registry",
                    "PYTHONUNBUFFERED=1",
                    "RED_HAT_QUAY=true",
                    "TERM=xterm",
                    "container=oci",
                    "PYTHONIOENCODING=UTF-8",
                    "LC_ALL=C.UTF-8",
                    "TZ=UTC",
                    "PYTHONUSERBASE=/app",
                    "QUAYPATH=/quay-registry",
                    "QUAYCONF=/quay-registry/conf",
                    "PATH=/app/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                    "QUAYRUN=/quay-registry/conf",
                    "PYTHONPATH=/quay-registry",
                    "HOME=/quay-registry",
                    "HOSTNAME=quay-pod"
               ],
               "Cmd": [
                    "registry"
               ],
               "Image": "registry.redhat.io/quay/quay-rhel8:v3.8.14",
               "Volumes": null,
               "WorkingDir": "/quay-registry",
               "Entrypoint": "dumb-init -- /quay-registry/quay-entrypoint.sh",
               "OnBuild": null,
               "Labels": null,
               "Annotations": {
                    "io.container.manager": "libpod",
                    "io.kubernetes.cri-o.SandboxID": "db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
                    "io.podman.annotations.cid-file": "/run/quay-app.service-cid",
                    "org.opencontainers.image.stopSignal": "15"
               },
               "StopSignal": 15,
               "HealthcheckOnFailureAction": "none",
               "CreateCommand": [
                    "/usr/bin/podman",
                    "run",
                    "--name",
                    "quay-app",
                    "-v",
                    "/opt/quay/config/quay-config:/quay-registry/conf/stack:Z",
                    "-v",
                    "/opt/quay/data:/datastorage:Z",
                    "--pod=quay-pod",
                    "--conmon-pidfile",
                    "/run/quay-app.service-pid",
                    "--cidfile",
                    "/run/quay-app.service-cid",
                    "--cgroups=no-conmon",
                    "--replace",
                    "registry.redhat.io/quay/quay-rhel8:v3.8.14"
               ],
               "Umask": "0022",
               "Timeout": 0,
               "StopTimeout": 10,
               "Passwd": true,
               "sdNotifyMode": "container"
          },
          "HostConfig": {
               "Binds": [
                    "f19507ef7f837c63cb92f116e042f12daa4c00a0c37c444cb1c7988687e66a0d:/tmp:rprivate,rw,nodev,exec,nosuid,rbind",
                    "63e0413f366aa2f74f9370d04014e48038006bb4cf1b2ff5435fc9cb724de3ce:/var/log:rprivate,rw,nodev,exec,nosuid,rbind",
                    "097a7e8bf2e6d0a80a575d14bd6bdfa58d16919ff83a9b403d6dc06915ae20bc:/conf/stack:rprivate,rw,nodev,exec,nosuid,rbind",
                    "/opt/quay/config/quay-config:/quay-registry/conf/stack:rw,rprivate,rbind",
                    "/opt/quay/data:/datastorage:rw,rprivate,rbind"
               ],
               "CgroupManager": "systemd",
               "CgroupMode": "private",
               "ContainerIDFile": "/run/quay-app.service-cid",
               "LogConfig": {
                    "Type": "journald",
                    "Config": null,
                    "Path": "",
                    "Tag": "",
                    "Size": "0B"
               },
               "NetworkMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
               "PortBindings": {},
               "RestartPolicy": {
                    "Name": "",
                    "MaximumRetryCount": 0
               },
               "AutoRemove": false,
               "VolumeDriver": "",
               "VolumesFrom": null,
               "CapAdd": [],
               "CapDrop": [],
               "Dns": [],
               "DnsOptions": [],
               "DnsSearch": [],
               "ExtraHosts": [],
               "GroupAdd": [],
               "IpcMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
               "Cgroup": "",
               "Cgroups": "default",
               "Links": null,
               "OomScoreAdj": 0,
               "PidMode": "private",
               "Privileged": false,
               "PublishAllPorts": false,
               "ReadonlyRootfs": false,
               "SecurityOpt": [],
               "Tmpfs": {},
               "UTSMode": "container:db266da38b9c0ffd99a27f0873934a79cbf7776dd8996aa0e4b839f98f0b25ec",
               "UsernsMode": "",
               "ShmSize": 65536000,
               "Runtime": "oci",
               "ConsoleSize": [
                    0,
                    0
               ],
               "Isolation": "",
               "CpuShares": 0,
               "Memory": 0,
               "NanoCpus": 0,
               "CgroupParent": "machine.slice/machine-libpod_pod_5e70ee01733b02f854d79d85dd78dc5c8ecdb2c50de7472a314441897f9296dc.slice",
               "BlkioWeight": 0,
               "BlkioWeightDevice": null,
               "BlkioDeviceReadBps": null,
               "BlkioDeviceWriteBps": null,
               "BlkioDeviceReadIOps": null,
               "BlkioDeviceWriteIOps": null,
               "CpuPeriod": 0,
               "CpuQuota": 0,
               "CpuRealtimePeriod": 0,
               "CpuRealtimeRuntime": 0,
               "CpusetCpus": "",
               "CpusetMems": "",
               "Devices": [],
               "DiskQuota": 0,
               "KernelMemory": 0,
               "MemoryReservation": 0,
               "MemorySwap": 0,
               "MemorySwappiness": 0,
               "OomKillDisable": false,
               "PidsLimit": 2048,
               "Ulimits": [
                    {
                         "Name": "RLIMIT_NPROC",
                         "Soft": 4194304,
                         "Hard": 4194304
                    }
               ],
               "CpuCount": 0,
               "CpuPercent": 0,
               "IOMaximumIOps": 0,
               "IOMaximumBandwidth": 0,
               "CgroupConf": null
          }
     }
]
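
Two fields in the output above are worth a second look: State.Health.Status is an empty string and HostConfig.RestartPolicy.Name is also empty. That suggests (an assumption, not confirmed in this thread) that no health check is configured for the container, so "running" only means the main process is alive and nothing on the podman side would restart it when HTTPS stops answering. A minimal sketch that pulls those fields out of podman inspect JSON follows; summarize_inspect is an illustrative name and SAMPLE is a trimmed copy of the output above.

```python
import json


def summarize_inspect(inspect_json):
    """Extract a few supervision-related fields from `podman inspect` JSON text."""
    container = json.loads(inspect_json)[0]
    state = container["State"]
    health = state.get("Health") or {}
    restart = container["HostConfig"].get("RestartPolicy") or {}
    return {
        "name": container["Name"],
        "status": state["Status"],
        "health_status": health.get("Status", ""),
        "restart_policy": restart.get("Name", ""),
        "restart_count": container.get("RestartCount", 0),
        "oom_killed": state.get("OOMKilled", False),
    }


# Trimmed copy of the inspect output in this issue: only the fields read above.
SAMPLE = """
[{"Name": "quay-app",
  "RestartCount": 0,
  "State": {"Status": "running", "OOMKilled": false,
            "Health": {"Status": "", "FailingStreak": 0, "Log": null}},
  "HostConfig": {"RestartPolicy": {"Name": "", "MaximumRetryCount": 0}}}]
"""

summary = summarize_inspect(SAMPLE)
print(summary)
```

On a live host the same function could be fed the text printed by podman inspect quay-app instead of SAMPLE.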

BadgerOps (Contributor) commented

Hey team, we just ran into this exact same issue, with the same symptoms. I thought perhaps we just had a one-off, but then noticed this issue, so I thought I'd add a comment. I'll get some troubleshooting logs posted here. I can connect via netcat to port 8443 and have ruled out SELinux, fapolicyd, etc. as potential contributors.

It just... stops responding to HTTP traffic.

BadgerOps (Contributor) commented

I should have captured the output, but failed to. I did notice that a curl results in something similar to the following:

 curl -vvv https://<quay-server>:8443 | head
* Rebuilt URL to: https://<quay-server>:8443/

* TCP_NODELAY set
* Connected to <quay-server> port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
} [5 bytes data]
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
< hangs right here where we should get a Server hello>

We never get the Server Hello back, nor anything beyond that. As noted above, the port is open and responds via nc, and the logs keep rolling by for journalctl -fu quay-app.service or podman logs -f <pod_id>.
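
The symptom described here (TCP connect succeeds, TLS handshake never completes) can be told apart from a closed port or a TLS error with a small probe. The sketch below is not part of the installer or of Quay; probe_tls and its return labels are made up for illustration, and certificate verification is switched off because the goal is only to see whether a handshake finishes within the timeout.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Return 'tcp_failed', 'tls_timeout', 'tls_error', or 'ok'."""
    try:
        # The timeout set here also applies to the TLS handshake below.
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"

    # Verification is disabled on purpose: this is a liveness probe only
    # and says nothing about whether the endpoint can be trusted.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    try:
        with raw:
            # wrap_socket performs the handshake before returning.
            with context.wrap_socket(raw, server_hostname=host):
                return "ok"
    except socket.timeout:
        # Client Hello was sent but no Server Hello arrived in time.
        return "tls_timeout"
    except (ssl.SSLError, OSError):
        return "tls_error"
```

Calling probe_tls("localhost", 8443) on an affected host would be expected to return "tls_timeout" if it matches the curl trace above, which could be logged over time to find out when the hang starts.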
