stuck in futex #11749
2.0.1 had #11463 but that was fixed (according to all the reporters...) in 2.0.2 and newer, so if you're sure you're still reproducing on 2.0.3, that means it's not that...probably. You could follow what was done in #11463 (comment) and see what it says (assuming it says anything). An alternate test you could run to exclude an issue like #11463 would be to try running a newer kernel (e.g. the kernel-ml package from EPEL would work), since that issue was specific to older versions of the kernel.
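The linked comment in #11463 isn't reproduced here, but the kernel stack dumps that appear later in this thread are the kind of output the magic-sysrq interface produces. A minimal sketch of the usual approach, assuming a sysrq-enabled kernel and root access:

echo 1 | sudo tee /proc/sys/kernel/sysrq     # enable all sysrq functions
echo w | sudo tee /proc/sysrq-trigger        # dump stacks of tasks in uninterruptible (D) sleep
echo t | sudo tee /proc/sysrq-trigger        # dump stacks of all tasks (very verbose)
sudo dmesg | tail -n 300                     # the dumps land in the kernel log

Note that 'w' only covers uninterruptible sleeps; a process parked in futex_wait is in interruptible sleep, so 't' is the one that will show it.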
On Mon, Mar 15, 2021 at 08:04:21PM -0700, Rich Ercolani wrote:
> 2.0.1 had #11463 but that was fixed (according to all the reporters...) in 2.0.2 and newer, so if you're sure you're still reproducing on 2.0.3, that means it's not that...probably.
It's occurring right now for a customer running zfs 2.0.3 under linux RHEL
3.10.0-1160.15.2.el7.x86_64.
Note, #11463 talks about 100% sys CPU, and that's not the case here.
In the cases I've seen, postgres autovacuum/autoanalyze is stuck in futex, and
fails to stop, but I *am* able to kill -9 it.
The stuck process is like this:
***@***.*** ~]$ sudo strace -p 17087
strace: Process 17087 attached
futex(0x7fdbf9e42760, FUTEX_WAIT_PRIVATE, 2, NULL
> You could follow what was done in #11463 (comment) and see what it says (assuming it says anything).
I got just:
[474919.233221] Loglevel set to 1
I tried some more sysrqs and got this (it looks like my strace is visible here
as ptrace, even though I've cancelled the strace??).
[475154.098057] postmaster S ffff8ae7bfc5acc0 0 17087 24559 0x00000080
[475154.098059] Call Trace:
[475154.098062] [<ffffffff98787dd9>] schedule+0x29/0x70
[475154.098064] [<ffffffff98112436>] futex_wait_queue_me+0xc6/0x130
[475154.098066] [<ffffffff981131db>] futex_wait+0x17b/0x280
[475154.098068] [<ffffffff980b049b>] ? recalc_sigpending+0x1b/0x70
[475154.098071] [<ffffffff980b05e6>] ? dequeue_signal+0x86/0x180
[475154.098073] [<ffffffff98114f46>] do_futex+0x106/0x5a0
[475154.098075] [<ffffffff980b303b>] ? ptrace_notify+0x5b/0x90
[475154.098078] [<ffffffff9803bdf5>] ? tracehook_report_syscall_exit+0x45/0xe0
[475154.098080] [<ffffffff98115460>] SyS_futex+0x80/0x190
[475154.098083] [<ffffffff9813e8e6>] ? __audit_syscall_exit+0x1f6/0x2b0
[475154.098086] [<ffffffff98794f92>] system_call_fastpath+0x25/0x2a
[475154.098087] postmaster S ffff8ae42a8c8640 0 21830 24559 0x00000080
It's probably not useful to you, but I was surprised that I was able to get a
user backtrace. This involves: autovacuum, extended statistics, postgis
extension, and apparently a signal.
(gdb) bt
#0 0x00007fdbf9b875cc in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007fdbf9b03b12 in _L_lock_16654 () from /lib64/libc.so.6
#2 0x00007fdbf9b00753 in malloc () from /lib64/libc.so.6
#3 0x00007fdbfc21e018 in tls_get_addr_tail () from /lib64/ld-linux-x86-64.so.2
#4 0x00007fdbfc2239e8 in __tls_get_addr () from /lib64/ld-linux-x86-64.so.2
#5 0x00007fdbecf503e0 in lwgeom_wagyu_interruptRequest () from /usr/pgsql-13/lib/postgis-3.so
#6 0x00007fdbecf0d7f1 in handleInterrupt () from /usr/pgsql-13/lib/postgis-3.so
#7 <signal handler called>
#8 0x00007fdbf9b701ec in brk () from /lib64/libc.so.6
#9 0x00007fdbf9b70286 in sbrk () from /lib64/libc.so.6
#10 0x00007fdbf9b03e49 in __default_morecore () from /lib64/libc.so.6
#11 0x00007fdbf9afa49a in systrim.isra.2 () from /lib64/libc.so.6
#12 0x00007fdbf9afc15a in _int_free () from /lib64/libc.so.6
#13 0x00000000008cbc90 in AllocSetReset (context=0x2c70ec0) at aset.c:607
#14 0x00000000008d1aa1 in MemoryContextResetOnly ***@***.***=0x2c70ec0) at mcxt.c:174
#15 0x00000000008d1b1b in MemoryContextResetOnly ***@***.***=0x2c70ec0) at mcxt.c:160
#16 0x00000000008cbc20 in AllocSetDelete (context=0x2c70ec0) at aset.c:653
#17 0x00000000007588ce in BuildRelationExtStatistics ***@***.***=0x31f32b8, totalrows=83498, ***@***.***=30000, ***@***.***=0x3294448, ***@***.***=933, ***@***.***=0x328bce8)
at extended_stats.c:197
#18 0x00000000005aa91d in do_analyze_rel ***@***.***=0x31f32b8, ***@***.***=0x0, acquirefunc=<optimized out>, relpages=39622, ***@***.***=false, ***@***.***=false,
***@***.***=13, params=0x3012c7c, params=0x3012c7c) at analyze.c:606
#19 0x00000000005aacd1 in analyze_rel (relid=<optimized out>, relation=<optimized out>, ***@***.***=0x3012c7c, va_cols=0x0, in_outer_xact=<optimized out>, bstrategy=<optimized out>) at analyze.c:263
#20 0x000000000061dde3 in vacuum (relations=0x3292568, ***@***.***=0x3012c7c, bstrategy=<optimized out>, ***@***.***=0x3014ec8, ***@***.***=true) at vacuum.c:466
#21 0x00000000007093a0 in autovacuum_do_vac_analyze (bstrategy=0x3014ec8, tab=0x3012c78) at autovacuum.c:3137
#22 do_autovacuum () at autovacuum.c:2467
#23 0x00000000007097d6 in AutoVacWorkerMain (argv=0x0, argc=0) at autovacuum.c:1694
#24 0x0000000000709929 in StartAutoVacWorker () at autovacuum.c:1488
#25 0x0000000000718118 in StartAutovacuumWorker () at postmaster.c:5593
#26 sigusr1_handler (postgres_signal_arg=<optimized out>) at postmaster.c:5300
#27 <signal handler called>
#28 0x00007fdbf9b70983 in __select_nocancel () from /lib64/libc.so.6
#29 0x0000000000487370 in ServerLoop () at postmaster.c:1703
#30 0x0000000000719108 in PostmasterMain ***@***.***=3, ***@***.***=0x2b45280) at postmaster.c:1412
#31 0x0000000000488bbd in main (argc=3, argv=0x2b45280) at main.c:210
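A backtrace like the one above can also be captured non-interactively from a stuck process. A hedged sketch, assuming gdb and the matching debuginfo packages are installed; the PID is the one from this report:

sudo gdb -p 17087 -batch -ex 'bt'                    # backtrace of the attached (main) thread
sudo gdb -p 17087 -batch -ex 'thread apply all bt'   # backtraces of every thread, if there are any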
--
Justin
Okay, so that's one idea eliminated, now for...everything else.

You might find strace -f more informative, since one thread waiting on a futex() call basically means "I'm waiting for another thread to finish", and I don't believe strace will show you the other threads of a process without -f. So it's probably another thread, not the "main" thread of 17087, that's stuck somewhere interesting. (Which one? Dunno. Possibly whichever one is also waiting on some call. Possibly one that's stuck in an endless loop and not making any progress. Story unclear.)

In particular, I'd really like to find a thread that's "stuck" on something in OpenZFS, otherwise it's something of a hard sell that OpenZFS is to blame for the problem. (Other evidence like "we're running this setup without OpenZFS on a number of systems and this never happens" would also be reasonably compelling.)

You suggested in your comment that at least one of the systems reproducing this right now is a production system - do you have any non-production systems that you can reproduce this on, that we might be able to try more exciting things with down the line?

Were these systems running fine for a long time and then this problem suddenly cropped up, or is this the initial setup period of the systems and you have no long-running history to judge whether this would have happened on earlier software versions?

Do other reads/writes to the relevant zpool still work normally while this is going on?
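A hedged sketch of what that would look like against the PID from this report; /proc/<pid>/task is an easy way to see whether there are any other threads at all:

sudo strace -f -tt -p 17087    # -f follows all threads and children, -tt adds microsecond timestamps
ls /proc/17087/task/           # one entry per thread; a single entry means the process is single-threaded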
On Mon, Mar 15, 2021 at 10:06:38PM -0700, Rich Ercolani wrote:
> It's occurring right now for a customer running zfs 2.0.3 under linux RHEL 3.10.0-1160.15.2.el7.x86_64. Note, #11463 talks about 100% sys CPU, and that's not the case here. In the cases I've seen, postgres autovacuum/autoanalyze is stuck in futex, and fails to stop, but I *am* able to kill -9 it. The stuck process is like this: ***@***.*** ~]$ sudo strace -p 17087 strace: Process 17087 attached futex(0x7fdbf9e42760, FUTEX_WAIT_PRIVATE, 2, NULL
> Okay, so that's one idea eliminated, now for...everything else.
> You might find strace -f more informative, since one thread waiting on a futex() call basically means "I'm waiting for another thread to finish", and I don't believe strace will show you the other threads of a process without -f. So it's probably another thread, not the "main" thread of 17087, that's stuck somewhere interesting. (Which one? Dunno. Possibly whichever one is also waiting on some call. Possibly one that's stuck in an endless loop and not making any progress. Story unclear.)
postgres doesn't use threads.
$ ps -Lf 17087
UID PID PPID LWP C NLWP STIME TTY STAT TIME CMD
postgres 17087 24559 17087 0 1 Mar10 ? Ss 1:03 postgres: autovacuum worker ts
> In particular, I'd really like to find a thread that's "stuck" on something in OpenZFS, otherwise it's something of a hard sell that OpenZFS is to blame for the problem. (Other evidence like "we're running this setup without OpenZFS on a number of systems and this never happens" would also be reasonably compelling.)
> You suggested in your comment that at least one of the systems reproducing this right now is a production system - do you have any non-production systems that you can reproduce this on, that we might be able to try more exciting things with down the line?
Yes, I've seen this several times on production systems since upgrading to
openzfs 2.0, beginning in January.
I don't have a reproduction recipe, otherwise I could use a test VM.
Depending on how exciting, I can try some things on the production system, in
particular if it's already stuck (like now).
> Were these systems running fine for a long time and then this problem suddenly cropped up, or is this the initial setup period of the systems and you have no long-running history to judge whether this would have happened on earlier software versions?
> Do other reads/writes to the relevant zpool still work normally while this is going on?
Yes, everything else worked fine, except in cases where the postgres process had
previously locked a table, in which case anything else that tries to use it
will be stuck waiting.
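If another session does end up waiting on a lock held by the stuck worker, something like the following should identify it. A hedged sketch using pg_blocking_pids(), available in PostgreSQL 9.6 and later; the database name ts is taken from the ps output earlier in the thread:

psql -X -d ts -c "SELECT pid, pg_blocking_pids(pid) AS blocked_by, wait_event_type, wait_event, state FROM pg_stat_activity WHERE cardinality(pg_blocking_pids(pid)) > 0;"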
--
Justin
Okay, great, then it's waiting on another process. Time to go look at the different Postgres processes and see whether any of them are stuck in an exciting way.
Yeah, but one of my questions was going to be "does this happen with, say, MD+XFS", which is understandably not something you can try on a production system (though it sounds like it didn't happen pre-OpenZFS 2.0, so it's probably reasonable to assume for now it wouldn't).

If you need these systems working Now(tm), it might be reasonable to downgrade them to 0.8.6 and try to reproduce this on a testbed even if you don't have a perfect reproduction recipe premade. (I believe it's still the case that the main "zfs" CentOS/RHEL repo ships 0.8.X RPMs and you need "zfs-testing" to get 2.0.X, so this should be pretty straightforward as long as you didn't already run "zpool upgrade".)
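A quick way to survey every postgres process for a suspicious kernel wait, as a hedged sketch; wchan is the kernel function a sleeping process is parked in, so anything sitting in a zfs/spl symbol would stand out:

# list pid, state, start time, kernel wait channel and command line for all postgres processes
ps -o pid,stat,lstart,wchan:32,cmd -p "$(pgrep -d, -f 'postgres: |postmaster')"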
On Mon, Mar 15, 2021 at 10:37:18PM -0700, Rich Ercolani wrote:
> postgres doesn't use threads.
> Okay, great, then it's waiting on another process. Time to go look at the different Postgres processes and see whether any of them are stuck in an exciting way.
I don't think so. The stuck autoanalyze is the earliest-started process
that's still running, except for postgres' own processes - if they were
stuck, I think the whole system would be unusable.
postgres=# SELECT datname, backend_start, wait_event, state, backend_type, query FROM pg_stat_activity ORDER BY backend_start LIMIT 11;
 datname |         backend_start         |     wait_event      | state  |         backend_type         | query
---------+-------------------------------+---------------------+--------+------------------------------+------------------------------------------------------
         | 2021-03-10 10:01:55.623695-06 | CheckpointerMain    |        | checkpointer                 |
         | 2021-03-10 10:01:55.623971-06 | BgWriterHibernate   |        | background writer            |
         | 2021-03-10 10:01:55.624275-06 | WalWriterMain       |        | walwriter                    |
         | 2021-03-10 10:01:55.624573-06 | AutoVacuumMain      |        | autovacuum launcher          |
         | 2021-03-10 10:01:55.625218-06 | LogicalLauncherMain |        | logical replication launcher |
 ts      | 2021-03-10 14:23:20.093884-06 |                     | active | autovacuum worker            | autovacuum: ANALYZE child.eric_enodeb_cell_20210310
I'm not 100% sure it's an issue with zfs.

I found a bunch of similar bug reports regarding issues in malloc/free and
signal handlers, which would fit the backtrace above: free() was interrupted by
a signal handler whose path re-enters malloc, and malloc is not async-signal-safe.
I'm planning to upgrade the libc6 (glibc) and postgis packages.

Thanks for your suggestions so far.
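To confirm which glibc and postgis builds a given server is actually running, a hedged sketch for CentOS 7; the psql call assumes the postgis extension is installed in the ts database:

rpm -q glibc                    # glibc build currently installed
rpm -qa 'postgis*'              # installed postgis packages, if any were installed via RPM
psql -X -d ts -c "SELECT version(), postgis_full_version();"    # versions as seen by the running server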
--
Justin
Did you look, or just conclude that it's not possible?

I'm not particularly convinced it's a ZFS issue either, since nothing has pointed to ZFS.

Could you share some of those bug report links, just for reference by anyone reading this thread in the future?
On Tue, Mar 16, 2021 at 04:00:20PM -0700, Rich Ercolani wrote:
> On Mon, Mar 15, 2021 at 10:37:18PM -0700, Rich Ercolani wrote: > postgres doesn't use threads. Okay, great, then it's waiting on another process. Time to go look at the different Postgres processes and see whether any of them are stuck in an exciting way.
> I don't think so. The stuck autoanalyze is the earliest-started process started that's still running, except for postgres' own processes - if they were stuck, I think the whole system would be unusable.
> Did you look, or just conclude that it's not possible?
I didn't understand what you mean here?
In response to your question, I had straced those postgres processes and saw
none of them were stuck.
I tried to shut down postgres, and 5 processes were left running - the parent
"postmaster", 3 helper processes, and the "stuck" process 17087.
> I'm not particularly convinced it's a ZFS issue either, since nothing has pointed to ZFS.
> Could you share some of those bug report links, just for reference by anyone reading this thread in the future?
Searching for "__lll_lock_wait_private" I found various bug reports and patches
for glibc. Most of our servers are running centos7, and frequently not the
latest updates, so these are plausibly related.
Other servers hit this issue, and there's not much that's changed other than
upgrading to zfs-2.0. I upgraded most people to postgres 13 late last year, so
the only other change is upgrading to postgres 13.2, which is essentially
limited to bugfixes. I don't see any commits related to signal handling. But
it's possible that code changes caused the code to be compiled differently and
exposed some other issue.
~~So far, no server hit this issue after upgrading glibc.~~ I'm not planning on
digging into it further than that, on the hypothesis that it's a bug fixed
years ago for which we hadn't applied the update.
Edit: servers *did* hit this after upgrading glibc, but it did not recur after upgrading postgis, which in some cases had been running a 3.1alpha pre-release.
You could close this issue and I'll come back if there's some more interesting
development. Sorry for the noise and pointing the finger here.
--
Justin
I haven't heard back about how to debug the previous issue (11641), so here goes anyway.
Postgres autovacuum worker process stuck in futex. I have seen this
~~twice~~ 3 times before on another server. The process does not die with SIGINT, and I imagine if I SIGKILL it, it won't die and I'll need to reboot the server.
[pryzbyj@ts-db-new ~]$ ps -O lstart,wchan=wwwwwwwwwwwwwwwwwwww 12583
PID STARTED wwwwwwwwwwwwwwwwwwww S TTY TIME COMMAND
12583 Wed Mar 3 03:41:23 2021 futex_wait_queue_me S ? 00:00:59 postgres: autovacuum worker ....
| Type | Version/Name |
| --- | --- |
| Distribution Name | CentOS |
| Distribution Version | 7.8 |
| Linux Kernel | kernel-3.10.0-1127.18.2.el7 and 3.10.0-1160.15.2.el7 |
| Architecture | x86_64 |
| ZFS Version | 2.0.1-1 and 2.0.3-1 |
| SPL Version | 2.0.1-1 |