Multiple games miss wakeups from PulseEvent (Crysis (1) is very slow to play intro videos, Sonic Generations and Pro Evolution Soccer have audio distortion) #10

Tk-Glitch · 2018-08-14T23:41:57Z

With esync enabled, Crysis seemingly hangs on a black screen. Sometimes, after a while, the first frame of the opening video will show. 30-60 seconds later the second frame may or may not appear, ultimately ending up in what looks like a freeze.

Here's a log: https://drive.google.com/file/d/1vPyCKEb-FwH9otxYMFJWguwj64z9RPPq/view?usp=sharing

There's no reason these should be global, and in particular, this means that esync_pulse_event() might end up writing 0, which raises the likelihood of a missed wakeup from "probable" to "certain". Fixes #10.

zfigura · 2018-08-15T01:01:46Z

Fixed by a7faa85.

pingubot · 2018-08-15T07:31:17Z

Sadly not fixed for me, using the 32-bit Crysis executable.

zfigura · 2018-08-15T21:26:37Z

This isn't fixed (and I'm not sure why we thought it was). I can reproduce it even after a7faa85.

The game seems to use PulseEvent() as a makeshift timer or semaphore; not quite sure. In esync this is implemented as a write() + read(). I think what's happening here is that both of these go through to the kernel before the polling thread has time to return success (i.e. it wakes up the polling thread, but the count is 0 by the time we get to eventfd_poll()). But this is a guess, since I don't understand very well how the kernel-side code actually works, and unfortunately I can't reproduce the bug with strace to rule out that possibility.

In any case, if that's what's going wrong, this is a WONTFIX, and the correct solution is "just disable esync". It would kind of surprise me if the game benefits greatly from it anyway.

Tk-Glitch · 2018-08-15T21:48:19Z

I'm taking back my statement. It's indeed not fixed. I'm not sure what happened during testing yesterday as I can reproduce the issue with supposedly the exact same wine version today (even though it's been rebuilt in the meantime).

zfigura · 2018-08-15T22:13:29Z

It does indeed look like this is a kernel insufficiency: https://gist.github.com/zfigura/6f55688732728d7e1e452188014ec523

On my system the thread never wakes up from poll(), unless I add some sort of delay between the write() and read() calls.

Tk-Glitch · 2018-08-16T15:25:38Z

Ok that's interesting.. That also explains my results (was running an experimental kernel at the time). I'll try to use the exact same configuration next time :D

May help with #10, although the real fix there is just not to use esync.

zfigura · 2018-09-05T15:27:31Z

For what it's worth, I think all of these games are going through timeSetEvent(). Not sure that this helps us, though.

Hello71 · 2018-12-12T01:50:49Z

I think the only way to correctly implement PulseEvent is to have one eventfd per client, and making it the client's responsibility to read the data. However, if that is too hard, I think you can bodge it by making it the client's responsibility to read the data, even if there are multiple clients. If you make the eventfd non-blocking and ignore EAGAIN, I reckon it'll work much better than the existing system. Even better, it'll work perfectly if there happens to be only one waiter.

zfigura · 2018-12-12T16:15:50Z

> I think the only way to correctly implement PulseEvent is to have one eventfd per client, and making it the client's responsibility to read the data.

I'm not sure what's meant by this. One eventfd per process? Per handle? That can't work, we need events to be able to be woken up from different processes and handles.

> However, if that is too hard, I think you can bodge it by making it the client's responsibility to read the data, even if there are multiple clients. If you make the eventfd non-blocking and ignore EAGAIN, I reckon it'll work much better than the existing system. Even better, it'll work perfectly if there happens to be only one waiter.

Waking up multiple waiters on an auto-reset event seems extremely risky. Programs could easily be using them as a stupid semaphore or mutex.

Hello71 · 2018-12-12T18:30:27Z

> > I think the only way to correctly implement PulseEvent is to have one eventfd per client, and making it the client's responsibility to read the data.

> I'm not sure what's meant by this. One eventfd per process? Per handle? That can't work, we need events to be able to be woken up from different processes and handles.

One eventfd per waiter process per handle.

> > However, if that is too hard, I think you can bodge it by making it the client's responsibility to read the data, even if there are multiple clients. If you make the eventfd non-blocking and ignore EAGAIN, I reckon it'll work much better than the existing system. Even better, it'll work perfectly if there happens to be only one waiter.

> Waking up multiple waiters on an auto-reset event seems extremely risky. Programs could easily be using them as a stupid semaphore or mutex.

Oh, you didn't specify auto-reset. Then just set EFD_SEMAPHORE and use eventfd in the regular way? i.e. write the data, then instead of immediately reading it, let the client consume it?

zfigura · 2018-12-12T20:16:43Z

> > > I think the only way to correctly implement PulseEvent is to have one eventfd per client, and making it the client's responsibility to read the data.

> > I'm not sure what's meant by this. One eventfd per process? Per handle? That can't work, we need events to be able to be woken up from different processes and handles.

> One eventfd per waiter process per handle.

How would that work, then? How would you wake up handle A by signaling handle B to the same object, if they are using different eventfds?

> > > However, if that is too hard, I think you can bodge it by making it the client's responsibility to read the data, even if there are multiple clients. If you make the eventfd non-blocking and ignore EAGAIN, I reckon it'll work much better than the existing system. Even better, it'll work perfectly if there happens to be only one waiter.

> > Waking up multiple waiters on an auto-reset event seems extremely risky. Programs could easily be using them as a stupid semaphore or mutex.

> Oh, you didn't specify auto-reset. Then just set EFD_SEMAPHORE and use eventfd in the regular way? i.e. write the data, then instead of immediately reading it, let the client consume it?

To have PulseEvent() behave like SetEvent()? That seems too risky.

Hello71 · 2018-12-12T22:05:50Z

> How would that work, then? How would you wake up handle A by signaling handle B to the same object, if they are using different eventfds?

Yeah, that doesn't actually fix PulseEvent at all, my mistake.

> To have PulseEvent() behave like SetEvent()? That seems too risky.

Ehm... I thought too hard about this issue and mixed up PulseEvent with SetEvent. I think it is possible to implement if there is only WaitForSingleObject. First, define an atomic counter on each auto-reset esync primitive, call it W.

PulseEvent:

- if E is an auto-reset event and E.W > 0, then write(E, 1).
- if E is a manual-reset event, then write(E, 1).

WaitForSingleObject:

1. if E is a manual-reset event, do as before. otherwise:
2. increment E.W.
3. poll on E.
4. decrement E.W.
5. non-blocking read(E). if successful, return. else, go to 2.

It is possible to lose PulseEvent if there are multiple waiters and PulseEvent is used in quick succession. I think it is probably possible to fix this by checking the value of W, or adding another counter.

I tried to think about it but couldn't find any way to make mixed WaitForMultipleObjects work; I think you need two atomic counters, or possibly to lock the event with a mutex (maybe only if PulseEvent is in use)? Is it common to use WaitForMultipleObjects with PulseEvent?

zfigura · 2018-12-12T23:48:37Z

Your idea is essentially to signal the event if and only if there are waiters, and not to unsignal it in that case. This could work, but would be difficult to synchronize, and it would break with wait-all. I'll have to give it some thought.

Hello71 · 2018-12-13T00:52:54Z

I think if it's possible to get it to work with wait-multiple, the same tactic should apply with wait-all. Personally, I don't really care about this issue, but if you say it's blocking upstreaming then I'm interested in helping.

zfigura · 2018-12-13T02:25:40Z

PulseEvent() is tricky, but the biggest problem with upstreaming is, I think, wait-all. That's not something that's exposed by Linux polling APIs, and I think it would probably need either kernel support or excessive locking on the Wine side. The other problem is use of shared memory (reading object state, reading object state atomically with releasing a semaphore).

Hello71 · 2018-12-13T19:27:36Z

I don't see the problem with the existing wait-all algorithm, except that it potentially breaks ordering in some specific cases. Shared memory is also highly portable, even Windows has it. I don't understand what you mean by "volatile state of semaphores and mutexes", but if you need a shared memory across all processes on the same WINEPREFIX, just open a shared file in that prefix?

zfigura · 2018-12-13T20:27:34Z

> I don't see the problem with the existing wait-all algorithm, except that it potentially breaks ordering in some specific cases.

I'm not sure about ordering, but I guess the main problem is that we can grab objects incorrectly, and then they're unsignaled for a period of time until we release them.

> Shared memory is also highly portable, even Windows has it. I don't understand what you mean by "volatile state of semaphores and mutexes", but if you need a shared memory across all processes on the same WINEPREFIX, just open a shared file in that prefix?

As in the owner thread, recursion count, etc. The problem is not portability but rather safety; we can't let one process corrupt the state of another process's objects, even by accident.

I think the proper way forward is to get support for these things in the kernel. It just needs a lot of work.

Hello71 · 2018-12-14T14:26:47Z

> > I don't see the problem with the existing wait-all algorithm, except that it potentially breaks ordering in some specific cases.

> I'm not sure about ordering, but I guess the main problem is that we can grab objects incorrectly, and then they're unsignaled for a period of time until we release them.

I thought about this and I think you want a RW lock, where W is wait-all and R is clearing a single event (whether by WaitForSingleObject or single WaitForMultipleObjects). So it would look like:

wait-any:

1. poll
2. take rdlock
3. do the needful
4. release rdlock

wait-all:

1. poll
2. take wrlock
3. poll again with timeout=0. if not all readable, release wrlock and go to 1.
4. do the needful
5. release wrlock

reset:

1. take rdlock
2. do the needful
3. release rdlock

I think this is probably the most efficient possible implementation. I'm pretty sure it's not possible to implement in a completely lock-free manner. I tried reading the ReactOS code, but I didn't really understand it.

aufkrawall · 2019-08-27T19:06:27Z

Just stumbled over this. I guess it can't hurt to report that this is still an issue with both esync and fsync: the Crysis 1 demo freezes after starting (wine-staging-tkg ff10ae6e74a8f090f89a217e0ff6da862b6b022b).

zfigura closed this as completed Aug 15, 2018

zfigura reopened this Aug 15, 2018

zfigura added the wontfix label Aug 15, 2018

zfigura changed the title ~~Crysis (1) hanging~~ Crysis (1) is very slow to play intro videos [missed wakeups from PulseEvent()] Aug 15, 2018

zfigura added a commit that referenced this issue Aug 19, 2018

ntdll: Yield during PulseEvent().

a8179f1

May help with #10, although the real fix there is just not to use esync.

zfigura mentioned this issue Aug 25, 2018

Sound issue in Sonic Generations #14

Closed

romulasry mentioned this issue Aug 25, 2018

Sonic Generations (71340) ValveSoftware/Proton#380

Open

zfigura mentioned this issue Sep 5, 2018

[Pro Evolution Soccer 2013] Menu SFX distorted/loops, but background music is fine #16

Closed

zfigura changed the title ~~Crysis (1) is very slow to play intro videos [missed wakeups from PulseEvent()]~~ Multiple games miss wakeups from PulseEvent (Crysis (1) is very slow to play intro videos, Sonic Generations and Pro Evolution Soccer have audio distortion) Sep 5, 2018

zfigura added a commit that referenced this issue Oct 7, 2018

ntdll: Yield during PulseEvent().

b8ddaf2

May help with #10, although the real fix there is just not to use esync.

zfigura added a commit that referenced this issue Nov 1, 2018

ntdll: Yield during PulseEvent().

7d21da1

May help with #10, although the real fix there is just not to use esync.

kakra pushed a commit to kakra/wine-proton that referenced this issue Mar 9, 2019

ntdll: Yield during PulseEvent().

2c270b2

May help with zfigura/wine#10, although the real fix there is just not to use esync.

kakra pushed a commit to kakra/wine-proton that referenced this issue Mar 17, 2019

ntdll: Yield during PulseEvent().

803f191

May help with zfigura/wine#10, although the real fix there is just not to use esync.

zfigura closed this as completed in 77e5758 Jul 29, 2019

zfigura added a commit that referenced this issue Jul 29, 2019

ntdll: Yield during PulseEvent().

f112ca2

May help with #10, although the real fix there is just not to use esync.
