3DFX Voodoo emulation improvements #3115
Comments
For "1. Optimized emulation for better speed (perhaps using GPU)." - do you mean something like https://github.com/kjliew/qemu-3dfx or http://dosbox-x.com/wiki/Guide%3ASetting-up-3dfx-Voodoo-in-DOSBox%E2%80%90X#_high_level_emulation or something else? |
Based on information given to me by @kcgen and the other developers, DOSBox Staging currently emulates a Voodoo 1 (4 MB) and a "Voodoo 1 on steroids" (12 MB). It doesn't try to reproduce the performance of a real Voodoo 1; its speed depends on the host CPU instead (and that's a good thing).
Yes, something like that or whatever would improve performance. |
Related comment from @PoloniumRain:
|
Carmageddon now works perfectly with Voodoo. Can be disregarded. |
OK, and what's the status/decision on |
There's nothing confusing about it @Torinde. There is no OpenGL passthrough, no Glide, nothing like that. The Voodoo is emulated "in software" in Staging, just like the GUS is, the MT-32, the OPL, and so on. So the entire Voodoo hardware is emulated accurately at low-level, which then produces the frames entirely by using the host CPU, without any host GPU involvement. Hope that makes it crystal clear. Also, forget about DOSBox-X, they do different things (e.g. support Glide). We do not support Glide wrappers, just authentic low-level emulation of the Voodoo hardware done 100% on the host CPU. That's it. |
OK, then this card should be moved to 'Rejected'? |
I've deleted it, we don't need it. |
A relatively easy way to improve performance is to let the Voodoo emulation run with multiple worker threads, like DOSBox Pure does. It's mostly useful for people running Win32 games in an unofficial capacity, but it basically increases speed in a linear fashion with each additional worker thread. The Voodoo 5 6000 prototype already did this in 2000: |
Here is the performance data for 4 to 23 threads. The effect depends on both the host hardware and the game used (~5-50%), and Pure decided to go with 7 threads.
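To make the worker-thread idea concrete, here is a minimal sketch of splitting one frame's scanlines across a configurable number of threads. It is an illustration only, not how DOSBox Pure or Staging actually partition the work: `render_frame` and `draw_scanline` are hypothetical names, and the "rasteriser" just writes the row index into each pixel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for rasterising one scanline; it only writes the
// row index into every pixel of that row.
static void draw_scanline(std::vector<uint32_t>& framebuffer, int width, int y)
{
    for (int x = 0; x < width; ++x) {
        framebuffer[static_cast<std::size_t>(y) * width + x] =
                static_cast<uint32_t>(y);
    }
}

// Renders `height` scanlines with `num_workers` threads. Each thread owns a
// contiguous band of rows, so no two threads ever write to the same pixel.
void render_frame(std::vector<uint32_t>& framebuffer, int width, int height,
                  int num_workers)
{
    assert(num_workers > 0);
    assert(framebuffer.size() >= static_cast<std::size_t>(width) * height);

    // Round up so that every row is covered even if the split is uneven.
    const int rows_per_worker = (height + num_workers - 1) / num_workers;

    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) {
        const int first = i * rows_per_worker;
        const int last  = std::min(height, first + rows_per_worker);
        if (first >= last) {
            break; // more workers than rows
        }
        workers.emplace_back([&framebuffer, width, first, last] {
            for (int y = first; y < last; ++y) {
                draw_scanline(framebuffer, width, y);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
}
```

Spawning threads per frame keeps the sketch short; a real emulator would more likely keep a persistent pool of workers and hand them work each frame.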
The following is what confused me: #3040 (comment)
|
We're not going to do any of that Glide wrapper stuff since it's Windows-only and buggy. DOSBox Pure's multiple-worker-thread model is the best way to squeeze more performance out in a cross-platform-friendly way. |
That's good data with actual FPS boosts across a bunch of games. Thanks for passing that on, @Torinde. We do have the Are we beholden to keep using 3 threads? (Maybe for the pi?) We could make I think SDL gives us some cross-platform CPU APIs to get this value. Edit: https://wiki.libsdl.org/SDL2/SDL_GetCPUCount gives us logical cores, and we also have |
Yeah, we hardcoded it to 3 for the Pi, and we thought there was no benefit to using more threads (but we did not measure it, AFAIK).
That's the perfect solution combined with Great find @Torinde, btw 🎖️ We should re-test the Voodoo emulation after the threading change as we've cleaned up the threading code significantly compared to DOSBox Pure, but we should in theory get similar performance improvements. So yeah, we need to conduct our own measurements to be sure. |
How about using this algorithm as a default (auto setting), but allowing the user to specify the thread count manually (preferably changeable at runtime)? Rationale:
Let’s face it: we have no resources to determine the best possible thread count selection algorithm, IMHO we should provide a sane default (like the one proposed above), but allow the power users to determine the value which works the best for them. |
@FeralChild64 IMO, you're overthinking it; it seems the overall best performance is at a thread count of 7. Surely performance may vary per level, or even per room in a game, or based on the number of enemies on the screen, etc. As long as we're in the best performance bracket within +/- 10-15%, that's good enough (I doubt many people care about playing a game at 50 vs 55 FPS...) So dunno, I don't think CPU or OS variations matter much; overall parallelism of 7 seems best. Or 6. Or 8. Something like that, it's not an exact science 😏 I'm happy to nuke the setting, but I also won't spend time arguing if you want to keep it, setting it to |
It was just my opinion; I won’t be implementing the change, and I definitely won’t quarrel about it :) |
I agree @FeralChild64; the concept of a cpu core is starting to lose all meaning ... Intel Atom 15305 (Formerly known as RicketyTrail)
I'm going to try to get the number of physical cores first (and ignore the PhonY cores and threads), and then only fall back to C++'s concurrency numbers. |
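A rough sketch of that fallback order, under stated assumptions: the physical core count comes from some platform-specific probe that is not shown here (passed in as an optional), `pick_worker_count` and `auto_worker_count` are hypothetical names, and the cap of 7 is simply the figure discussed in this thread. `std::thread::hardware_concurrency()` is the standard C++ call, and it may return 0 when the value cannot be determined.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <thread>

// Picks a worker count: prefer the physical core count if the (hypothetical)
// platform probe produced one, otherwise use the logical CPU count, and
// clamp the result to the range [1, max_workers].
unsigned pick_worker_count(std::optional<unsigned> physical_cores,
                           unsigned logical_cpus, unsigned max_workers)
{
    assert(max_workers >= 1);

    unsigned cores = 0;
    if (physical_cores && *physical_cores > 0) {
        cores = *physical_cores;
    } else {
        cores = logical_cpus; // may still be 0 if nothing could be detected
    }
    return std::clamp(cores, 1u, max_workers);
}

// Convenience wrapper that falls back to the C++ standard library's number.
unsigned auto_worker_count(std::optional<unsigned> physical_cores)
{
    // Assumption: 7 is taken from the measurements mentioned in this thread.
    constexpr unsigned max_workers = 7;
    return pick_worker_count(physical_cores,
                             std::thread::hardware_concurrency(),
                             max_workers);
}
```

Keeping the detection result as a parameter means the selection logic can be tested without depending on the machine it runs on.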
Can somebody please share what "the removed OpenGL backend in the Voodoo emulation patch" is (since apparently it's not a pass-through to Glide wrappers) - was that a redundant render output path (Staging already uses OpenGL by default anyway)? I think @interloper98's idea for auto/default is good. I also agree with @FeralChild64 that changing the current user setting into a thread count is good - there are too many combinations (with more coming in the future), and while from the tests so far 7 seems good in the majority of cases (and doesn't hurt even weak CPUs with fewer cores, etc. - see further comments at the link, it's not only about Threadripper), there are game/hardware combos where more than 7 helps. (And leave the custom maximum outrageously big - desktop chips have up to 32 threads, while exotic workstation/server models will go above 256... yes, half of those are "phony-hyper-tiny"... so I suggest a limit of 256 - nice and round, well above what the majority of users would be able to use mid-term.) And what about an even more involved algorithm? TLDR - adjust threads dynamically based on performance measurement
Feel free to shoot this down - too complex, measuring FPS is cumbersome, too much work needed, etc. |
Yeah I like the idea of auto OR number of threads on the same value field. Auto can use SDL to figure out a good number and still leave us an option to dial it in manually so I don't have battery anxiety when I'm returning from the city on a train. Maybe I should upgrade to ARM64 finally... 😁 |
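As a sketch of what "auto OR a number in the same value field" could look like, the function below parses such a value. Everything here is an assumption for illustration: `parse_thread_setting` is a hypothetical name, the auto value is supplied by the caller, and the upper limit is a parameter (256 was suggested earlier in the thread). Only `std::from_chars` is a real standard library call.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses a setting that is either the word "auto" or a positive integer.
// Returns the thread count to use, or std::nullopt if the value is invalid
// (not a number, trailing garbage, zero, or above `max_threads`).
std::optional<unsigned> parse_thread_setting(std::string_view value,
                                             unsigned auto_count,
                                             unsigned max_threads)
{
    assert(auto_count >= 1 && auto_count <= max_threads);

    if (value == "auto") {
        return auto_count;
    }

    unsigned parsed   = 0;
    const char* first = value.data();
    const char* last  = first + value.size();

    const auto [ptr, ec] = std::from_chars(first, last, parsed);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt; // empty, non-numeric, or trailing characters
    }
    if (parsed < 1 || parsed > max_threads) {
        return std::nullopt; // out of the accepted range
    }
    return parsed;
}
```

A caller that gets `std::nullopt` back could log a warning and fall back to the auto value, so a typo in the config never leaves the emulation without worker threads.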
Ok... lots of effort for that handful of Voodoo games. Do the On-the-fly benchmarking is a bit ridiculous... Keep it simple; the FPS improvements between 3 and more threads are rarely earth-shattering. I don't even think you can vary the number of Voodoo threads after startup anyway. When in doubt, always underengineer. I'd take the simple and good enough code any day (the 80-90% solution) vs the 100% solution that comes with an order of magnitude more complexity 😏 But I've been coding for 30+ years, I'm not after impressing anyone anymore, just get the job done with minimum fuss 😏 Keep the big picture in mind -- there are far more important things to do than improve 5 out of the 20 Voodoo games by 5 FPS... maybe. |
Dunno man. Multithreading helps, and we just want generic good enough solutions, not maximalist solutions with 5x the effort. I'm a sweet spot, bang for the buck guy. Minimum change for maximum profit, and move on 😆 |
Note that if you don't play a Voodoo game, the CPU cost is zero. |
EF 2000 happens to be one of my top 5 games on DOS so that's not a luxury I have. |
Are you using the latest Dosbox-Staging Version?
Different version than latest?
No response
What Operating System are you using?
Windows 11
If Other OS, please describe
No response
Is your feature request related to a problem? Please describe.
Hi, I would like to list here some possible improvements for 3DFX Voodoo emulation in DOSBox Staging (for future releases):
If I have any other ideas, I'll add them.
As already said, these are just ideas. If a developer comes along and wants to implement them in the next few years, that would be great. Thanks
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Add any other context or screenshots about the feature request here.
No response
Code of Conduct & Contributing Guidelines