[graphics] port generic VU1 to OpenGL #1221
This looks very promising so far. The robot boss frame went from 9.3 ms to 4.7 ms. It isn't drawing correctly yet, but I don't see why fixing that should make it much slower. It looks like there is some additional adgif-style data included in the vertices. There's 5 qw of normal adgif, but there's 7 in It'll be somewhat annoying to match these up, but it should be possible. Remaining todo:
The actual merc renderer is pretty confusing, so I'm starting here. There are a few places where we're pretty heavily bottlenecked by generic (final boss, jungle water, etc.), so we'll have to do this eventually. We also need this to be in OpenGL in order to do clipping right (I don't want to try to figure out their scissoring crap, and the current workaround doesn't always work).
The input to the generic renderer is a pretty simple format, so I'm hoping that once I port generic, I'll be able to understand the merc -> generic conversion on the EE (which does the same thing as merc VU1, but is less awful to understand).
I'd also like to better understand the generic format before getting environment mapping going in TIE. After what I see here, I think we'll need to do TIE with env map as part of Tie3 and skip the whole tie->generic->generic envmap thing.
I think I've finished the tricky part, which is figuring out what math the actual generic VU1 program does. I have an unpipelined loop for vertex transforming (21 VU1 instructions) that works. It only handles vertices that pass the guard band check, but eventually we'll do it all in OpenGL, which clips for us, so this version will work. So far no surprises (and I kind of already had an idea of how it works from all the lighting debugging).
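For anyone following along, here is a rough sketch of the general shape of a per-vertex transform with a guard band check. This is not the actual 21-instruction VU1 sequence. `Vec4`, `Mat4`, `passes_guard_band`, `transform_vertex`, and the guard band test itself are all hypothetical names and assumptions made up for illustration.

```cpp
#include <array>
#include <cmath>

// Hypothetical types for illustration; not the project's actual ones.
struct Vec4 {
  float x, y, z, w;
};
using Mat4 = std::array<Vec4, 4>;  // four columns

// out = M * v, with M stored as four column vectors.
inline Vec4 transform(const Mat4& m, const Vec4& v) {
  return {
      m[0].x * v.x + m[1].x * v.y + m[2].x * v.z + m[3].x * v.w,
      m[0].y * v.x + m[1].y * v.y + m[2].y * v.z + m[3].y * v.w,
      m[0].z * v.x + m[1].z * v.y + m[2].z * v.z + m[3].z * v.w,
      m[0].w * v.x + m[1].w * v.y + m[2].w * v.z + m[3].w * v.w,
  };
}

// Assumed guard band test: the vertex must be in front of the camera and
// within guard_x / guard_y times the view volume in x and y.
inline bool passes_guard_band(const Vec4& clip, float guard_x, float guard_y) {
  return clip.w > 0.f && std::fabs(clip.x) <= clip.w * guard_x &&
         std::fabs(clip.y) <= clip.w * guard_y;
}

struct ScreenVertex {
  float x, y, z;
  bool valid;  // false if the vertex failed the guard band check
};

// Transform one vertex, then do the perspective divide. Vertices outside the
// guard band are flagged instead of clipped, mirroring the "only works for
// ones that pass the guard band check" limitation described above.
inline ScreenVertex transform_vertex(const Mat4& m, const Vec4& pos,
                                     float guard_x, float guard_y) {
  const Vec4 clip = transform(m, pos);
  if (!passes_guard_band(clip, guard_x, guard_y)) {
    return {0.f, 0.f, 0.f, false};
  }
  const float q = 1.f / clip.w;
  return {clip.x * q, clip.y * q, clip.z * q, true};
}
```

Once the transform lives in a vertex shader, the guard band branch goes away entirely, since OpenGL clips after the vertex stage.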
There is a nice speedup in generic from this change, but there's a lot more coming if we can eliminate the GIF packet unpacking. Ideally we can go straight from DMA to a single large vertex buffer and draw info list. The vertex buffer can go straight to the GPU. Also, if we do it right, we can reorder draws to merge the two-pass env map draws and massively cut down on draws: the current approach does 2 draws for each env-mapped fragment.
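To make the "single large vertex buffer and draw info list" idea concrete, here is a minimal sketch of one way it could look. All names (`Vertex`, `DrawState`, `Draw`, `FrameGeometry`, `add_fragment`) and the vertex layout are assumptions for illustration, not the renderer's actual types. It also treats each fragment as a plain vertex range and ignores strip restarts, which a real version would need to handle.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout; the real format may carry more data.
struct Vertex {
  float pos[3];
  float st[2];
  std::uint8_t rgba[4];
};

// Hypothetical subset of the state that forces a separate draw call.
struct DrawState {
  std::uint32_t texture_id;
  std::uint32_t blend_mode;
  bool operator==(const DrawState& o) const {
    return texture_id == o.texture_id && blend_mode == o.blend_mode;
  }
};

// One entry of the draw info list: a state plus a range in the vertex buffer.
struct Draw {
  DrawState state;
  std::uint32_t first_vertex;
  std::uint32_t vertex_count;
};

struct FrameGeometry {
  std::vector<Vertex> vertices;  // uploaded to the GPU in one go
  std::vector<Draw> draws;       // iterated on the CPU to issue draw calls

  // Append one fragment. If it uses the same state as the previous draw, the
  // previous draw is extended instead of adding a new one. This works because
  // the new vertices are always contiguous with the previous range.
  void add_fragment(const DrawState& state, const Vertex* verts,
                    std::uint32_t count) {
    assert(count > 0);
    const auto first = static_cast<std::uint32_t>(vertices.size());
    vertices.insert(vertices.end(), verts, verts + count);
    if (!draws.empty() && draws.back().state == state) {
      draws.back().vertex_count += count;
    } else {
      draws.push_back({state, first, count});
    }
  }
};
```

This only merges fragments that arrive back to back with the same state. Merging the two env map passes would additionally need the draws to be sorted or bucketed by state before this step, which is the reordering mentioned above.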