Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimisation pass #1365

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from
Open

Optimisation pass #1365

wants to merge 7 commits into from

Conversation

TW1STaL1CKY
Copy link
Collaborator

I figured I'd continue on after #1364 and look at replacing all player.GetAll calls where applicable, which led into replacing all ents.GetAll calls with ents.Iterator. That brings us to now where I figured I'd keep going with every optimisation method I could think of within reason.

Instead of pushing this straight to develop, I've made this PR to give everyone interested a chance to check I haven't missed anything obvious. I've tried usual PAC usage on my own game with these changes and it seems fine.
If there's no objections, I'll merge this within a few days.

Commit description:

Had a scanthrough and made the following changes in the name of optimisation (even micro-optimisations) where possible:

  • player.Get* -> player.Iterator
  • ents.GetAll -> ents.Iterator
  • ipairs -> for i=1,#tbl (only main uses of ipairs has been replaced, not every use)
  • Vector:Distance -> Vector:DistToSqr
  • Vector:Length -> Vector:LengthSqr
  • net.Read/WriteEntity -> net.Read/WritePlayer where it is definitely intended to be a player
  • Saved some GetConVar calls to local vars
  • Localised/reused some Vector and Color objects
  • Localised some functions where they are frequently used in a few files
  • Misc minor styling cleanup (mostly spacing)
  • BONUS: Fixed pac.ColorToNames being incorrect when used by some parts (some now use the new pac.VectorColorToNames func)

Had a scanthrough and made the following changes in the name of optimisation (even micro-optimisations) where possible:
- player.Get* -> player.Iterator
- ents.GetAll -> ents.Iterator
- ipairs -> for i=1,#tbl (only main uses of ipairs has been replaced, not every use)
- Vector:Distance -> Vector:DistToSqr
- Vector:Length -> Vector:LengthSqr
- net.Read/WriteEntity -> net.Read/WritePlayer where it is definitely intended to be a player
- Saved some GetConVar calls to local vars
- Localised/reused some Vector and Color objects
- Localised some functions where they are frequently used in a few files
- Misc minor styling cleanup (mostly spacing)
- BONUS: Fixed pac.ColorToNames being incorrect when used by some parts (some now use the new pac.VectorColorToNames func)
@@ -458,7 +461,7 @@ function PART:DecideTarget()

--sort for the closest
for i,ent_candidate in ipairs(ents_candidates) do
local test_distance = (ent_candidate:GetPos()):Distance( self:GetWorldPosition())
local test_distance = (ent_candidate:GetPos()):Distance(self:GetWorldPosition())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Distance can be optimized

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I missed this file
Thanks 😄

@thegrb93
Copy link
Collaborator

I feel like for loops that don't need optimization should use ipairs (which is faster in luajit). Each for loop now being 3 lines is pretty messy and uncomfortable to look at.

@TW1STaL1CKY
Copy link
Collaborator Author

I agree it's nicer to look at when it's just ipairs. However when I test the run times between them, for i = 1, #tbl is consistently the fastest, even when luajit is enabled.

Results:

pairs		0.061870000008639
ipairs		0.023996999989322
for k = 1, #tbl	0.0083710000012616

The code I'm using to test the run times:

local tbl = {"a", "b", "c", "a", "b", "c", "a", "b"}

jit.on() -- Make sure its on

local runTimes = 100

local time = 0
for x = 1, runTimes do
	local start = SysTime()
	
	for i = 1, 1000 do
		for k, v in pairs(tbl) do end
	end
	
	local e = SysTime()
	
	time = time + (e - start)
end

print("pairs", (time / runTimes) * 1000)

time = 0
for x = 1, runTimes do
	local start = SysTime()
	
	for i = 1, 1000 do
		for k, v in ipairs(tbl) do end
	end
	
	local e = SysTime()
	
	time = time + (e - start)
end

print("ipairs", (time / runTimes) * 1000)

time = 0
for x = 1, runTimes do
	local start = SysTime()
	
	for i = 1, 1000 do
		for k = 1, #tbl do
			local v = tbl[k]
		end
	end
	
	local e = SysTime()
	
	time = time + (e - start)
end

print("for k = 1, #tbl", (time / runTimes) * 1000)

@thegrb93
Copy link
Collaborator

That makes me really sad for glua. Still, compromising clean code in spots that don't need the performance is a bit of a headache. I'm fine with the changes either way.

@CapsAdmin
Copy link
Owner

I don't wanna dictate too much here as I'm not very active in the project at the moment, but how does this affect startup time, framerate of rendering a complex outfit, and so on?

In other words, if it doesn't improve the performance users care about then all we're really doing is making the code more complex.

@Astralcircle
Copy link

Astralcircle commented Jun 18, 2024

I don't wanna dictate too much here as I'm not very active in the project at the moment, but how does this affect startup time, framerate of rendering a complex outfit, and so on?

In other words, if it doesn't improve the performance users care about then all we're really doing is making the code more complex.

Of most significant, these changes can reduce the number of calls to player.GetAll() and ents.GetAll(), for i=1,#tbl can go through the table faster.
Given that in some cases ents.GetAll() and player.Iterator() are called in loops, this can greatly increase performance in them

@thegrb93
Copy link
Collaborator

thegrb93 commented Jun 18, 2024

That's stating the obvious. He just wants to know if the fps gain is worth the messy code. Imo only the hot loops should be optimized, but it doesn't really matter.

@pingu7867
Copy link
Collaborator

Having to resolve the conflicts in that many files changed will be an absolute pain for me who hoards heaps and heaps of changes before publishing them, but I'll manage.

As for the style, yeah it adds some lines and it's a different way but I'll accept it if it makes pac a little less bulky in the crucial areas that need their inevitable loops. Especially Think and OnDraw ones, that's good to optimize. Even if ultimately there's always a bigger source of cost out there. How much of pac's runtime cost are we fighting over? If it turns out to be less than a percent, fighting over a tiny fraction is useless in my opinion. Just don't make the code unreadable. Which these changes don't.

If we're gonna talk about performance, I'm more interested in numbers and quantifying the impact.

Of most significant, these changes can reduce the number of calls to player.GetAll() and ents.GetAll(), for i=1,#tbl can go through the table faster. Given that in some cases ents.GetAll() and player.Iterator() are called in loops, this can greatly increase performance in them

Neither the word "can" nor "greatly increase performance" don't mean anything by themselves. For any theoretical "can", you have to show the real truth. I can trust your intent but it's not enough to establish. For any "greatly increase", you must specify how much. What's the run time in milliseconds (or microseconds) before, and what's the runtime after? What's the FPS before and after? If you have numbers, I'm sure they'll speak for themselves.

@Astralcircle
Copy link

That's stating the obvious. He just wants to know if the fps gain is worth the messy code. Imo only the hot loops should be optimized, but it doesn't really matter.

Perhaps I didn’t express myself clearly enough. What I meant was that it could improve performance in certain situations. For example, if used on a larger server with a bunch of complex PAC3 outfits, the difference would definitely be noticeable. However, if you were to scrutinize it under a magnifying glass in single-player, there would hardly be any noticeable change.

About complicating the code, I don’t think that anything can complicate it except for k = 1, #tbl, because the rest of the changes simply consist of using a different function

Comment on lines +60 to +69
local function get_enable_convar(classname)
local convar = pac_enable_convars[classname]

if not convar then
convar = GetConVar("pac_enable_" .. string.Replace(classname, " ", "_"):lower())
pac_enable_convars[classname] = convar
end

return convar
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be the __index metamethod of pac_enable_convars. Then replace calls to this with table indexes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested the perf difference between using the local function and using __index for this, it's pretty much the same. It's a 50-50 chance that one will run faster than the other, so I feel inclined to leave this as is.

Test results (ran code 50 times, console output of last run):

use __index 	0.00066750000542015
use function	0.00056749999657768
* times won:
[1]	=	21
[2]	=	24

Code:

local tbl = {}

setmetatable(tbl, {
	__index = function(t, k)
		local convar = GetConVar("pac_enable_" .. string.Replace(k, " ", "_"):lower())
		t[k] = convar
		
		return convar
	end
})

local tbl2 = {}
local function get_enable_convar(classname)
	local convar = tbl2[classname]

	if not convar then
		convar = GetConVar("pac_enable_" .. string.Replace(classname, " ", "_"):lower())
		tbl2[classname] = convar
	end

	return convar
end

local a
local str1, str2 = "model", "light" -- Just in case creating strings was causing a discrepency

jit.on() -- Make sure its on

timesWon = timesWon or {}

local runTimes = 200

local time = 0
for x = 1, runTimes do
	local start = SysTime()
	
	for i = 1, 1000 do
		a = tbl[str1]
		a = tbl[str2]
		a = tbl[str1]
		a = tbl[str1]
		a = tbl[str2]
	end
	
	local e = SysTime()
	
	time = time + (e - start)
end

local result = (time / runTimes) * 1000
print("use __index ", result)

time = 0
for x = 1, runTimes do
	local start = SysTime()
	
	for i = 1, 1000 do
		a = get_enable_convar(str1)
		a = get_enable_convar(str2)
		a = get_enable_convar(str1)
		a = get_enable_convar(str1)
		a = get_enable_convar(str2)
	end
	
	local e = SysTime()
	
	time = time + (e - start)
end

local result2 = (time / runTimes) * 1000
print("use function", result2)
	
if result < result2 then
	timesWon[1] = (timesWon[1] or 0) + 1
else
	timesWon[2] = (timesWon[2] or 0) + 1
end

print("* times won:")
PrintTable(timesWon)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speed doesn't matter for this function. it's just cleaner to use __index. And yeah jit is kinda random about what it blacklists or compiles

@@ -2673,7 +2716,7 @@ function pace.UltraCleanup(obj)
if IsSafe(part) or Important(part) then
return
elseif IsMildlyRisky(part) then
if table.Count(part:GetChildren()) == 0 then
if #part:GetChildren() == 0 then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure these table.Count replacements will always work? vgui can be pretty inconsistent with the type of table this returns iirc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use table.Empty instead

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory part:GetChildren() should always return a numerically indexed table. I'm not sure what's up with this code, but I guess in theory is just in theory..

Also I believe [1] is faster than getting the tables length, which part:HasChildren() should do.

@thegrb93
Copy link
Collaborator

I finished reviewing. Just a couple more suggestions. I think a big potential performance improvement is minimizing __index calls on entities.

@wrefgtzweve
Copy link
Collaborator

wrefgtzweve commented Jun 19, 2024

ive seen people complain that player.Iterator ents.Iterator can rarely return NULL entities on clientside, im not exactly how it works and how to fix it but its enough of a issue that i can recommend not using it clientside when possible

@wrefgtzweve
Copy link
Collaborator

wrefgtzweve commented Jun 19, 2024

I finished reviewing. Just a couple more suggestions. I think a big potential performance improvement is minimizing __index calls on entities.

Yeah ent, weapon, player __index calls are significantly laggy. Running something like fprofiler and a __index counter would probably help narrow down the lag. Small optimization passes like these arent bad but they wont make significiant differences

Basically running FProfiler and seeing what I can do to make the most expensive and most ran functions run faster
lua/autorun/netstream.lua Outdated Show resolved Hide resolved
@pingu7867
Copy link
Collaborator

I've had some updates I've been meaning to push for a while and had decided to wait. But it's been a month, which is a far cry from what the "within a few days" initially stated. So I'm doing some of my updates and you can deal with the git conflicts later.

@TW1STaL1CKY
Copy link
Collaborator Author

I've had some updates I've been meaning to push for a while and had decided to wait. But it's been a month, which is a far cry from what the "within a few days" initially stated. So I'm doing some of my updates and you can deal with the git conflicts later.

That's fine go ahead. I may exclude my changes to do with combat related things for the time being anyway.

And sorry, there were a lot of things pointed out that I could try improving, so I kept this back to keep working on it.

@@ -272,7 +276,7 @@ do -- scene graph
-- this will handle cases like if a part is removed and added again
for _, key in pairs(self.PartReferenceKeys) do
if self[key] and self[key].UniqueID == part.UniqueID then
self["Set" .. key](self, part)
self[prefix_set .. key](self, part)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this a variable is pointless. Getting rid of the concatenation would be a good performance improvement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don't like making these constant strings into variables since it fragments and complicates the code.

@thegrb93
Copy link
Collaborator

This PR is nearly unreviewable with the amount of changes. You're going to just have to test it on a server probably.

@thegrb93
Copy link
Collaborator

Conflicts need to be resolved too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants