Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked BlueprintHookManager to better-prevent hooks from corrupting the code or each other #320

Open
wants to merge 6 commits into
base: dev
Choose a base branch
from

Conversation

Epp-code
Copy link

@Epp-code Epp-code commented Nov 7, 2024

When testing blueprint hooks, I learned that hooking a function return after hooking a function start would corrupt the start hook and crash upon blueprint function call. I also realized that, because the existing approach sometimes needs to move original function instructions, there's a chance it could move an instruction that an unrelated part of the function tries to jump to, which would also crash. There's a high-level description of the issue and thoughts on a fix in Discord here:

https://discord.com/channels/555424930502541343/862002356626128907/1303244570708152360

This PR implements what I hinted at in Discord - it's a fix that should support arbitrary amounts of arbitrary offset hooking (though the offsets must still be aligned with instructions, of course). Every new hook entry point now moves all instructions after it just enough to make room for the hook jump. It then scans the entire function and updates any affected jump instructions to point to their updated locations. It then installs the hook jump and the new hook code after the function's existing code.

Thoughts/explanations for the reviewer:

  1. I tried to mimic the existing code style but please let me know if anything looks out of place or needs more comments/etc. I am fairly code-style-agnostic and am happy to clarify things.
  2. The most complex part of this PR is ModifyJumpTargetsForNewHookOffset, which attempts to find jump targets for every applicable opcode and needs careful review.
  3. Because we cannot know how to update the jump target expression of an EX_ComputedJump, I simply assert if the function contains an EX_ComputedJump. As far as I can tell, there is no way to make hooking such a function safe and it was not safe before this PR.
  4. I removed KismetBytecodeDisassembler::GetStatementLength as each statement Json now has its size included and this no longer seem to be used; theoretically, some mod author might be using it but I'd rather err on the side of removing dead code.
  5. The checkf() macros in the affected classes are currently being compiled to no-ops when built for shipping by Alpakit. According to Archengius, compiling out checkf's is not intended and looks like the result of a partially-reverted patch. I don't know what the long-term solution is in the engine code but, for now, I changed them to fgcheckf macros, which are currently not ever compiled out. This could result in crashes for anybody who was unwittingly violating these checks in a way that didn't already crash.
  6. The hook manager stores hooks by what their offset would have been in the original, unmodified function, regardless of the how the function has been edited by existing hooks. This ensures users have a consistent experience and never have to know how the function is modified.

More thoughts

An alternate approach could have been to simply insert the hook calls directly into the function and update affected jump targets in the function. The main downside would be that the amount to update the jump targets would depend on the length of the code needed to invoke the hook, which would make changing that code fragile and error-prone. An unconditional jump will always be the same size even if the hook invocation code changes or needs to vary based on the hook itself for some reason.

Comment on lines 179 to 193
// Now we search all the children of this node for jump instructions that need to be updated
for (auto& Pair : Expression->Values) {
TSharedPtr<FJsonObject>* ExpressionValue;
if (Pair.Value->TryGetObject(ExpressionValue)) {
ModifyJumpTargetsForNewHookOffset(Script, *ExpressionValue, HookOffset);
continue;
}
TArray<TSharedPtr<FJsonValue>>* ArrayValue;
if (Pair.Value->TryGetArray(ArrayValue)) {
for (TSharedPtr<FJsonValue> Value : *ArrayValue) {
if (Value->TryGetObject(ExpressionValue)) {
ModifyJumpTargetsForNewHookOffset(Script, *ExpressionValue, HookOffset);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you found any case in which this would do anything? I don't think jump instructions can appear in expressions (only in statements)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly looked over the Kismet compiler in the engine and it only emits EX_JumpIfNot from compiled statements (KCST_ enumerators). I think the same applies to the other instructions with jump offsets. So this doesn't seem to be necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my first pass didn't have the array handling here and was crashing. Each case in a switch statement contains an OffsetToNextCase that is an absolute offset and needs to be updated, as well. Those are technically jump targets but perhaps the name of the function adds some confusion. I changed it to ModifyOffsetsForNewHookOffset, which is hopefully a little more on the nose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OffsetToNextCase is an absolute offset? That's weird.

Copy link
Author

@Epp-code Epp-code Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was surprised, too. Here's a snippet to show it and illustrate why recursing is necessary. This LetBool's Expression is a switch and you can see each CaseValue's OffsetToNextCase is exactly the OpcodeIndex of the subsequent case. Also, the OffsetToNextCase values are way too big to be relative offsets, which was my first clue.

{
		"Opcode": 20,
		"OpcodeIndex": 258,
		"Inst": "LetBool",
		"Variable":
		{
			"Opcode": 72,
			"OpcodeIndex": 259,
			"Inst": "LocalOutVariable",
			"VariableType":
			{
				"PinCategory": "bool",
				"PinSubCategory": "bool"
			},
			"VariableName": "ReturnValue",
			"OpSizeInBytes": 9
		},
		"Expression":
		{
			"Opcode": 105,
			"OpcodeIndex": 268,
			"Inst": "SwitchValue",
			"Expression":
			{
				"Opcode": 0,
				"OpcodeIndex": 275,
				"Inst": "LocalVariable",
				"VariableType":
				{
					"PinCategory": "byte",
					"PinSubCategory": "byte",
					"PinSubCategoryObject": "/Script/FactoryGame.ERepresentationType"
				},
				"VariableName": "Temp_byte_Variable",
				"OpSizeInBytes": 9
			},
			"OffsetToSwitchEnd": 608,
			"Cases": [
				{
					"CaseValue":
					{
						"Opcode": 36,
						"OpcodeIndex": 284,
						"Inst": "ByteConst",
						"Value": 0,
						"OpSizeInBytes": 2
					},
					"OffsetToNextCase": 299,
					"CaseResult":
					{
						"Opcode": 0,
						"OpcodeIndex": 290,
						"Inst": "LocalVariable",
						"VariableType":
						{
							"PinCategory": "bool",
							"PinSubCategory": "bool"
						},
						"VariableName": "Temp_bool_Variable_20",
						"OpSizeInBytes": 9
					}
				},
				{
					"CaseValue":
					{
						"Opcode": 36,
						"OpcodeIndex": 299,
						"Inst": "ByteConst",
						"Value": 1,
						"OpSizeInBytes": 2
					},
					"OffsetToNextCase": 314,
					"CaseResult":
					{
						"Opcode": 0,
						"OpcodeIndex": 305,
						"Inst": "LocalVariable",
						"VariableType":
						{
							"PinCategory": "bool",
							"PinSubCategory": "bool"
						},
						"VariableName": "Temp_bool_Variable_19",
						"OpSizeInBytes": 9
					}
				},
				{
					"CaseValue":
					{
						"Opcode": 36,
						"OpcodeIndex": 314,
						"Inst": "ByteConst",
						"Value": 2,
						"OpSizeInBytes": 2
					},
					"OffsetToNextCase": 329,
					"CaseResult":
					{
						"Opcode": 0,
						"OpcodeIndex": 320,
						"Inst": "LocalVariable",
						"VariableType":
						{
							"PinCategory": "bool",
							"PinSubCategory": "bool"
						},
						"VariableName": "Temp_bool_Variable_18",
						"OpSizeInBytes": 9
					}
				},
			....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which asset is this snippet from? I'd like to inspect it myself

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is from BPW_MapFilterCategories::CanBeSeenOnCompass and it's the first BeforeHook dump that is created with DEBUG_BLUEPRINT_HOOKING set to 1 (to ensure it was unmodified).

…redefinedHookOffset::Return would result in two separate hooks in the map - though they would be called correctly, if a function from the first hook set a return jump, it would skip the functions from the second hook, which would violate the contract that all hook functions at a location are called when a return jump is set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

2 participants