1.17 beta crash mystery thread #18705

Closed
hrydgard opened this issue Jan 15, 2024 · 15 comments

Comments

@hrydgard
Owner

hrydgard commented Jan 15, 2024

Now with the beta program, we can do these before the release instead of after! There are already enough beta testers (I think because I once enabled registration in the past without actually having any builds) that we get a usable number of crash reports.

I've fixed a bunch of low-hanging fruit; here come the tricky ones.

First, an oldie but a goodie assert. I really don't understand this one: it should not be possible for b.originalAddress to be null in FinalizeBlock. Though on the other hand, the block number should be equal to b.num, no? Weird.

(JitBlockCache.cpp:FinalizeBlock:250): [Memory::IsValidAddress(b.originalAddress)] (ULUS10543 WWE SmackDown vs. RAW 2011, 1884.2s) FinalizeBlock: Bad originalAddress 00000000 in block 107438 (b.num: 41902) proxy: n sz: 12

  #00  pc 0x0000000000038880  /apex/com.android.runtime/lib/bionic/libc.so (abort+172)
  #01  pc 0x00000000003fef8d  /apex/com.android.art/lib/libart.so (art::Runtime::Abort(char const*)+1768)
  #02  pc 0x000000000000d97f  /system/lib/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_3::__invoke(char const*)+46)
  #03  pc 0x00000000000052eb  /system/lib/liblog.so (__android_log_assert+174)
  #04  pc 0x000000000064f92f  arm/libppsspp_jni.so (HandleAssert(char const*, char const*, int, char const*, char const*, ...)+194) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #05  pc 0x0000000000360f8f  arm/libppsspp_jni.so (JitBlockCache::FinalizeBlock(int, bool)+286) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #06  pc 0x000000000034b7e9  arm/libppsspp_jni.so (MIPSComp::ArmJit::Compile(unsigned int)+192) (BuildId: 12d5e65fd2137fa97249f8b4a259e9608afa3c7f)
  #07  pc 0x0000000000000106 

Next, there's another oldie I've seen before but never figured out. Might simply be some kind of memory corruption, but I think we can add some checks.

SIGSEGV
  #01  pc 0x00000000003796c5  arm/libppsspp_jni.so (CoreTiming::Advance()+136)

This crashes here:

void ProcessFifoWaitEvents()
{
	while (first)
	{
		if (first->time <= (s64)GetTicks())
		{
//			LOG(CPU, "[Scheduler] %s		 (%lld, %lld) ",
//				first->name ? first->name : "?", (u64)GetTicks(), (u64)first->time);
			Event* evt = first;
			first = first->next;
/////////////////////// THE BELOW LINE CRASHES /////////////////////
			event_types[evt->type].callback(evt->userdata, (int)(GetTicks() - evt->time));
			FreeEvent(evt);
		}
		else
		{
			break;
		}
	}
}

So I suppose evt->type might have gotten corrupted?

(A curiosity here is how the name of the function has survived from pre-open-source Dolphin, which I took the original timing system from... there is no FIFO :) )
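If `evt->type` really is corrupted, the lookup in `event_types` indexes outside the table before the callback is even called. Below is a minimal sketch of the kind of check that could be added in front of the dispatch. It assumes the event type table is a `std::vector` and that callbacks take `(userdata, cyclesLate)`. `EventType`, `Event` and `DispatchEventChecked` are hypothetical stand-ins, not the actual CoreTiming definitions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-ins for the CoreTiming structures; the real definitions may differ.
using TimedCallback = void (*)(uint64_t userdata, int cyclesLate);

struct EventType {
	TimedCallback callback;
	const char *name;
};

struct Event {
	int64_t time;
	uint64_t userdata;
	int type;
	Event *next;
};

// Returns true if the callback was invoked, false if the event looked corrupted.
bool DispatchEventChecked(const std::vector<EventType> &eventTypes, const Event &evt, int64_t now) {
	// A corrupted evt.type would index outside the table, so validate it first.
	if (evt.type < 0 || (size_t)evt.type >= eventTypes.size()) {
		fprintf(stderr, "Bad event type %d (have %zu types)\n", evt.type, eventTypes.size());
		return false;
	}
	const EventType &type = eventTypes[evt.type];
	if (!type.callback) {
		fprintf(stderr, "Event type %d (%s) has no callback\n", evt.type, type.name ? type.name : "?");
		return false;
	}
	type.callback(evt.userdata, (int)(now - evt.time));
	return true;
}
```

A check like this would not explain the corruption, but it would turn a bare SIGSEGV into a report that says which field was bad.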

@hrydgard hrydgard added this to the v1.17.0 milestone Jan 15, 2024
@hrydgard
Owner Author

There are also a couple of shutdown hangs. I believe I've already solved one related to ManagedTexture, but here's one where EmuThread is stuck somewhere in __NetShutdown. Unfortunately, the stack is missing some detail:

  #02  pc 0x0000000000fd623c  arm64/libppsspp_jni.so (std::__ndk1::thread::join()+28) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x00000000005d219c  arm64/libppsspp_jni.so (__NetShutdown()+92) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000005854cc  arm64/libppsspp_jni.so (__KernelShutdown()+376) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000067d5ec  arm64/libppsspp_jni.so (CPU_Shutdown()+148) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x000000000067ded8  arm64/libppsspp_jni.so (PSP_Shutdown()+184) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000895c0c  arm64/libppsspp_jni.so (EmuScreen::~EmuScreen()+136) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x0000000000895db8  arm64/libppsspp_jni.so (EmuScreen::~EmuScreen()+16) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

There are a number of threads that are joined by various functions called by __NetShutdown, seems one of them is stuck. It seems to be the upnp thread:

  #00  pc 0x00000000000e1c1c  /apex/com.android.runtime/lib64/bionic/libc.so (__ppoll+12)
  #01  pc 0x000000000009a36c  /apex/com.android.runtime/lib64/bionic/libc.so (poll+96)
  #02  pc 0x0000000000db1688  arm64/libppsspp_jni.so (receivedata+84) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x0000000000db0394  arm64/libppsspp_jni.so (getHTTPResponse+204) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x0000000000daf39c  arm64/libppsspp_jni.so (simpleUPnPcommand+632) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000db2080  arm64/libppsspp_jni.so (UPNP_AddPortMapping+292) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000688138  arm64/libppsspp_jni.so (PortManager::Add(char const*, unsigned short, unsigned short)+1404) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x000000000068907c  arm64/libppsspp_jni.so (upnpService(unsigned int)+376) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000068b9a4  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, int (*)(unsigned int), unsigned int>>(void*)+48) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

@hrydgard
Owner Author

hrydgard commented Jan 15, 2024

And here's another one where it appears stuck in vsnprintf, or perhaps more likely the exception is getting triggered over and over:

  #00  pc 0x00000000000e0710  /apex/com.android.runtime/lib64/bionic/libc.so (__sfvwrite+224)
  #01  pc 0x00000000000d64d8  /apex/com.android.runtime/lib64/bionic/libc.so (__vfprintf+9688)
  #02  pc 0x00000000000f5ea0  /apex/com.android.runtime/lib64/bionic/libc.so (vsnprintf+192)
  #03  pc 0x00000000000bafec  /apex/com.android.runtime/lib64/bionic/libc.so (__vsnprintf_chk+60)
  #04  pc 0x000000000049a4f0  arm64/libppsspp_jni.so (snprintf(char*, unsigned long pass_object_size1, char const*, ...)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000498c28  arm64/libppsspp_jni.so (Arm64Dis(unsigned long, unsigned int, char*, int, bool, bool (*)(char*, int, unsigned char*))+2892) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000004a8378  arm64/libppsspp_jni.so (DisassembleArm64(unsigned char const*, int)+380) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x00000000006617bc  arm64/libppsspp_jni.so (Memory::HandleFault(unsigned long, void*)+512) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000085b878  arm64/libppsspp_jni.so (sigsegv_handler(int, siginfo*, void*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

The snprintf is just this one:

line 331
} else if (index_pre) {
	snprintf(instr->text, sizeof(instr->text), "%s%s%s %c%d, [x%d, #%d]!", opname[opc], signExt, sizeSuffix[size], r, Rt, Rn, SignExtend9(imm9));

which is from a loadstore, which checks out.
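If the exception really is being triggered repeatedly from inside the handler, one common defence is a re-entrancy guard. The following is a minimal sketch under that assumption, not the actual `Memory::HandleFault` code. `g_inFaultHandler` and `HandleFaultGuarded` are hypothetical names, and the real fault-handling work is passed in as `handler`. Note that `thread_local` access is not formally async-signal-safe, so this illustrates the idea rather than a drop-in fix.

```cpp
#include <cstdint>

// Hypothetical guard: tracks whether this thread is already inside the fault handler.
static thread_local bool g_inFaultHandler = false;

// Hypothetical wrapper around the real fault-handling work, passed in as `handler`.
// Returns false without calling `handler` if this thread is already handling a fault,
// so a crash inside the handler (for example while disassembling) cannot loop forever.
template <typename Handler>
bool HandleFaultGuarded(uintptr_t faultAddress, Handler handler) {
	if (g_inFaultHandler) {
		return false;  // Nested fault: let the caller fall back to the default crash path.
	}
	g_inFaultHandler = true;
	bool handled = handler(faultAddress);
	g_inFaultHandler = false;
	return handled;
}
```

With this shape, a second fault raised while the first is still being processed is reported as unhandled instead of re-running the disassembly path.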

@hrydgard
Owner Author

Additional hang. DrainAndBlockCompileQueue and CompileThread seem to have a possible deadlock:

  #01  pc 0x000000000008dab4  /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex_owner(void volatile*, bool, int, bool, timespec const*, unsigned int)+432)
  #02  pc 0x00000000000f51f0  /apex/com.android.runtime/lib64/bionic/libc.so (NonPI::MutexLockWithTimeout(pthread_mutex_internal_t*, bool, timespec const*)+252)
  #03  pc 0x0000000000fcdc38  arm64/libppsspp_jni.so (std::__ndk1::mutex::lock()+8) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082c934  arm64/libppsspp_jni.so (VulkanRenderManager::CompileThreadFunc()+156) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000083271c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (VulkanRenderManager::*)(), VulkanRenderManager*>>(void*)+64) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x0000000000f93640  /data/app/~~y7ZUpix72C30XvPkv_BYKw==/org.ppsspp.ppsspp-QKHiP-DL2wluIBuv1ZMg_w==/lib/arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082d770  /data/app/~~y7ZUpix72C30XvPkv_BYKw==/org.ppsspp.ppsspp-QKHiP-DL2wluIBuv1ZMg_w==/lib/arm64/libppsspp_jni.so (VulkanRenderManager::DrainAndBlockCompileQueue()+136) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x00000000006aa4b8  arm64/libppsspp_jni.so (GPU_Vulkan::DeviceLost()+60) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000008780e0  arm64/libppsspp_jni.so (NativeShutdownGraphics()+104) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000871a00  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
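The stacks above show one thread waiting on a condition variable in DrainAndBlockCompileQueue while the compile thread waits on a mutex. The thread does not say what the actual cause was, so the following is only a minimal sketch of a drain-and-block pattern that avoids lost wakeups. It is not the VulkanRenderManager code, and `CompileQueue`, `Enqueue`, `DrainAndBlock` and `Release` are hypothetical names.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical compile queue, not VulkanRenderManager's actual implementation.
class CompileQueue {
public:
	CompileQueue() : thread_([this] { ThreadFunc(); }) {}
	~CompileQueue() {
		{
			std::lock_guard<std::mutex> guard(mutex_);
			run_ = false;
		}
		cond_.notify_all();
		thread_.join();
	}

	// Returns false if the queue is currently blocked.
	bool Enqueue(std::function<void()> task) {
		{
			std::lock_guard<std::mutex> guard(mutex_);
			if (blocked_)
				return false;
			queue_.push_back(std::move(task));
		}
		cond_.notify_all();
		return true;
	}

	// Blocks new work, then waits until the worker has finished everything queued.
	void DrainAndBlock() {
		std::unique_lock<std::mutex> lock(mutex_);
		blocked_ = true;
		// The predicate is checked under the lock, so a notify that arrives
		// before we start waiting cannot be lost.
		cond_.wait(lock, [this] { return queue_.empty() && !busy_; });
	}

	void Release() {
		std::lock_guard<std::mutex> guard(mutex_);
		blocked_ = false;
	}

private:
	void ThreadFunc() {
		std::unique_lock<std::mutex> lock(mutex_);
		while (true) {
			cond_.wait(lock, [this] { return !run_ || !queue_.empty(); });
			if (!run_)
				return;
			std::vector<std::function<void()>> work;
			work.swap(queue_);
			busy_ = true;
			lock.unlock();
			for (auto &task : work)
				task();  // Run tasks without holding the mutex.
			lock.lock();
			busy_ = false;
			cond_.notify_all();  // Wake anyone waiting in DrainAndBlock().
		}
	}

	std::mutex mutex_;
	std::condition_variable cond_;
	std::vector<std::function<void()>> queue_;
	bool run_ = true;
	bool busy_ = false;
	bool blocked_ = false;
	std::thread thread_;  // Declared last so the other members exist before it starts.
};
```

The two points the sketch tries to make are that the worker never runs a task while holding the mutex, and that both sides wait on a predicate rather than on a bare notification.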

@anr2me
Collaborator

anr2me commented Jan 16, 2024

There are a number of threads that are joined by various functions called by __NetShutdown, seems one of them is stuck. It seems to be the upnp thread:

Hmm... the connection to the router might be stalled or have a problem, so it waited until the timeout (I set the default timeout to 2000 ms, as there are slow routers that need at least 1 second to be detected).
This kind of issue shouldn't be consistent or easily reproducible; if it were, that would point to an actual bug.

@hrydgard
Owner Author

Yeah, it's likely some kind of one-off. I only see a single report of this. I tried to see if I could find a path where there would be more than 1 timeout between two checks of the thread-exit variable, but couldn't find such a path, so I'm not convinced there's anything we can do about it...
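The property being checked here is that the thread-exit variable is tested between every bounded blocking step, so a join should wait for at most about one timeout. Below is a minimal sketch of that structure, with a sleep standing in for the network call. `PollingWorker` and its members are hypothetical names, not the actual `upnpService` code.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker, not the actual upnpService(): each blocking step is bounded by
// `stepTimeout`, and the exit flag is checked between steps.
class PollingWorker {
public:
	explicit PollingWorker(std::chrono::milliseconds stepTimeout)
		: stepTimeout_(stepTimeout), thread_([this] { Run(); }) {}

	~PollingWorker() { Stop(); }

	// Asks the thread to exit and joins it. Returns how long the join took.
	std::chrono::milliseconds Stop() {
		auto start = std::chrono::steady_clock::now();
		exit_.store(true);
		if (thread_.joinable())
			thread_.join();
		return std::chrono::duration_cast<std::chrono::milliseconds>(
			std::chrono::steady_clock::now() - start);
	}

private:
	void Run() {
		while (!exit_.load()) {
			// Stand-in for one network operation that can block up to the timeout.
			std::this_thread::sleep_for(stepTimeout_);
		}
	}

	std::chrono::milliseconds stepTimeout_;
	std::atomic<bool> exit_{false};
	std::thread thread_;  // Declared last so it starts after the other members are initialized.
};
```

With the 2000 ms default mentioned above, a shutdown in this model would stall for roughly two seconds in the worst case. A hang much longer than that suggests a blocking call that is not actually bounded by the timeout.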

@hrydgard
Owner Author

hrydgard commented Jan 17, 2024

Beta 2 from now on.

The shutdown race condition still doesn't seem completely cured, and I got a shutdown hang I haven't seen before:

  #03  pc 0x0000000000f958c0  arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x0000000000858e18  arm64/libppsspp_jni.so (WaitableCounter::Wait()+76) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x0000000000858a58  arm64/libppsspp_jni.so (ParallelRangeLoop(ThreadManager*, std::__ndk1::function<void (int, int)> const&, int, int, int, TaskPriority)+124) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #06  pc 0x000000000073686c  arm64/libppsspp_jni.so (GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #07  pc 0x0000000000734ca0  arm64/libppsspp_jni.so (GPURecord::EmitCommandWithRAM(GPURecord::CommandType, void const*, unsigned int, unsigned int)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #08  pc 0x0000000000734608  arm64/libppsspp_jni.so (GPURecord::NotifyCommand(unsigned int)+1348) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #09  pc 0x000000000073c078  arm64/libppsspp_jni.so (GPUCommon::SlowRunLoop(DisplayList&)+244) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #10  pc 0x000000000073be24  arm64/libppsspp_jni.so (GPUCommon::InterpretList(DisplayList&)+644) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #11  pc 0x000000000073b32c  arm64/libppsspp_jni.so (GPUCommon::ProcessDLQueue()+100) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #12  pc 0x000000000073b1b0  arm64/libppsspp_jni.so (GPUCommon::EnqueueList(unsigned int, unsigned int, int, PSPPointer<PspGeListArgs>, bool)+1852) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #13  pc 0x00000000005681e4  arm64/libppsspp_jni.so (void WrapU_UUIU<&sceGeListEnQueue(unsigned int, unsigned int, int, unsigned int)>()+60) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #14  pc 0x0000000000542a68  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)

Some pool worker stacks:

  #00  pc 0x0000000000070080  /apex/com.android.runtime/lib64/bionic/libc.so (je_tcache_bin_flush_small)
  #01  pc 0x0000000000064ae4  /apex/com.android.runtime/lib64/bionic/libc.so (ifree+720)
  #02  pc 0x0000000000064ce4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+112)
  #03  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #00  pc 0x0000000000064cc4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+80)
  #01  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #02  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #03  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #00  pc 0x0000000000075dc0  /apex/com.android.runtime/lib64/bionic/libc.so (__memchr_aarch64)
  #01  pc 0x0000000000736a54  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (std::__ndk1::__function::__func<GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)::$_0, std::__ndk1::allocator<GPURecord::mymemmem(unsigned char const*, unsigned long, unsigned long, unsigned char const*, unsigned long, unsigned long)::$_0>, void (int, int)>::operator()(int&&, int&&)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #02  pc 0x0000000000858f50  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::Run()+68) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #03  pc 0x000000000085a0e8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #00  pc 0x0000000000070080  /apex/com.android.runtime/lib64/bionic/libc.so (je_tcache_bin_flush_small)
  #01  pc 0x0000000000064ae4  /apex/com.android.runtime/lib64/bionic/libc.so (ifree+720)
  #02  pc 0x0000000000064ce4  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+112)
  #03  pc 0x0000000000858ee8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (LoopRangeTask::~LoopRangeTask()+72) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #04  pc 0x000000000085a0f8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (WorkerThreadFunc(GlobalThreadContext*, TaskThreadContext*)) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)
  #05  pc 0x000000000085bbf8  /data/app/~~5Qw_xZval2LqhbpV-oJ_qg==/org.ppsspp.ppsspp-hUUVp3_A9lz8K6QmRWQQ0g==/lib/arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(GlobalThreadContext*, TaskThreadContext*), GlobalThreadContext*, TaskThreadContext*>>(void*)+48) (BuildId: a751278204da14ef743a7c2df8e1342d71318f70)

Weird stuff, almost like there's a hang in the memory allocator (jemalloc)?

Or it's just stuck performing the same thing over and over somehow...
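For readers unfamiliar with the pattern in these stacks: the calling thread splits a range into chunks, hands them to workers, and blocks on a counter until every chunk has reported completion. The sketch below shows that structure in a minimal form, assuming one plain `std::thread` per chunk instead of a thread pool. `CountdownWaiter` and `ParallelRangeSketch` are hypothetical names and not the actual `WaitableCounter` or `ParallelRangeLoop` implementations.

```cpp
#include <algorithm>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical counter, loosely modeled on the WaitableCounter name in the stack above.
class CountdownWaiter {
public:
	explicit CountdownWaiter(int count) : count_(count) {}

	void Count() {
		std::lock_guard<std::mutex> guard(mutex_);
		if (--count_ == 0)
			cond_.notify_all();
	}

	void Wait() {
		std::unique_lock<std::mutex> lock(mutex_);
		// Waiting on a predicate means a Count() that finished early is never missed.
		cond_.wait(lock, [this] { return count_ <= 0; });
	}

private:
	std::mutex mutex_;
	std::condition_variable cond_;
	int count_;
};

// Splits [lower, upper) into chunks of at most `chunkSize`, runs `loop` on each chunk
// in its own thread, and returns once every chunk has been processed.
void ParallelRangeSketch(const std::function<void(int, int)> &loop, int lower, int upper, int chunkSize) {
	if (upper <= lower || chunkSize <= 0)
		return;
	int chunks = (upper - lower + chunkSize - 1) / chunkSize;
	CountdownWaiter waiter(chunks);
	std::vector<std::thread> threads;
	for (int start = lower; start < upper; start += chunkSize) {
		int end = std::min(start + chunkSize, upper);
		threads.emplace_back([&loop, &waiter, start, end] {
			loop(start, end);
			waiter.Count();
		});
	}
	waiter.Wait();
	for (auto &t : threads)
		t.join();
}
```

In this model the waiter can only hang if some chunk never calls `Count()`, which is consistent with the second guess above: a worker that keeps repeating the same work would leave the caller blocked in `Wait()`.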

@hrydgard
Owner Author

hrydgard commented Jan 17, 2024

Another thread hang, interesting:

  #00  pc 0x00000000000881b0  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+32)
  #01  pc 0x00000000000f1660  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_join+268)
  #02  pc 0x0000000000fd623c  arm64/libppsspp_jni.so (std::__ndk1::thread::join()+28) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x000000000082d04c  arm64/libppsspp_jni.so (VulkanRenderManager::StopThread()+236) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082d2ac  arm64/libppsspp_jni.so (VulkanRenderManager::DestroyBackbuffers()+16) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x0000000000873db4  arm64/libppsspp_jni.so (AndroidVulkanContext::Resize()+96) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000878754  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+1480) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000871900  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000087317c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)(ANativeWindow*), ANativeWindow*>>(void*)+44) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

vs

  #00  pc 0x00000000000881b0  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+32)
  #01  pc 0x000000000008ca7c  /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex+148)
  #02  pc 0x00000000000eff60  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_cond_wait+84)
  #03  pc 0x0000000000f93640  arm64/libppsspp_jni.so (std::__ndk1::condition_variable::wait(std::__ndk1::unique_lock<std::__ndk1::mutex>&)+20) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x000000000082b354  arm64/libppsspp_jni.so (Promise<VkPipeline_T*>::BlockUntilReady()+112) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000083655c  arm64/libppsspp_jni.so (VulkanQueueRunner::PerformRenderPass(VKRStep const&, VkCommandBuffer_T*, int, QueueProfileContext&)+2384) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x0000000000835920  arm64/libppsspp_jni.so (VulkanQueueRunner::RunSteps(std::__ndk1::vector<VKRStep*, std::__ndk1::allocator<VKRStep*>>&, int, FrameData&, FrameDataShared&, bool)+524) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x000000000082dadc  arm64/libppsspp_jni.so (VulkanRenderManager::Run(VKRRenderThreadTask&)+732) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x000000000082c744  arm64/libppsspp_jni.so (VulkanRenderManager::ThreadFunc()+240) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #09  pc 0x000000000083271c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (VulkanRenderManager::*)(), VulkanRenderManager*>>(void*)+64) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

I think this will be fixed by one of my upcoming changes.

@hrydgard
Owner Author

hrydgard commented Jan 18, 2024

Report from beta 1:

  #00  pc 0x000000000004ed98  /apex/com.android.runtime/lib64/bionic/libc.so (__memcpy+232)
  #01  pc 0x00000000007215e0  arm64/libppsspp_jni.so (TextureReplacer::NotifyTextureDecoded(ReplacedTexture*, ReplacedTextureDecodeInfo const&, void const*, int, int, int, int, int, int)+1332) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #02  pc 0x00000000006b5718  arm64/libppsspp_jni.so (TextureCacheVulkan::BuildTexture(TexCacheEntry*)+3184) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x000000000070bb30  arm64/libppsspp_jni.so (TextureCacheCommon::ApplyTexture()+468) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000006a7608  arm64/libppsspp_jni.so (DrawEngineVulkan::DoFlush()+1572) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #05  pc 0x000000000073eba4  arm64/libppsspp_jni.so (GPUCommonHW::FastRunLoop(DisplayList&)+272) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #06  pc 0x00000000007385fc  arm64/libppsspp_jni.so (GPUCommon::InterpretList(DisplayList&)+608) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #07  pc 0x0000000000737b28  arm64/libppsspp_jni.so (GPUCommon::ProcessDLQueue()+100) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #08  pc 0x00000000007379ac  arm64/libppsspp_jni.so (GPUCommon::EnqueueList(unsigned int, unsigned int, int, PSPPointer<PspGeListArgs>, bool)+1852) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #09  pc 0x0000000000568f1c  arm64/libppsspp_jni.so (void WrapU_UUIU<&sceGeListEnQueue(unsigned int, unsigned int, int, unsigned int)>()+60) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

Also beta 1:

  #00  pc 0x0000000000664008 arm64/libppsspp_jni.so (Memory::Write_U32(unsigned int, unsigned int)+124) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #01  pc 0x00000000004c6e04 arm64/libppsspp_jni.so (CWCheatEngine::ExecuteOp(CheatOperation const&, CheatCode const&, unsigned long&)+4348) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #02  pc 0x00000000004c46dc arm64/libppsspp_jni.so (CWCheatEngine::Run()+176) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #03  pc 0x00000000004c4000 arm64/libppsspp_jni.so (hleCheat(unsigned long long, int)+788) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)
  #04  pc 0x00000000004c1738 arm64/libppsspp_jni.so (CoreTiming::Advance()+140) (BuildId: 5dd7c95d4ebcf874bb13c567cb6d0ded19f78528)

@hrydgard
Owner Author

hrydgard commented Jan 18, 2024

beta 3:

  #04  pc 0x0000000000fd3bc0  arm64/libppsspp_jni.so (std::__ndk1::mutex::lock()+8) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #05  pc 0x00000000008a4970  arm64/libppsspp_jni.so (GameInfo::GetTitle()+20) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #06  pc 0x00000000008ca6d8  arm64/libppsspp_jni.so (GameScreen::CreateViews()+1792) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #07  pc 0x0000000000dd0824  arm64/libppsspp_jni.so (UIScreen::DoRecreateViews()+180) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #08  pc 0x0000000000dd1120  arm64/libppsspp_jni.so (UIScreen::render(ScreenRenderMode)+192) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #09  pc 0x00000000008cdafc  arm64/libppsspp_jni.so (GameScreen::render(ScreenRenderMode)+48) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #10  pc 0x0000000000dcfafc  arm64/libppsspp_jni.so (ScreenManager::render()+732) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #11  pc 0x0000000000880000  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+796) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #12  pc 0x0000000000879470  arm64/libppsspp_jni.so (VulkanEmuThread(ANativeWindow*)) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

Maybe it's just slowness in scoped storage land. A worker thread has the following at the top of its stack (the rest is missing):

  #00  pc 0x00000000000979dc  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28)
  #01  pc 0x00000000003a8af4  /apex/com.android.art/lib64/libart.so (art::ConditionVariable::WaitHoldingLocks(art::Thread*)+140)
  #02  pc 0x000000000077f71c  /apex/com.android.art/lib64/libart.so (artJniMethodEnd+204)
  #03  pc 0x000000000020facc  /apex/com.android.art/lib64/libart.so (art_jni_method_end+12)
  at android.os.BinderProxy.transactNative (Native method)
  at android.os.BinderProxy.transact (BinderProxy.java:678)
  at android.content.ContentProviderProxy.query (ContentProviderNative.java:479)
  at android.content.ContentResolver.query (ContentResolver.java:1245)
  at android.content.ContentResolver.query (ContentResolver.java:1171)
  at android.content.ContentResolver.query (ContentResolver.java:1127)
  at org.ppsspp.ppsspp.PpssppActivity.listContentUriDir (PpssppActivity.java:275)

@hrydgard
Owner Author

hrydgard commented Jan 20, 2024

#00  pc 0x000000000059c290  arm64/libppsspp_jni.so (SceKernelVplHeader::Allocate(unsigned int)+104) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#01  pc 0x0000000000597d74  arm64/libppsspp_jni.so (__KernelAllocateVpl(int, unsigned int, unsigned int, unsigned int&, bool, char const*)) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#02  pc 0x0000000000598290  arm64/libppsspp_jni.so (sceKernelTryAllocateVpl(int, unsigned int, unsigned int)+44) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#03  pc 0x0000000000588634  arm64/libppsspp_jni.so (void WrapI_IUU<&sceKernelTryAllocateVpl(int, unsigned int, unsigned int)>()+32) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
#04  pc 0x0000000000544208  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

This crashes here:

	PSPPointer<SceKernelVplBlock> SplitBlock(PSPPointer<SceKernelVplBlock> b, u32 allocBlocks) {
		u32 prev = b.ptr;
		b->sizeInBlocks -= allocBlocks;

		b += b->sizeInBlocks;
		b->sizeInBlocks = allocBlocks;   // << CRASH HERE
		b->next = prev;

		return b;
	}

Suspicious... Probably the block header got corrupted.
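If the header is corrupted, `b->sizeInBlocks` can move `b` outside the pool, and the next write then faults. The following is a minimal sketch of a validity check before the split, using a simplified model where the pool is a `std::vector` of headers indexed in block units. `BlockHeader` and `SplitBlockChecked` are hypothetical and do not reflect the real PSP memory layout.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a VPL block header living inside emulated memory.
// Offsets and sizes are counted in blocks; this is not the real PSP layout code.
struct BlockHeader {
	uint32_t next;          // Index of the next block header.
	uint32_t sizeInBlocks;  // Size of this block, including the header.
};

// Splits `allocBlocks` off the end of the block at `index`.
// Returns the index of the new block, or -1 if the header looks corrupted.
int SplitBlockChecked(std::vector<BlockHeader> &heap, uint32_t index, uint32_t allocBlocks) {
	if (index >= heap.size())
		return -1;
	BlockHeader &b = heap[index];
	// The block must be larger than the request and must not extend past the heap.
	// The last comparison cannot underflow because index < heap.size().
	if (allocBlocks == 0 || b.sizeInBlocks <= allocBlocks || b.sizeInBlocks > heap.size() - index)
		return -1;
	b.sizeInBlocks -= allocBlocks;
	uint32_t newIndex = index + b.sizeInBlocks;
	heap[newIndex].sizeInBlocks = allocBlocks;
	heap[newIndex].next = index;
	return (int)newIndex;
}
```

In the emulator the equivalent test would be whether the computed address is still valid emulated memory before writing the new header.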

Another:

  #00  pc 0x000000000070fab8  arm64/libppsspp_jni.so (TextureCacheCommon::InvalidateAll(GPUInvalidationType)+124) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #01  pc 0x00000000007409cc  arm64/libppsspp_jni.so (GPUCommonHW::InvalidateCache(unsigned int, int, GPUInvalidationType)+72) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #02  pc 0x0000000000586994  arm64/libppsspp_jni.so (sceKernelDcacheWritebackAll()+40) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #03  pc 0x000000000054c2ec  arm64/libppsspp_jni.so (void WrapI_V<&sceKernelDcacheWritebackAll()>()+8) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)
  #04  pc 0x0000000000544208  arm64/libppsspp_jni.so (CallSyscallWithoutFlags(HLEFunction const*)+52) (BuildId: bd9158ed79e7cb9fffa0fc67e21ea41d9fc28569)

Crash here:

	for (TexCache::iterator iter = cache_.begin(), end = cache_.end(); iter != end; ++iter) {
		if (iter->second->GetHashStatus() == TexCacheEntry::STATUS_RELIABLE) {
			iter->second->SetHashStatus(TexCacheEntry::STATUS_HASHING);
		}
		iter->second->invalidHint++;
	}

@hrydgard
Owner Author

hrydgard commented Jan 22, 2024

Crash in libpng, not good. libpng17 seems unmaintained, no updates since 2017 :(

This is at libpng17/pngread.c:1312.

backtrace:
  #00  pc 0x000000000004eed4  /apex/com.android.runtime/lib64/bionic/libc.so (__memcpy+292)
  #01  pc 0x0000000000e06488  arm64/libppsspp_jni.so (png_image_memory_read) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #02  pc 0x0000000000e1230c  arm64/libppsspp_jni.so (png_crc_read+36) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #03  pc 0x0000000000e03720  arm64/libppsspp_jni.so (png_read_IDAT) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #04  pc 0x0000000000e03578  arm64/libppsspp_jni.so (png_read_row+252) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #05  pc 0x0000000000e06334  arm64/libppsspp_jni.so (png_image_read_direct) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #06  pc 0x0000000000e02920  arm64/libppsspp_jni.so (png_safe_execute+112) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #07  pc 0x0000000000e046b8  arm64/libppsspp_jni.so (png_image_finish_read+352) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #08  pc 0x00000000007f99b4  arm64/libppsspp_jni.so (pngLoadPtr(unsigned char const*, unsigned long, int*, int*, unsigned char**)+168) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #09  pc 0x000000000086dc1c  arm64/libppsspp_jni.so (TempImage::LoadTextureLevelsFromFileData(unsigned char const*, unsigned long, ImageFileType)+460) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #10  pc 0x000000000086dfb0  arm64/libppsspp_jni.so (CreateTextureFromFileData(Draw::DrawContext*, unsigned char const*, unsigned long, ImageFileType, bool, char const*)+128) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #11  pc 0x00000000008a1580  arm64/libppsspp_jni.so (GameInfoCache::SetupTexture(std::__ndk1::shared_ptr<GameInfo>&, Draw::DrawContext*, GameInfoTex&)+188) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #12  pc 0x00000000008a1168  arm64/libppsspp_jni.so (GameInfoCache::GetInfo(Draw::DrawContext*, Path const&, int)+272) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #13  pc 0x00000000008b10d4  arm64/libppsspp_jni.so (MainScreen::DrawBackgroundFor(UIContext&, Path const&, float)+104) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #14  pc 0x00000000008b0fe0  arm64/libppsspp_jni.so (MainScreen::DrawBackground(UIContext&)+108) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #15  pc 0x0000000000dcd678  arm64/libppsspp_jni.so (UIScreen::render(ScreenRenderMode)+300) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #16  pc 0x0000000000dcbffc  arm64/libppsspp_jni.so (ScreenManager::render()+732) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #17  pc 0x000000000087c718  arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+852) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #18  pc 0x0000000000873b10  arm64/libppsspp_jni.so (UpdateRunLoopAndroid(_JNIEnv*)+36) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #19  pc 0x0000000000876fb4  arm64/libppsspp_jni.so (EmuThreadFunc()) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)
  #20  pc 0x00000000004d650c  arm64/libppsspp_jni.so (void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct>>, void (*)()>>(void*)+44) (BuildId: d7e251231db61eedf91ca53c301ab3e78736046b)

@hrydgard hrydgard modified the milestones: v1.17.0, 1.17.1 Jan 27, 2024
@sum2012
Collaborator

sum2012 commented Jan 28, 2024

Maybe switch to another PNG library?
https://github.com/pnggroup/libpng

@hrydgard
Owner Author

Yes, I'm thinking of trying spng instead.

https://libspng.org/

@hrydgard
Owner Author

hrydgard commented Jan 28, 2024

Although, I now think it's really due to a data loading race condition in GameInfoCache... Ugh.

I tried spng, and it's pretty nice; it just lacks the ability to specify a byte stride in encode/decode. I'll probably switch to it later anyway since it's faster, but not for the 1.17 series.
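When a decoder can only write tightly packed rows, one generic workaround is to decode into a temporary buffer and copy it row by row into the strided destination. Below is a minimal sketch of that copy step. It does not use any spng API, and `CopyRowsToStrided` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `height` rows of `rowBytes` bytes each from a tightly packed source image into
// a destination whose rows start `dstStride` bytes apart. Bytes in the padding between
// destination rows are left untouched.
void CopyRowsToStrided(uint8_t *dst, size_t dstStride, const uint8_t *src, size_t rowBytes, size_t height) {
	assert(dstStride >= rowBytes);
	for (size_t y = 0; y < height; y++) {
		memcpy(dst + y * dstStride, src + y * rowBytes, rowBytes);
	}
}
```

The cost is one extra full-image copy plus the temporary buffer, which is the overhead that native stride support in the decoder would avoid.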

@hrydgard
Owner Author

Pretty sure I've solved the png loading crash now.

Here's a savestate load-from-rewind problem, hm:

  #00  pc 0x0000000000690f54  arm64/libppsspp_jni.so (BlockAllocator::Free(unsigned int)+24) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #01  pc 0x000000000059b37c  arm64/libppsspp_jni.so (PartitionMemoryBlock::~PartitionMemoryBlock()+48) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #02  pc 0x0000000000585fb8  arm64/libppsspp_jni.so (KernelObjectPool::DoState(PointerWrap&)+212) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #03  pc 0x0000000000585c58  arm64/libppsspp_jni.so (__KernelDoState(PointerWrap&)+108) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #04  pc 0x0000000000674f78  arm64/libppsspp_jni.so (SaveState::SaveStart::DoState(PointerWrap&)+572) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #05  pc 0x00000000006747f4  arm64/libppsspp_jni.so (CChunkFileReader::Error CChunkFileReader::LoadPtr<SaveState::SaveStart>(unsigned char*, SaveState::SaveStart&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>>*)+88) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #06  pc 0x000000000067975c  arm64/libppsspp_jni.so (SaveState::StateRingbuffer::Restore(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>>*)+200) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #07  pc 0x000000000067960c  arm64/libppsspp_jni.so (SaveState::HandleLoadFailure()+100) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #08  pc 0x0000000000679f9c  arm64/libppsspp_jni.so (SaveState::Process()+1244) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #09  pc 0x000000000067fbc4  arm64/libppsspp_jni.so (PSP_RunLoopWhileState()+148) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)
  #10  pc 0x00000000008a16ec  arm64/libppsspp_jni.so (EmuScreen::render(ScreenRenderMode)+1056) (BuildId: 7918a182fa2ad427379f7705c2f0bcb662ac20f4)

@hrydgard hrydgard modified the milestones: 1.17.1, v1.18.0, 1.17.2 Feb 4, 2024
@hrydgard hrydgard closed this as completed Apr 9, 2024
@hrydgard hrydgard removed this from the 1.17.2 milestone Apr 9, 2024