Attempting to read large file as string causes segfault #80533

Closed
abb128 opened this issue Aug 12, 2023 · 8 comments · Fixed by #86730

Comments

@abb128

abb128 commented Aug 12, 2023

Godot version

4.1.1.stable, master [4714e95]

System information

Godot v4.2.dev (4714e95) - Arch Linux #1 ZEN SMP PREEMPT_DYNAMIC Tue, 08 Aug 2023 22:13:48 +0000 - Wayland - Vulkan (Forward+) - dedicated AMD Radeon RX 5700 (RADV NAVI10) () - AMD Ryzen 7 5800X3D 8-Core Processor (16 Threads)

Issue description

I attempted to read a large JSON file (1GB+) using FileAccess.get_file_as_string, but the result is a crash. Here's a backtrace from a debug build [4714e95] and some info; I'm not sure what to make of it.

Loading...

Thread 1 "godot.linuxbsd." received signal SIGSEGV, Segmentation fault.
String::parse_utf8 (this=0x7fffffffa5f8, 
    p_utf8=0x7fff2ffff020 "[{\"p\":[-0.019984539598226547,0.3165023624897003,0.6493667960166931],\"c\":[0.15294925029263085,0.1603185939029318,0.11855615533237046],\"s\":[-5.519049167633057,-5.86762809753418,-4.488621234893799],\"q\":["..., p_len=1093019370, p_skip_cr=false) at core/string/ustring.cpp:1880
1880            dst[str_size] = 0;
(gdb) bt
#0  String::parse_utf8 (this=0x7fffffffa5f8, 
    p_utf8=0x7fff2ffff020 "[{\"p\":[-0.019984539598226547,0.3165023624897003,0.6493667960166931],\"c\":[0.15294925029263085,0.1603185939029318,0.11855615533237046],\"s\":[-5.519049167633057,-5.86762809753418,-4.488621234893799],\"q\":["..., p_len=1093019370, p_skip_cr=false) at core/string/ustring.cpp:1880
#1  0x000055555cae1c27 in FileAccess::get_file_as_string (p_path=..., r_error=0x0) at core/io/file_access.cpp:770
#2  0x000055555cae631e in FileAccess::_get_file_as_string (p_path=...) at core/io/file_access.h:227
#3  0x000055555caf36ae in call_with_variant_args_static_ret<String, String const&, 0ul> (p_method=0x55555cae62e7 <FileAccess::_get_file_as_string(String const&)>, p_args=0x7fffffffa720, r_ret=..., r_error=...)
    at ./core/variant/binder_common.h:766
#4  0x000055555caf0cbd in call_with_variant_args_static_ret_dv<String, String const&> (p_method=0x55555cae62e7 <FileAccess::_get_file_as_string(String const&)>, p_args=0x7fffffffa838, p_argcount=1, r_ret=..., r_error=..., 
    default_values=...) at ./core/variant/binder_common.h:989
#5  0x000055555caeec5a in MethodBindTRS<String, String const&>::call (this=0x5555616f8240, p_object=0x0, p_args=0x7fffffffa838, p_arg_count=1, r_error=...) at ./core/object/method_bind.h:719
#6  0x000055555870c3db in GDScriptFunction::call (this=0x555563c0aa10, p_instance=0x555563bfaba0, p_args=0x0, p_argcount=0, r_err=..., p_state=0x0) at modules/gdscript/gdscript_vm.cpp:1934
#7  0x000055555859a338 in GDScriptInstance::callp (this=0x555563bfaba0, p_method=..., p_args=0x0, p_argcount=0, r_error=...) at modules/gdscript/gdscript.cpp:1892
#8  0x000055555aaeec0c in Node::_gdvirtual__ready_call<false> (this=0x555563bf7930) at scene/main/node.h:322
#9  0x000055555aac7d65 in Node::_notification (this=0x555563bf7930, p_notification=13) at scene/main/node.cpp:186
#10 0x00005555582b77d4 in Node::_notificationv (this=0x555563bf7930, p_notification=13, p_reversed=false) at ./scene/main/node.h:49
#11 0x00005555582d7147 in Node3D::_notificationv (this=0x555563bf7930, p_notification=13, p_reversed=false) at ./scene/3d/node_3d.h:52
#12 0x000055555d092445 in Object::notification (this=0x555563bf7930, p_notification=13, p_reversed=false) at core/object/object.cpp:798
#13 0x000055555aac823e in Node::_propagate_ready (this=0x555563bf7930) at scene/main/node.cpp:230
#14 0x000055555aac81be in Node::_propagate_ready (this=0x555563b84780) at scene/main/node.cpp:221
#15 0x000055555aadae3e in Node::_set_tree (this=0x555563b84780, p_tree=0x555563b84280) at scene/main/node.cpp:2909
#16 0x000055555ab18704 in SceneTree::initialize (this=0x555563b84280) at scene/main/scene_tree.cpp:449
#17 0x0000555558035c2d in OS_LinuxBSD::run (this=0x7fffffffdbf0) at platform/linuxbsd/os_linuxbsd.cpp:900
#18 0x000055555802c274 in main (argc=3, argv=0x7fffffffe1b8) at platform/linuxbsd/godot_linuxbsd.cpp:74
(gdb) p dst
$1 = 0x7fff27ffe020 U""
(gdb) p str_size
$2 = 1093019370
(gdb) p dst[str_size]
Cannot access memory at address 0x80002c987bc8
(gdb) p _cowdata.size()
$3 = 1093019371

After some manual binary searching 0x7ffff7fff000 seems to pop up...

(gdb) p dst[872416247]
$35 = 0 U'\000'
(gdb) p dst[872416248]
Cannot access memory at address 0x7ffff7fff000
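
As a sanity check on the gdb output above, the failing addresses are consistent with `dst` being an array of 4-byte `char32_t` elements. The C++ sketch below only redoes the pointer arithmetic with the values printed in the session. The helper `element_address` and the constant names are made up for illustration and are not engine code, and the sketch assumes `sizeof(char32_t) == 4`.

```cpp
#include <cstdint>

// Hypothetical helper (not engine code): byte address of dst[index] for an
// array of UTF-32 code units starting at `base`.
constexpr std::uint64_t element_address(std::uint64_t base, std::uint64_t index) {
    return base + index * sizeof(char32_t);
}

// Values copied from the gdb session above.
constexpr std::uint64_t kDst = 0x7fff27ffe020ULL;
constexpr std::uint64_t kStrSize = 1093019370ULL;

// dst[str_size] is exactly the address gdb reported as inaccessible.
static_assert(element_address(kDst, kStrSize) == 0x80002c987bc8ULL,
        "matches the first 'Cannot access memory' address");

// The first unreadable element found by bisection is at a 4 KiB-aligned address.
static_assert(element_address(kDst, 872416248ULL) == 0x7ffff7fff000ULL,
        "matches the address found by binary search");
```

By this arithmetic, gdb could read about 3.25 GiB starting from `dst`, while 1093019371 elements of 4 bytes each would need about 4.07 GiB. The session alone does not show why the readable range is shorter.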

Steps to reproduce

extends Node3D

# Called when the node enters the scene tree for the first time.
func _ready():
	print("Loading...")
	var data = FileAccess.get_file_as_string("/path/to/out.json")
	print("Loaded")
	print(len(data))


# Called every frame. 'delta' is the elapsed time since the previous frame.
func _process(delta):
	pass

In case the specific file is needed it's available here (warning: it's big)

Minimal reproduction project

@abb128 abb128 changed the title Attempting to read large file causes segfault Attempting to read large file as string causes segfault Aug 12, 2023
@jsjtxietian
Contributor

jsjtxietian commented Aug 17, 2023

I reproduced it on my Windows machine, with the same backtrace. Godot v4.2.dev (7ba79d6) - Windows 10.0.19045 - Vulkan (Forward+) - dedicated NVIDIA GeForce RTX 3060 (NVIDIA; 31.0.15.3118) - 11th Gen Intel(R) Core(TM) i7-11700K @ 3.60GHz (16 Threads)

@bitsawer
Member

Probably caused by an int overflow with Vector<T> size or in parse_utf8(). Could be fixed by #74582, or it might at least bump the size limit a bit higher before crashing.

@abb128
Author

abb128 commented Aug 17, 2023

I tested with that PR and it does indeed fix the crash with that particular file. However, there is still a limit of 2147483632 bytes, and apparently a file one byte smaller still causes a segfault:

yes 'a' | tr -d '\n' | head -c 2147483631 > example.txt

Files bigger than 2^31-1 bytes will overflow and also fail to read, or read partially if the size overflows to a positive number.
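
The wrap-around described above can be illustrated with a small C++ sketch. It assumes the file length ends up in a 32-bit signed variable, as suggested in this thread. `low32` and `wrap_to_int32` are hypothetical helpers written for this example, not engine functions.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low 32 bits of a 64-bit length.
constexpr std::uint64_t low32(std::uint64_t value) {
    return value & 0xFFFFFFFFULL;
}

// Hypothetical helper: the value those low 32 bits represent when read as a
// two's-complement signed 32-bit integer (returned as int64 to stay well defined).
constexpr std::int64_t wrap_to_int32(std::uint64_t length) {
    return low32(length) >= 0x80000000ULL
            ? static_cast<std::int64_t>(low32(length)) - 0x100000000LL
            : static_cast<std::int64_t>(low32(length));
}

static_assert(wrap_to_int32(2147483647ULL) == 2147483647LL,
        "2^31-1 bytes still fits");
static_assert(wrap_to_int32(2147483648ULL) == -2147483648LL,
        "2^31 bytes wraps to a negative length");
static_assert(wrap_to_int32(4294967296ULL + 1000ULL) == 1000LL,
        "just past 2^32 bytes wraps to a small positive length");
```

Under that assumption, a negative length would make the read fail outright, while a small positive one would read only the first part of the file, which matches the partial reads mentioned above.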

@jsjtxietian
Copy link
Contributor

Yes, maybe we can at least update the documentation and do something to prevent it from crashing. Although unlikely, files larger than 2-4 GB will be needed in certain cases.

@bruvzg
Member

bruvzg commented Aug 17, 2023

String is UTF-32, so the data will be up to 4x larger after conversion (4x for ASCII text), and the variables used for size are signed, so the limit is 2G.
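
To put numbers on this, here is a rough C++ sketch of the size arithmetic, assuming 4 bytes per UTF-32 code unit and a 32-bit signed byte count. The element count is the `_cowdata.size()` value from the gdb session in the report. The function names are illustrative, and the sketch does not establish where exactly the engine overflows.

```cpp
#include <cstdint>
#include <limits>

// Bytes needed to store `elements` UTF-32 code units (assumes 4 bytes each).
constexpr std::uint64_t utf32_bytes(std::uint64_t elements) {
    return elements * sizeof(char32_t);
}

// Whether a byte count can be represented in a signed 32-bit integer.
constexpr bool fits_in_int32(std::uint64_t bytes) {
    return bytes <= static_cast<std::uint64_t>(
            std::numeric_limits<std::int32_t>::max());
}

// _cowdata.size() from the report: p_len (1093019370) plus the terminator.
constexpr std::uint64_t kElements = 1093019371ULL;

static_assert(utf32_bytes(kElements) == 4372077484ULL,
        "about 4.07 GiB of UTF-32 data for a file of about 1.02 GiB");
static_assert(!fits_in_int32(utf32_bytes(kElements)),
        "the byte count exceeds 2^31-1");
static_assert(fits_in_int32(utf32_bytes(536870911ULL)),
        "the largest element count whose byte size still fits");
```

The reported file is almost entirely ASCII (the element count equals the byte length plus one), so each input byte becomes one 4-byte element and the full 4x expansion applies.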

@akien-mga akien-mga added this to the 4.3 milestone Jan 19, 2024
@AThousandShips
Member

Can someone confirm this on latest? Should be fixed by:

@akien-mga
Member

I already did in #86730 (comment) :)

@AThousandShips
Member

Sorry, I didn't see that it said it was occurring without that PR 🙂
