Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash compiling code that just require xml without using it. #10435

Open
hugopl opened this issue Feb 22, 2021 · 18 comments
Open

Crash compiling code that just require xml without using it. #10435

hugopl opened this issue Feb 22, 2021 · 18 comments

Comments

@hugopl
Copy link
Contributor

hugopl commented Feb 22, 2021

The following code crashes with the following stacktrace:

require "xml"

@[Link("libvirt")]
lib LibVirt
  alias ConnectPtr = Void*

  fun connect_open = virConnectOpen(name : LibC::Char*) : ConnectPtr
  fun connect_close = virConnectClose(conn : ConnectPtr) : Int32
end

ptr = LibVirt.connect_open("qemu:///system")
LibVirt.connect_close(ptr)
#0  0x00007ffff758cef5 in raise () at /usr/lib/libc.so.6
#1  0x00007ffff7576862 in abort () at /usr/lib/libc.so.6
#2  0x00007ffff75cef38 in __libc_message () at /usr/lib/libc.so.6
#3  0x00007ffff75d6bea in  () at /usr/lib/libc.so.6
#4  0x00007ffff75d7fbc in _int_free () at /usr/lib/libc.so.6
#5  0x00007ffff75dbca8 in free () at /usr/lib/libc.so.6
#6  0x00007ffff7d7793b in  () at /usr/lib/libvirt.so.0
#7  0x00007ffff7d7c8a0 in  () at /usr/lib/libvirt.so.0
#8  0x00007ffff7e13155 in  () at /usr/lib/libvirt.so.0
#9  0x00007ffff7e138f3 in virConnectOpen () at /usr/lib/libvirt.so.0
#10 0x000055555558dc29 in __crystal_main () at /home/hugo/src/xmlbug/src/xmlbug.cr:11
#11 0x0000555555606176 in main_user_code () at /usr/lib/crystal/crystal/main.cr:110
#12 0x0000555555605fec in main () at /usr/lib/crystal/crystal/main.cr:96
#13 0x00005555555990a6 in main () at /usr/lib/crystal/crystal/main.cr:119

If I just comment the first line, require "xml", the code works correctly and no crash happens. The weird thing is that no XML code is explicit used by the example.

Tested with:

Crystal 0.36.1 (2021-02-03)

LLVM: 10.0.1
Default target: x86_64-pc-linux-gnu

libvirt 7.0.0-3 from ArchLinux package.
@bcardiff
Copy link
Member

Requiring xml is enough to use it. It has some main initialization. Check references to LibXML.xmlGcMemSetup to find it.

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

I guess I found the issue... libvirt also uses libxml2, so I think the library is getting double initied or something.

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

yep, I reduced it to:

require "xml"

{% if compare_versions(Crystal::VERSION, "0.35.0-0") >= 0 %}
  @[Link("xml2", pkg_config: "libxml-2.0")]
{% else %}
  @[Link("xml2")]
{% end %}
lib LibXML
  fun xmlGcMemSetup(free_func : Void* ->,
                    malloc_func : LibC::SizeT -> Void*,
                    malloc_atomic_func : LibC::SizeT -> Void*,
                    realloc_func : Void*, LibC::SizeT -> Void*,
                    strdup_func : UInt8* -> UInt8*) : Int
end

LibXML.xmlGcMemSetup(
  ->GC.free,
  ->GC.malloc(LibC::SizeT),
  ->GC.malloc(LibC::SizeT),
  ->GC.realloc(Void*, LibC::SizeT),
  ->(str) {
    len = LibC.strlen(str) + 1
    copy = Pointer(UInt8).malloc(len)
    copy.copy_from(str, len)
    copy
  }
)

 @[Link("libvirt")]
 lib LibVirt
   alias ConnectPtr = Void*

   fun connect_open = virConnectOpen(name : LibC::Char*) : ConnectPtr
   fun connect_close = virConnectClose(conn : ConnectPtr) : Int32
 end

 ptr = LibVirt.connect_open("qemu:///system")
 LibVirt.connect_close(ptr)

So... it's not a Crystal bug, but IMO it deservers a mention on XML module documentation.

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

I was able to use libvirt with Crystal libxml2 bindings by doing this workaround:

require "xml"

@[Link("libvirt")]
lib LibVirt
  alias ConnectPtr = Void*

  fun connect_open = virConnectOpen(name : LibC::Char*) : ConnectPtr
  fun connect_close = virConnectClose(conn : ConnectPtr) : Int32
end

def no_xmlgc
  LibXML.xmlGcMemSetup(->LibC.free,
    ->LibC.malloc(LibC::SizeT),
    ->LibC.malloc(LibC::SizeT),
    ->LibC.realloc(Void*, LibC::SizeT),
    ->(str) {
      len = LibC.strlen(str) + 1
      copy = Pointer(UInt8).malloc(len)
      copy.copy_from(str, len)
      copy
    })
  yield
  LibXML.xmlGcMemSetup(->GC.free,
    ->GC.malloc(LibC::SizeT),
    ->GC.malloc(LibC::SizeT),
    ->GC.realloc(Void*, LibC::SizeT),
    ->(str) {
      len = LibC.strlen(str) + 1
      copy = Pointer(UInt8).malloc(len)
      copy.copy_from(str, len)
      copy
    })
end

no_xmlgc do
  ptr = LibVirt.connect_open("qemu:///system")
  pp! LibVirt.connect_close(ptr)
end

pp! XML.parse("<test />")

Works... but is bad, because on multiple threads I believe that this need to be protected by a mutex or the caos will reign.... there's also the little overhead of 2 function calls.

So I think the correct to do this is let Crystal XML bindings don't use the GC, but manage its memory on some finalize calls, otherwise the GC will once again block the use a C library with Crystal, anyone agree?

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

The use case were I found this was rewriting a small ruby JSON-RPC server used by the company I'm working to Crystal, the application uses libvirt and also threads, in Ruby it need to use multiple process sometimes (and it's a hell) to keep the server responsive during some slow libvirt operations, the same might happen on Crystal if using fibers in the same thread, anyway.... as the XML use I do is really simple, I can workaround it by calling LibXML.xmlNodeFree myself on Xml::Node like:

require "xml"

lib LibXML
  fun xmlFreeNode(cur : Node*) : Void
end

module XML
  struct Node
    def free
      LibXML.xmlFreeNode(@node)
    end
  end
end

@[Link("libvirt")]
lib LibVirt
  alias ConnectPtr = Void*

  fun connect_open = virConnectOpen(name : LibC::Char*) : ConnectPtr
  fun connect_close = virConnectClose(conn : ConnectPtr) : Int32
end

LibXML.xmlGcMemSetup(->LibC.free, ->LibC.malloc(LibC::SizeT), ->LibC.malloc(LibC::SizeT), ->LibC.realloc(Void*, LibC::SizeT), ->(str) {
  len = LibC.strlen(str) + 1
  copy = Pointer(UInt8).malloc(len)
  copy.copy_from(str, len)
  copy
})

ptr = LibVirt.connect_open("qemu:///system")
pp! LibVirt.connect_close(ptr)

node = XML.parse("<test />")
node.free

Doing so I noticed that the XML in stdlib classes are structs (probably for performance issues), not classes, so they don't have a finalize method and therefore my suggestion about let Crystal XML module manage the libXML2 memory can't be done without changing this.

@straight-shoota
Copy link
Member

The finalize hook would need to be on the libxml2 data structur (LibXML::Node in crystal). I suppose if you use Crystal's GC you could initialize that using LibGC.register_finalizer_ignore_self.

But the best solution would probably be to let libvirt use the libxml2 configuration from Crystal. Because the main program runs in Crystal's runtime, libraries should adopt to that, not the other way around.

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

I think the best solution you mentioned isn't feasible, since most libraries that use libxml2 have no way to configure that and don't use a GC, calling the libXML2 free functions themselves...

Even if we convince all projects using libXML2 to fill their projects with #ifdef to avoid call the libXML2 free functions this would not be so useful since no distro would package the library with a compile flag that make it leak memory if not used with a GC.

As I need the xml module in my application that uses libvirt, I'll end up with the worst solution... fork the xml module and apply these minor patches replacing struct by class and adding some finalize calls, at least the module isn't so big.

@straight-shoota
Copy link
Member

straight-shoota commented Feb 22, 2021

I just looked how this works in Ruby with nokogiri and libvirt-ruby. It seems nokogiri does not set up libxml2 to use the GC. I suppose they do the same as your patch and free libxml2 memory when the Ruby objects are garbage-collected.
That's probably a better solution in general than hooking up libxml2 with the GC directly. Most (or all?) other libc bindings in stdlib work like this, too.

So I suggest we should apply your patch.

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

There's no patch ready yet 😛, but I can write one. The time to do this is now... since the changes from struct -> class is a API breakage. But unless someone is doing something very weird the change is source-compatible and no good code should stop compiling.

@asterite
Copy link
Member

Why does LibXML.xmlFreeNode(@node) need to be called manually if xmlGcMemSetup is already set up to call LibC.free?

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

xmlGcMemSetup just tell what free function should be called by LibXML.xmlFreeNode when freeing the resources, so LibXML.xmlFreeNode will call LibC.free.

xmlGcMemSetup implementation just setting a pointer do xmlFree global.
https://github.com/GNOME/libxml2/blob/46837d47d59c7b8c9bd1d08a6a717a90a7f1ceb6/xmlmemory.c#L1107

xmlFreeNode, calling xmlFree
https://github.com/GNOME/libxml2/blob/01411e7c5ea0fff181271e092f46a2138c3720ec/tree.c#L3775

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

BTW, the fix isn't just change struct by class and add some finalizers... since XML::Node class holds a LibXML::Doc*, LibXML::Doc* or a LibXML::Attr*, so if the doc is deleted, the nodes will be deleted in cascade by libXML and futher xmlFreeNode calls will be invalid... the same way if a node is deleted then when I delete the doc I get a double free.

i.e. The fix isn't soooo simple as I expected.

@asterite
Copy link
Member

That's the beauty of being able to hook the libxml memory functions to a GC which takes care of all of this for us 😅

@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

One patch that could be in 1.0.0 is to keep the GC taking care of libXML as it is now and just change the structs to class, so a patch that deals with these memory deallocations can be landed later in a 1.0.1 whatever without break any API... because for sure some finalizers will be used in the solution. I mean, all this depending on time constraints about the 1.0.0 release date.

@bcardiff
Copy link
Member

I think the change from struct to class can happen now and later we can figure ways to setup the GC.

hugopl added a commit to hugopl/crystal that referenced this issue Feb 22, 2021
This will allow a future change to let the XML module manage the libXML2 memory,
fixing crystal-lang#10435 without breaking the API.
@hugopl
Copy link
Contributor Author

hugopl commented Feb 22, 2021

This MR was easy to do :-)

bcardiff pushed a commit that referenced this issue Mar 9, 2021
This will allow a future change to let the XML module manage the libXML2 memory,
fixing #10435 without breaking the API.
@paulocoghi
Copy link

@hugopl , were you able to use libvirt directly on Crystal, without Ruby?

@hugopl
Copy link
Contributor Author

hugopl commented Apr 1, 2022

Yes, it's possible to use, you just can't use/include the XML module from stdlib, since it register hooks to free the memory using the GC.. then when libvirt tries to also free the memory it causes a double free.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants