Building and compiling C code with rustc/cargo #1850

Closed
thoughtpolice opened this issue Feb 16, 2012 · 12 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@thoughtpolice
Contributor

I discussed this in #1555. Basically, there are often times when you want to bundle some C code with a package, either because it's a binding and the source is portable, you're writing a shim for say, a C++ library, or just any transient piece of glue you want to distribute (say something that interfaces with rt etc.)

Cargo doesn't support this right now, as it only compiles crates in the source directory from what I can tell. rustc does not understand .c source files. Native modules work by mapping to shared libraries, but sometimes this is overkill (because the source is small), and other times it's clunky (sundown and other libraries may be intended to be included inline, so it's more work for a user who has to do it out of band anyway). Sometimes these shims are even required: you need a stable name for the linker to resolve, because the link-time function name is hidden behind a macro. This sort of wrapping occurs in many APIs.

Ideally, for something like #1555, if we could compile C/C++ code with a crate one way or another, a sundown library could be registered with cargo-central with its source inline (it's only 3 files), and rustdoc could be split off and put in cargo-central too. I think this is a nice way to go, especially because rustdoc won't depend on sundown somehow being installed, but this is just one use case. It would also make it possible for me to reuse code and easily write bindings to other libraries (like LevelDB and NaCl) that I've done in the past.

There are two ways of accomplishing this broadly speaking, both with merits, I think:

  1. Make rustc be able to just accept c files to compile and run through the linker, along with crates. Alternatively this could be specified with crate attributes or something, listing the files to include in the build.
  2. Make cargo actually take the responsibility of building the C files - this avoids complexity in the frontend but cargo may have to do a lot more. I think this does have merit from the POV of getting complexity out of the frontend.

However, I think 1 is the way to go and has the most bang for your buck, since rustc is already doing the compilation, and can give flags that may be necessary to the C compiler (like say, -fPIC or not.) This does keep cargo simpler. It also makes it really easy for users: rustc leveldb or somesuch, for example, where there may be an #cfiles[(...)] attribute in the crate. You don't need cargo just to build your project, either.

This does mean there may need to be more syntax for accommodating native methods that resolve to a link time symbol, not a full blown library. I don't know what the syntax would look like, I'm open to suggestions from others here.

@brson
Contributor

brson commented Feb 16, 2012

I have also wanted something like this on occasion. It would certainly make bindings that require 'just a little bit' of C easier.

My inclination is to add an attribute like #[external(c = "file")] that could be interpreted by either cargo or a hypothetical rustc plugin as 'compile this file and add the appropriate link_args attribute to the crate'. That would have fairly minimal impact on the compiler.
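As a rough sketch of what a tool interpreting such an attribute might do, written in present-day Rust: the `External` struct and the `compile_command` and `link_arg` functions are hypothetical names for illustration, not anything rustc or cargo provides. It assumes a Unix-style `cc` that accepts `-c`, `-o` and `-fPIC`, and it only builds the argument list rather than running the compiler.

```rust
use std::path::Path;

/// Hypothetical in-memory form of `#[external(c = "file")]`.
struct External {
    c_src: String,
    cc_opts: Vec<String>,
}

/// Object file the C source would compile to: same path, `.o` extension.
fn object_path(src: &str) -> String {
    Path::new(src).with_extension("o").display().to_string()
}

/// Argument vector for compiling one C file to an object file.
/// `pic` controls whether `-fPIC` is passed (e.g. for a shared library).
fn compile_command(ext: &External, pic: bool) -> Vec<String> {
    let mut args = vec!["cc".to_string(), "-c".to_string()];
    if pic {
        args.push("-fPIC".to_string());
    }
    args.extend(ext.cc_opts.iter().cloned());
    args.push(ext.c_src.clone());
    args.push("-o".to_string());
    args.push(object_path(&ext.c_src));
    args
}

/// The value that would be appended to the crate's link_args.
fn link_arg(ext: &External) -> String {
    object_path(&ext.c_src)
}

fn main() {
    let ext = External {
        c_src: "foo_glue.c".to_string(),
        cc_opts: vec![],
    };
    println!("{}", compile_command(&ext, true).join(" "));
    println!("link_args += {}", link_arg(&ext));
}
```

Keeping the compile step and the link argument as two separate values mirrors the two halves of the suggestion: something has to build the object file, and the crate then only needs to know what to hand the linker.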

@thoughtpolice
Contributor Author

Does rustc support plugins at the moment? Or do you just mean a separate tool built on the rustc crate that will process the attributes? I think the idea of an #[external] attribute is nice, but if compiler plugins are optional, I feel this should still be part of the rustc frontend by default, since it's likely something people would want regularly. Alternatively there should be a way to specify 'default compiler plugins', and I would petition for it to be included and always loaded if this is the case, but that's outside the scope of this ticket.

There are other considerations now that I'm awake and more coherent:

  • Sometimes you want a little more sophisticated build system than just the crate names, anyway. This is especially important with integrating C code. Ruby's gems basically allow extensions etc. to invoke semi-arbitrary build rules for, say, a library (run makefiles, configure, stuff like that), followed by the regular 'install that gem and also this .a too' or whatever.

    This sort of responsibility should certainly lie with cargo, at least in theory; Cabal for Haskell, for example, will invoke a custom configure script during the build preparation (if you tell it) to check for things like headers, etc (can be autoconf or hand written,) and you can even make it invoke a totally arbitrary Makefile that does whatever build you want at the build stage. rustc should not be a build system.

  • The preprocessor and related stuff in terms of options. It's the worst thing ever basically, I know, but many times C files may require certain defines to be passed to them by the compiler. I think this can be handled with a combination of #[cfg] and the hypothetical #[external], i.e.

#[cfg(target_os = "win32")]
#[external(cc_opts = "-DPLATFORM_WIN32")]

etc. I have needed this in my LevelDB bindings - the source can be trivially compiled and thrown into just about any build system, but it does need some configuration logic depending on the platform it's compiled for.
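The per-platform define selection can be sketched in present-day Rust roughly as follows. `platform_define` is a hypothetical helper; the OS strings are the values of `std::env::consts::OS` in current Rust (which spells Windows as "windows" rather than "win32"), and the `-DPLATFORM_*` flags are the ones mentioned in this thread for LevelDB.

```rust
/// Map a target OS name to the platform define the C code expects.
/// Returns None for platforms this sketch does not know about.
fn platform_define(target_os: &str) -> Option<&'static str> {
    match target_os {
        "windows" => Some("-DPLATFORM_WIN32"),
        "macos" => Some("-DPLATFORM_OSX"),
        "linux" => Some("-DPLATFORM_LINUX"),
        "freebsd" => Some("-DPLATFORM_FREEBSD"),
        _ => None,
    }
}

fn main() {
    // std::env::consts::OS names the OS this program was compiled for.
    match platform_define(std::env::consts::OS) {
        Some(flag) => println!("cc_opts += {}", flag),
        None => println!("no platform define known for {}", std::env::consts::OS),
    }
}
```

An attribute-based version would express the same table declaratively, one #[cfg] per platform; the function form just makes the mapping easy to test.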

LevelDB actually captures this issue rather well, as it touches almost all of the points I'm thinking of. I wrapped it into a Cabal build rather easily, and it is 'complete' in that the Cabal build system is on par with the included one the authors wrote (it's really simple itself). Basically:

  • LevelDB is very portable for the most part, as the .cpp source can basically just be thrown at g++/clang++ with a few preprocessor defines and you're good to go. This is nice because you can work it into almost any build system inline, as opposed to a prerequisite installation step. I just tell cabal to build all the .cpp files and include them inline.
  • The LevelDB source does require you to pass a #define specifying the build platform, so you need to compile the sources with -DPLATFORM_OSX or -DPLATFORM_LINUX or -DPLATFORM_FREEBSD. This has to always be specified. This could be handled just by #[cfg] though as I said above (theoretically.)
  • LevelDB optionally has support for compression via snappy. I handle this by using an autoconf script to look for -lsnappy and snappy.h at configure time, and if it exists, I tell Cabal to pass -DSNAPPY into the build of the CPP code, and also tell Cabal to build all the packages that depend on my library to link with -lsnappy, since they now need it at link time. link_args in the crate/module metadata already handles this, but I don't know how to persist that info between the configure logic and the rustc build logic.
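The snappy case above boils down to a configure-time check that produces two sets of flags: one for compiling the bundled C++ and one for everything that links against the result. A minimal sketch in present-day Rust, where `BuildConfig`, `configure_snappy` and `header_exists` are hypothetical names and the include directories are assumed examples; it only looks for the header, whereas the autoconf script described above also checks for the library itself.

```rust
use std::path::Path;

/// Extra flags produced by a configure-style check.
#[derive(Debug, Default, PartialEq)]
struct BuildConfig {
    /// Passed when compiling the bundled C++ sources.
    cc_opts: Vec<String>,
    /// Needed by everything that links against the library.
    link_args: Vec<String>,
}

/// Enable snappy support only when the check succeeded.
fn configure_snappy(found: bool) -> BuildConfig {
    if found {
        BuildConfig {
            cc_opts: vec!["-DSNAPPY".into()],
            link_args: vec!["-lsnappy".into()],
        }
    } else {
        BuildConfig::default()
    }
}

/// Look for a header file in a list of include directories.
fn header_exists(include_dirs: &[&str], header: &str) -> bool {
    include_dirs
        .iter()
        .any(|dir| Path::new(dir).join(header).is_file())
}

fn main() {
    // Assumed search path; a real check would ask the C compiler instead.
    let found = header_exists(&["/usr/include", "/usr/local/include"], "snappy.h");
    let config = configure_snappy(found);
    println!("{:?}", config);
}
```

The open question from the last bullet is where `link_args` gets stored so that dependent crates pick it up; the sketch only shows that the configure step already knows the answer.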

Overall these kinds of requirements will invariably appear as people begin integrating more libraries and bindings into Rust code - I'm definitely at a dead end on binding a few libs because of it.

Personally I almost always attempt to distribute these kinds of packages as standalone as possible in this manner, by working them into the toolchain instead of external installs, providing shims/glue, because as the library author, I wish to pay this effort up front, as opposed to paying every time you use the library - complicated build instructions fall into this space. It makes the user experience just much, much better.

I realize this may feel like it's already going well beyond the initial scope of the ticket, but these are very important issues people are going to run into soon, I speculate, so discussion will happen one way or another. :)

Overall I think a good initial braindump on the whole scheme consists of:

  • Allow rustc to compile C/CPP files, either by the command line, or a separate #[external] attribute. I like the attribute approach since you can specify other options to the C compiler that way, too. With #[cfg] I can probably get pretty far this way, passing necessary #defines etc. based on OS (which is fairly common).

  • Allow rust users to specify native functions that point to link-time symbols, outside of shared libraries. Perhaps a native fn with an attribute such as external with something like link_name? Then you could say:

    #[external(c_src = "foo_glue.c", link_name = "foo_bar")]
    fn c_foo_bar(baz: *u8, quux: uint)
    
  • Cargo should allow some flexibility for a build, besides just a crate-by-crate run over the source. Invariably some builds are going to need configure time logic (like snappy.h detection) and trying to build all the variations of this with #cfg or having to write rust plugins for it (so you could further parse attributes and keep the rust compiler minimal) is going to be a hellish nightmare. It'd just be easier to tell cargo 'run this set of commands in the source dir e.g. ./configure; make and err if they fail, then build this crate, and install the crate followed by this arbitrary libfoobar.a which we will link later.' I'm not sure of the best way to approach this.
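For comparison with the second bullet, present-day Rust can bind a Rust-side name to a differently named link-time symbol with the `link_name` attribute on a foreign function declaration. The sketch below is only an illustration of that mechanism, not the syntax proposed in this thread: it binds the hypothetical name `c_strlen` to the C library's `strlen`, chosen because that symbol is already linked into ordinary Rust programs, so no bundled C file is needed. It assumes Rust 1.82 or newer for the `unsafe extern` spelling.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

unsafe extern "C" {
    // Bind the Rust name `c_strlen` to the link-time symbol `strlen`.
    #[link_name = "strlen"]
    fn c_strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: convert to a NUL-terminated string and call through the FFI.
fn length(s: &str) -> usize {
    let c = CString::new(s).expect("string must not contain interior NUL bytes");
    // SAFETY: `c` is a valid NUL-terminated string that outlives the call.
    unsafe { c_strlen(c.as_ptr()) }
}

fn main() {
    println!("{}", length("foo_bar")); // prints 7
}
```

A symbol coming from bundled glue code would be declared the same way; the difference is only in how the object file containing it reaches the linker.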

Relating to the last point, I feel Cargo should just be a package manager: it installs things. It should be good at that. But it should let people have build systems that go beyond crate-level logic, because while Rust has a nice compilation model with logic for this built into the crates/sources themselves, the rest of the world doesn't. Almost every package that's based on a system-wide library like zmq or mongrel2 (for simple examples) can probably just get away with a native mod, and things will work beautifully, but glue code and more complex builds are very common as well, especially in libraries.
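The "run this set of commands in the source dir and err if they fail" behaviour could look roughly like this in present-day Rust. `run_steps` is a hypothetical helper, not a cargo API, and the sketch assumes a Unix-like system with `sh` on the PATH.

```rust
use std::process::Command;

/// Run each shell command in `dir`, in order, stopping at the first failure.
fn run_steps(dir: &str, steps: &[&str]) -> Result<(), String> {
    for step in steps {
        let status = Command::new("sh")
            .arg("-c")
            .arg(step)
            .current_dir(dir)
            .status()
            .map_err(|e| format!("could not start `{}`: {}", step, e))?;
        if !status.success() {
            return Err(format!("`{}` failed with {}", step, status));
        }
    }
    Ok(())
}

fn main() {
    // Stand-ins for something like `./configure` followed by `make`.
    match run_steps(".", &["true", "echo building"]) {
        Ok(()) => println!("all steps succeeded; the crate build would run next"),
        Err(e) => eprintln!("build preparation failed: {}", e),
    }
}
```

Only after every step succeeds would the package manager go on to build the crate and install it together with whatever static library the steps produced.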

Sorry for the length, but this is a semi complicated issue with many facets. I am interested in looking into and implementing it, as it will be crucial to many packages I think, but it crosses between rustc and cargo, brings into question their responsibilities, and potentially requires lots of careful decision making, so I would appreciate what other Rust users (and developers!) think.

@brson
Contributor

brson commented Feb 17, 2012

rustc does not support plugins yet, but many of the things it does (#fmt, config, testing) should, in my opinion, be done by plugins (in this case default plugins that are always guaranteed to be there)

There's a ticket open about creating some mechanism to set crate attributes via the build system: #612 . There already exists a limited way to set the configuration on the command line.

I believe native modules do not strictly need to refer to shared libraries. There is some way to use the #[nolink] attribute, then link in a static library, and have it work. It may need to be more ergonomic, but all the basic mechanisms to make it work should already exist.

I like a lot of the ideas you have here, and it sounds like you have a good grasp on some of the practical problems that actual users are going to run into, so I am totally supportive of any improvements you want to make in this area. Let's break these down into smaller, discrete issues that can be tackled independently.

You will also want to get feedback from @graydon because he has some opinions on these subjects.

@thoughtpolice
Contributor Author

OK. I will split this into multiple bugs referencing this issue, hopefully summarizing the ideas separately, and basic modes of attack.

@thoughtpolice
Contributor Author

Also, the stuff in #612 looks exactly like what would be needed to give rustc configuration logic of the kind I imagine, (specifying link_args etc.)

@graydon
Contributor

graydon commented Feb 17, 2012

There's a lot here but I like the direction, am happy to see people experimenting in the design space. Something along these lines will be necessary, so go for it.

@brson
Contributor

brson commented Mar 7, 2012

Building C code from cargo or rustc was discussed in this week's planning meeting so it appears to be something of a priority.

@thoughtpolice
Contributor Author

Awesome! Are there any notes from the meeting on what people brought up? I noticed there was a meetings section on the wiki with weekly notes, but I'm not sure if it's been updated yet.

I'm still willing to push this issue forward and do work on it, but at the moment I've been a little out of the loop as I just got a new job and moved, so you'll have to forgive me while I adjust. I'll hopefully be on IRC more soon and be able to respond to Graydon's comments he made in #1861 as well.

@brson
Contributor

brson commented Mar 8, 2012

@z0w0
Contributor

z0w0 commented Mar 14, 2013

I'm going to close this because interfacing with autotools and $CC will be built into rustpkg, and the general consensus is to eventually integrate bindgen with rustc directly for automatically generating FFI Rust code inline for C header files. If anyone thinks that there are more use cases that aren't covered, feel free to reopen.

@z0w0 z0w0 closed this as completed Mar 14, 2013
@graydon
Contributor

graydon commented Mar 14, 2013

I assumed this bug would continue to serve the tracking role for those features. Is there a better one?

@z0w0
Contributor

z0w0 commented Mar 14, 2013

This is a pretty old bug and is hence a bit vague on the details of how it would be implemented. #2124 has the latest idea for how the bindgen interface would work and #2805 seems to be relevant to the plan of making rustpkg build foreign things.

4 participants