diff --git a/README.md b/README.md index 682b2b86..4c18326e 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ Looking for a modern, powerful language centered on Ken Iverson's array programm - A simple, consistent, and [stable](commentary/stability.md) array programming language - A low-dependency C implementation using bytecode compilation: [installation](running.md) -- [System functions](spec/system.md) for math, files, and I/O (including a C FFI) +- [System functions](spec/system.md) for math, files, and I/O (including a [C FFI](doc/ffi.md)) - [Documentation](doc/README.md) with examples, visuals, explanations, and rationale for features - [Libraries](https://github.com/mlochbaum/bqn-libs) with interfaces for common file formats like JSON and CSV diff --git a/doc/README.md b/doc/README.md index 7976e7b9..49600103 100644 --- a/doc/README.md +++ b/doc/README.md @@ -83,5 +83,6 @@ Primitives: - [Windows: `↕`](windows.md) Environment: +- [Foreign Function Interface (FFI)](ffi.md) - [Embedded BQN](embed.md) - [ReBQN](rebqn.md) diff --git a/doc/ffi.md b/doc/ffi.md index 427e4cae..87d08b53 100644 --- a/doc/ffi.md +++ b/doc/ffi.md @@ -2,7 +2,7 @@ # Foreign Function Interface (FFI) -BQN's [foreign function interface](https://en.wikipedia.org/wiki/Foreign_function_interface) allows it to interface with libraries written in C, or other languages that support a compatible format. It's invoked with `•FFI`, which performs the necessary lookups and conversions to call a function from a dynamic shared library (extension .so in Unix-like systems and .dll in Windows). This function is [specified](../spec/system.md#foreign-function-interface-ffi) by BQN and implemented by CBQN—without support for all possible types, but enough for practical use. +BQN's [foreign function interface](https://en.wikipedia.org/wiki/Foreign_function_interface) allows it to interface with libraries written in C, or other languages that support a compatible format. It's invoked with `•FFI`, which performs the necessary lookups and conversions to call a function from a dynamic shared library (extension .so in Linux, .dylib in macOS, and .dll in Windows). This function is [specified](../spec/system.md#foreign-function-interface-ffi) by BQN and [implemented](https://github.com/dzaima/CBQN/blob/master/docs/system.md#ffi) by CBQN—without support for all possible types, but enough to cover practical use well. Warning: object code is unsafe by nature. The OS will hopefully prevent it from exceeding the privileges of the BQN interpreter, but anything code in the interpreter could do is fair game. It might crash, write files, corrupt BQN arrays resulting in "impossible" behavior later on, or read everything in your home directory and email it to someone. Be careful when invoking the FFI! @@ -53,14 +53,14 @@ With that compiled to `cyc.so`, we can call it using the following BQN code. The •Show CycC ≠⊸⋈ p # 3 •Show p # ⟨ 0 2 4 3 1 ⟩ -The cycles are 0, 124, and 3, so that checks out. But `cycles` modifies its argument `p`, with the line `p[j] = i`. Is the BQN list `p` changed when this happens? The check on the last line says no, which is good because BQN arrays are supposed to be immutable. What happens is that BQN copies the data into temporary memory in order to pass it in to C (here it also converts from 8 bits for each element to 32). +The cycles are 0, 124, and 3, so that checks out. But `cycles` modifies its argument `p`, with the line `p[j] = i`. Is the BQN list `p` changed when this happens? The check on the last line says no, which is good because BQN arrays are supposed to be immutable. What happens in this case is that CBQN stores `p` with 8 bits per element, so to widen it to 32 bits it copies to temporary memory. And in fact CBQN assumes the C function won't modify a `*`-typed argument, so to avoid issues in the case that `p` _does_ have 32-bit elements it should be passed in as `≠⊸⋈ •internal.Unshare p`. -But what if we want to see those modifications? Many C functions use pointer modification as a way to return multiple values, so the FFI supports this. With `&` instead of `*`, the values after calling the function are returned as an extra result. +But what if we want to see the modifications? Many C functions use pointer modification as a way to return multiple values, so the FFI supports this. With `&` instead of `*`, the values after calling the function are returned as an extra result (and the original argument is always copied to avoid changes to aliased values). cycles ← "cyc.so" •FFI "u32"‿"cycles"‿"u32"‿"&u32" •Show Cycles ≠⊸⋈ ⍋"cycle" # ⟨ 3 ⟨ 0 1 1 3 1 ⟩ ⟩ -The original result might not be wanted in this case. You can ignore it by using `""` for the result type, but then you get a 1-element list, for consistency with the case with multiple `&` arguments. Similar to `>` on a single argument, you can use `"&"` for the result to get a single mutated argument returned directly. +The original result might not be wanted in this case. You can ignore it (relying on `u32` not being stack-allocated!) by using `""` for the result type, but then you get a 1-element list, for consistency with the case with multiple `&` arguments. Similar to `>` on a single argument, you can use `"&"` for the result to get a single mutated argument returned directly. cycI ← "cyc.so" •FFI "&"‿"cycles"‿"u32"‿"&u32" •Show CycI ≠⊸⋈ ⍋"cycle" # ⟨ 0 1 1 3 1 ⟩ @@ -90,27 +90,29 @@ So what do all these C types mean in BQN? The FFI tries to define sensible conve On to implicit conversions, let's start with **numbers**. CBQN numbers are 64-bit floats, which are a superset of integer types up to 32 bits, and 32-bit floats. So conversions from all these types are exact with no problems. Converting to a smaller integer type requires the float value to fit in that type, while converting to a smaller float type rounds. -Issues show up with **64-bit integer** types because 64-bit floats only have full integer precision from `-2⋆53` to `2⋆53`. _Some_ integers beyond this range are representable, but others aren't: for example `1+2⋆53` rounds to `2⋆53`. For this reason numbers with absolute value `2⋆53` and greater error when converting between float and integer. Bare `u64` and `i64` types are fine when working with lengths and other things that can't reasonably be that large, but when all the bits are used they should be converted with `i64:u32` or similar. +Issues show up with **64-bit integer** types because 64-bit floats only have full integer precision from `-2⋆53` to `2⋆53`. _Some_ integers beyond this range are representable, but others aren't: for example `1+2⋆53` rounds to `2⋆53`. For this reason numbers with absolute value `2⋆53` and greater error when converting between float and integer. Bare `u64` and `i64` types are fine when working with lengths and other things that can't reasonably be that large, but when all the bits are used they should be converted with `u64:i32` or similar. An **array** or **struct** corresponds to a BQN list, easy enough. -A BQN list or pointer object can be converted to a **pointer**, and a C pointer is always converted to a pointer object—it can't be converted to a list because the length is unknown, but sometimes a mutable pointer is a convenient way to get a list from a C function. [Pointer objects](#pointer-objects) are BQN values designed specifically to encapsulate C pointers. If passed as an argument, its type needs to be compatible with the type for that argument, if it has one. When an argument has an _untyped_ pointer type `*` or `&`, it can't take a list as input (what would the elements be converted to?) but any pointer object will be accepted. +A BQN list or pointer object can be converted to a **pointer**, and a C pointer is always converted to a pointer object—it can't be converted to a list because the length is unknown, but sometimes a mutable pointer is a convenient way to get a list from a C function. [Pointer objects](#pointer-objects) are BQN values designed specifically to encapsulate C pointers. If passed as an argument, its type needs to be compatible with the type for that argument, if it has one. + +When an argument has an untyped pointer type `*` or `&`, it can't take a list as input (what would the elements be converted to?) but any pointer object will be accepted. Since an untyped pointer isn't a list of anything, it's treated more like a scalar value. In particular a type like `*:i32` converts a pointer back and forth from two integers. ### Explicit type conversion -Either the type as a whole, or any member of a struct, can have a conversion specification like `:u32` at the end. The type after the colon uses the same format as C numeric types, with the quality `c` allowed for characters in addition to `i`, `u`, and `f`. Here CBQN supports the types it uses internally for arrays: `u1` for booleans, `i8` to `i32` and `c8` to `c32`, and `f64`. +Either the type as a whole, or any member of a struct, can have a conversion specification like `:i32` at the end. The type after the colon uses the same format as C numeric types, with the quality `c` allowed for characters in addition to `i`, `u`, and `f`. Here CBQN supports the types it uses internally for arrays: `u1` for booleans, `i8` to `i32` and `c8` to `c32`, and `f64`. -With explicit conversion, each C value corresponds to a list of BQN values with the same bit representation. For example, you might use `u64:u1` to represent a 64-bit number as 64 bits (little-endian or least significant first), or `u64:c8` to represent it as 8 characters. Similarly, a pointer can be turned into plain bits and back with `*:u1`. The `:` also applies to compound values; another case is an argument such as `*i64:i32`, which will be cast from a BQN list of 32-bit ints to a C list of 64-bit ints that's half as long (with an error if the length wasn't even). And `&` can be used instead of `*` to get mutated values out, converting back to i32 on the way. Other compound cases have some complications and aren't supported in CBQN currently. +With explicit conversion, each C value corresponds to a list of BQN values with the same bit representation. For example, you might use `u64:u1` to represent a 64-bit number as 64 bits (little-endian or least significant first), or `u64:c8` to represent it as 8 characters. Similarly, a pointer can be turned into plain bits and back with `*:u1`. The `:` also applies to compound values; another case is an argument such as `*i64:i32`, which will be cast from a BQN list of 32-bit ints to a C list of 64-bit ints that's half as long (with an error if the length wasn't even). `**:i32` works similarly, assuming 64-bit pointers. The initial `*` can be replaced with `&` to get mutated values out, converting back to i32 on the way. Other compound cases have some complications and aren't supported in CBQN currently. ## Argument and result formats This section covers how FFI arguments and results are structured in BQN, and collects the ways to tweak it. -The normal case is that `𝕩` is a list of the arguments. You can pass in `𝕨` if you really want, but is has to be an empty list. No arguments have been specified as coming from `𝕨`! You can control where a C argument comes from by sticking `𝕨` or `𝕩` to the front (`𝕩` does nothing, it's already the default). So for example, arguments of `"i32"‿"𝕨i32"‿"𝕩i32"` mean the function has to be called like `⟨2⟩ Fn 3‿4`. Then if a C argument is the only one included in its BQN argument, it can have a `>` at the beginning (either before or after the `𝕨`/`𝕩`) meaning that the BQN argument should be that value directly instead of the 1-element list. +The normal case is that `𝕩` is a list of the arguments. You can pass in `𝕨` if you really want, but is has to be an empty list. No arguments have been specified as coming from `𝕨`! You can control where a C argument comes from by sticking `𝕨` or `𝕩` to the front (`𝕩` does nothing, it's already the default). So for example, arguments of `"i32"‿"𝕨i32"‿"𝕩i32"` mean the function has to be called like `⟨2⟩ Fn 1‿3`. Then if a C argument is the only one included in its BQN argument, it can have a `>` at the beginning (either before or after the `𝕨`/`𝕩`) meaning that the BQN argument should be that value directly instead of the 1-element list. -If there are no mutable arguments, the result is what it is. Unless it isn't: a result type of `""` results in a result value of `@`. +If there are no mutable arguments, the result is what it is. Unless it isn't: a void result type `""` results in a result value of `@`. -Mutable arguments (those with `&`) all have to be returned as part of the result. If there are any of these, the result is a list consisting of the C result followed by each mutable argument, mutatis mutandis, in the order they appeared in the arguments. Again, the result type can be `""` to leave it out, so that the result just includes mutable values. It can also be `"&"` if there's exactly one such value, which means it won't be returned as a list. +Mutable arguments (those with `&`) all have to be returned as part of the result. If there are any of these, the result is a list consisting of the C result followed by each mutable argument, mutatis mutandis, in the order they appeared in the arguments. Again, a result type of `""` assumes a void result and leaves it out, so that the result just includes mutable values. A void result can also be written `"&"` if there's exactly one such value, which means it won't be returned as a list. In the table of examples below, `5`, `1`, `0.5`, and `3` are used as values for `i8`, `u1`, `f64`, and `i64` respectively, and `a` and `b` are possible list arguments with `am` and `bm` indicating modified versions of these. @@ -123,6 +125,95 @@ In the table of examples below, `5`, `1`, `0.5`, and `3` are used as values for | `""` | `"&i8"‿"u1"‿"&f64"` | `Fn ⟨a,1,b⟩` | `⟨am,bm⟩` | `"&"` | `"&i8"‿"u1"‿"f64"` | `Fn ⟨a,1,0.5⟩` | `am` -### Pointer objects +## Pointer objects + +Some C functions might allocate or map memory and return a pointer to it, or take a pointer and return a related pointer. A pointer type like `*:i32` doesn't have a meaningful representation in BQN but allows for passing pointers around between C functions. But often it's useful to be able to manipulate pointers within BQN, so the FFI can create and accept pointer objects that support the relevant operations. + +In `•FFI`, any pointer result type without an explicit conversion (`*`, `*u32`, `*{u8,**f32}` but not `*:i8`) means the return value is a pointer object. The pointer object remembers what type it has, but can also be cast to other types. It can be passed as an argument to a field with the same type, or a compatible type—it's allowed to use an untyped pointer `*` somewhere that a typed pointer appears, or vice-versa. For example `*{*f64,*}` is compatible with `*{*,*u8}`. + +The plain `*` type gives an untyped pointer. The only things that can be done with it are to pass it back in to the FFI, and cast it to a different pointer type. Other pointers support more operations, and they actually have two defining properties: the element type, and a stride, which is a distance in bytes. In pointers returned by the FFI, the stride is the element type's width; in others, it may be the width of some parent element type. + +The functions for working with a pointer object are exposed as fields of that object. Here's a summary of these functions. Other than `Cast`, they all give errors if the pointer is untyped. + +| Name | Summary +|----------|------------------------------- +| `Read` | Read value at offset `𝕩` +| `Write` | Write value `𝕩` at offset `𝕨⊣0` +| `Add` | Return a new pointer offset by `𝕩` +| `Sub` | Offset by `-𝕩`, or return the offset from `𝕩` to this pointer +| `Cast` | Return a new pointer to the same location with type `𝕩` +| `Field` | Return a new pointer to field number `𝕩`, maintaining stride + +### Pointer access and arithmetic + +In CBQN, `@` as a "path" for `•FFI` exposes libc functions including `malloc`, which gives us an easy way to make a pointer to play around with. `Read` and `Write` work like you probably expect. + + malloc_i32 ← @ •FFI "*i32"‿"malloc"‿">u64" + ptr ← Malloc_i32 4×10 # Space for 10 ints + •Show ptr.Read 3 # Uninitialized + 3 ptr.Write 123 + •Show ptr.Read 3 # Now 123 + +You can use arithmetic to make another pointer with an offset, or find the offset between two pointers. Pointer objects are immutable in the sense that a given pointer object always points to the same location (the underlying state of the memory can of course change, that's the whole point). + + off ← ptr.Add 3 + •Show off.Sub ptr # 3 + •Show off.Read 0 # 123 + +And we can free it, showing how passing a pointer back in to the FFI works: + + free ← @ •FFI ""‿"free"‿">*" + Free ptr + +Now malloc's memory manager has freed the allocated space, but `•FFI` can't know that "free" does this, and doesn't do anything to the pointer object (it also couldn't track down other pointers like `off` that still point into the same space). So now `•Show ptr.Read 3` may still run without error, but it's use-after-free and could do anything. Using pointer objects is low-level programming; treat it like assembly. + +### Compound pointers + +Now for some more sophisticated pointer stuff. Let's start with a hundred bytes of memory: + + malloc ← @ •FFI "*"‿"malloc"‿">u64" + pv ← Malloc 100 + +Since `pv` is untyped, `pv.Read 0`, `pv.Add 1` and so on all give errors. All we can do is cast it: + + ps ← pv.Cast "{[2]i8,i16}" + +It works! And now if we read from it, we'll get… well, generally a bunch of zeros, but the memory's uninitialized so you never know. We'll initialize to something known by placing the value `i` in byte index `i`. A normal way to do this is `(pv.Cast "u8").Write˜¨ ↕100`, but just to show what's possible: + + (pv.Cast "[100]u8").Write ↕100 + +The argument `↕100` is matched to the element type `[100]u8` in the same way that it would be in an `•FFI` call with argument type `[100]u8`. Now we can read off some values: + + •Show ps.Read 0 # ⟨ ⟨ 0 1 ⟩ 770 ⟩ + •Show (ps.Field 0).Read 3 # ⟨ 12 13 ⟩ + +`770` is `2+3×256`; note the little-endian order. As for the `Field` example, `ps.Field 0` picks out the `[2]i8` component, since it's the first field of the struct type `{[2]i8,i16}`. However, reading at offset 3 doesn't just shift by 6 bytes, which would land in one of the `i16` components. Because `Field` preserves the original 4-byte stride of `ps`, it shifts by 12 bytes, reading the `[2]i8` field of the overall struct with index 3. Doing a `Cast` instead would have discarded this stride. + + •Show (ps.Cast "[2]i8").Read 3 # ⟨ 6 7 ⟩ + •Show ((ps.Add 3).Field 0).Read 0 # ⟨ 12 13 ⟩ + +As a result of this design, you can pick out, say, the second `i8` in each of the first five structs without any intermediate pointer arithmetic (the two `Field` calls below are each only done once, then `Read` is called 5 times). + + •Show ((ps.Field 0).Field 1).Read¨ ↕5 # ⟨ 1 5 9 13 17 ⟩ + +Casting also makes it possible to offset a pointer by a specific number of bytes: cast to `i8` or `u8`, add an offset, and cast back to the original type. + +## Strings + +Since strings present some encoding issues, and C has an unfortunate practice of using null-terminated strings everywhere, let's go over some details of how to work with these. To pass an ASCII string in, the appropriate type is `*u8:c8`, which as an [explicit conversion](#explicit-type-conversion) takes a list of 1-byte characters, and passes it as a list of 1-byte integers (sometimes C `char`s are signed ints, but the FFI just passes bits around so `i8` and `u8` behave the same way). Null-terminate the string explicitly in BQN by passing `str∾@` where `str` is the string itself. + +This format is also suitable for sequences of not-necessarily-ASCII bytes like the output of `•file.Bytes`. To pass unicode characters as UTF-8, you can use `•ToUTF8 str` in CBQN, and there are also pure BQN conversion functions in bqn-libs [strings.bqn](https://github.com/mlochbaum/bqn-libs/blob/master/strings.bqn). + +When a C function passes you back a string, you get a pointer, and to get a BQN list of characters you're going to have to hunt down that null byte. `*u8:c8` is allowed for the result type, but it gives a pointer with element type `u8:c8`, which means `Read` always returns a 1-character list instead of a single character. Let's use `*u8` instead. We could build the string during the search to read each character only once, but it's simpler and faster to use two passes: + + StrFromPtr ← { @ + 𝕩.Read¨ ↕ 1⊸+•_while_(0≠𝕩.Read) 0 } + +Using CBQN's `@` argument to get `strlen` is much faster at the cost of a bit of implementation dependence. In either version you might want to apply `•FromUTF8` afterwards if the string is UTF-8 encoded and not just a bunch of bytes. + + StrFromPtr ← (@ •FFI "u64"‿"strlen"‿">*u8"){ @ + 𝕩.Read¨ ↕𝔽𝕩 } + +And the fastest way is to hook into CBQN's internal function for building a string instead of using the pointer object interface at all. There's also the option of using `bqn_makeUTF8Str` instead of `bqn_makeC8Vec` to get UTF-8 conversion included. -[specification](../spec/system.md#pointer-objects) + strlen ← @ •FFI "u64"‿"strlen"‿">*u8" + mkstr ← @ •FFI "a"‿"bqn_makeC8Vec"‿">𝕨u64"‿">*u8" + StrFromPtr ← Strlen⊸MkStr diff --git a/docs/doc/ffi.html b/docs/doc/ffi.html index 10f9bf66..f3a13ace 100644 --- a/docs/doc/ffi.html +++ b/docs/doc/ffi.html @@ -6,7 +6,7 @@
BQN's foreign function interface allows it to interface with libraries written in C, or other languages that support a compatible format. It's invoked with •FFI
, which performs the necessary lookups and conversions to call a function from a dynamic shared library (extension .so in Unix-like systems and .dll in Windows). This function is specified by BQN and implemented by CBQN—without support for all possible types, but enough for practical use.
BQN's foreign function interface allows it to interface with libraries written in C, or other languages that support a compatible format. It's invoked with •FFI
, which performs the necessary lookups and conversions to call a function from a dynamic shared library (extension .so in Linux, .dylib in macOS, and .dll in Windows). This function is specified by BQN and implemented by CBQN—without support for all possible types, but enough to cover practical use well.
Warning: object code is unsafe by nature. The OS will hopefully prevent it from exceeding the privileges of the BQN interpreter, but anything code in the interpreter could do is fair game. It might crash, write files, corrupt BQN arrays resulting in "impossible" behavior later on, or read everything in your home directory and email it to someone. Be careful when invoking the FFI!
•FFI
takes the path of the object file as 𝕨
, with local paths resolved relative to the source file as in •Import
. 𝕩
describes the specific function requested from this file and its type signature, and the result is a BQN function that calls it. While a major purpose of •FFI
is to be called with files like libpng.so that you may already have on your system, we'll start by writing some C to make it clear what's happening on both sides of the interface. A factorial function is pretty short, so let's start there:
The cycles are 0, 124, and 3, so that checks out. But cycles
modifies its argument p
, with the line p[j] = i
. Is the BQN list p
changed when this happens? The check on the last line says no, which is good because BQN arrays are supposed to be immutable. What happens is that BQN copies the data into temporary memory in order to pass it in to C (here it also converts from 8 bits for each element to 32).
But what if we want to see those modifications? Many C functions use pointer modification as a way to return multiple values, so the FFI supports this. With &
instead of *
, the values after calling the function are returned as an extra result.
The cycles are 0, 124, and 3, so that checks out. But cycles
modifies its argument p
, with the line p[j] = i
. Is the BQN list p
changed when this happens? The check on the last line says no, which is good because BQN arrays are supposed to be immutable. What happens in this case is that CBQN stores p
with 8 bits per element, so to widen it to 32 bits it copies to temporary memory. And in fact CBQN assumes the C function won't modify a *
-typed argument, so to avoid issues in the case that p
does have 32-bit elements it should be passed in as ≠⊸⋈ •internal.Unshare p
.
But what if we want to see the modifications? Many C functions use pointer modification as a way to return multiple values, so the FFI supports this. With &
instead of *
, the values after calling the function are returned as an extra result (and the original argument is always copied to avoid changes to aliased values).
cycles ← "cyc.so" •FFI "u32"‿"cycles"‿"u32"‿"&u32" •Show Cycles ≠⊸⋈ ⍋"cycle" # ⟨ 3 ⟨ 0 1 1 3 1 ⟩ ⟩-
The original result might not be wanted in this case. You can ignore it by using ""
for the result type, but then you get a 1-element list, for consistency with the case with multiple &
arguments. Similar to >
on a single argument, you can use "&"
for the result to get a single mutated argument returned directly.
The original result might not be wanted in this case. You can ignore it (relying on u32
not being stack-allocated!) by using ""
for the result type, but then you get a 1-element list, for consistency with the case with multiple &
arguments. Similar to >
on a single argument, you can use "&"
for the result to get a single mutated argument returned directly.
cycI ← "cyc.so" •FFI "&"‿"cycles"‿"u32"‿"&u32" •Show CycI ≠⊸⋈ ⍋"cycle" # ⟨ 0 1 1 3 1 ⟩@@ -73,17 +73,18 @@
So what do all these C types mean in BQN? The FFI tries to define sensible conversions. But BQN can't always represent every value directly, so it also provides explicit conversions with :
using the underlying bit representation, giving a way to safely store any value.
On to implicit conversions, let's start with numbers. CBQN numbers are 64-bit floats, which are a superset of integer types up to 32 bits, and 32-bit floats. So conversions from all these types are exact with no problems. Converting to a smaller integer type requires the float value to fit in that type, while converting to a smaller float type rounds.
-Issues show up with 64-bit integer types because 64-bit floats only have full integer precision from -2⋆53
to 2⋆53
. Some integers beyond this range are representable, but others aren't: for example 1+2⋆53
rounds to 2⋆53
. For this reason numbers with absolute value 2⋆53
and greater error when converting between float and integer. Bare u64
and i64
types are fine when working with lengths and other things that can't reasonably be that large, but when all the bits are used they should be converted with i64:u32
or similar.
Issues show up with 64-bit integer types because 64-bit floats only have full integer precision from -2⋆53
to 2⋆53
. Some integers beyond this range are representable, but others aren't: for example 1+2⋆53
rounds to 2⋆53
. For this reason numbers with absolute value 2⋆53
and greater error when converting between float and integer. Bare u64
and i64
types are fine when working with lengths and other things that can't reasonably be that large, but when all the bits are used they should be converted with u64:i32
or similar.
An array or struct corresponds to a BQN list, easy enough.
-A BQN list or pointer object can be converted to a pointer, and a C pointer is always converted to a pointer object—it can't be converted to a list because the length is unknown, but sometimes a mutable pointer is a convenient way to get a list from a C function. Pointer objects are BQN values designed specifically to encapsulate C pointers. If passed as an argument, its type needs to be compatible with the type for that argument, if it has one. When an argument has an untyped pointer type *
or &
, it can't take a list as input (what would the elements be converted to?) but any pointer object will be accepted.
A BQN list or pointer object can be converted to a pointer, and a C pointer is always converted to a pointer object—it can't be converted to a list because the length is unknown, but sometimes a mutable pointer is a convenient way to get a list from a C function. Pointer objects are BQN values designed specifically to encapsulate C pointers. If passed as an argument, its type needs to be compatible with the type for that argument, if it has one.
+When an argument has an untyped pointer type *
or &
, it can't take a list as input (what would the elements be converted to?) but any pointer object will be accepted. Since an untyped pointer isn't a list of anything, it's treated more like a scalar value. In particular a type like *:i32
converts a pointer back and forth from two integers.
Either the type as a whole, or any member of a struct, can have a conversion specification like :u32
at the end. The type after the colon uses the same format as C numeric types, with the quality c
allowed for characters in addition to i
, u
, and f
. Here CBQN supports the types it uses internally for arrays: u1
for booleans, i8
to i32
and c8
to c32
, and f64
.
With explicit conversion, each C value corresponds to a list of BQN values with the same bit representation. For example, you might use u64:u1
to represent a 64-bit number as 64 bits (little-endian or least significant first), or u64:c8
to represent it as 8 characters. Similarly, a pointer can be turned into plain bits and back with *:u1
. The :
also applies to compound values; another case is an argument such as *i64:i32
, which will be cast from a BQN list of 32-bit ints to a C list of 64-bit ints that's half as long (with an error if the length wasn't even). And &
can be used instead of *
to get mutated values out, converting back to i32 on the way. Other compound cases have some complications and aren't supported in CBQN currently.
Either the type as a whole, or any member of a struct, can have a conversion specification like :i32
at the end. The type after the colon uses the same format as C numeric types, with the quality c
allowed for characters in addition to i
, u
, and f
. Here CBQN supports the types it uses internally for arrays: u1
for booleans, i8
to i32
and c8
to c32
, and f64
.
With explicit conversion, each C value corresponds to a list of BQN values with the same bit representation. For example, you might use u64:u1
to represent a 64-bit number as 64 bits (little-endian or least significant first), or u64:c8
to represent it as 8 characters. Similarly, a pointer can be turned into plain bits and back with *:u1
. The :
also applies to compound values; another case is an argument such as *i64:i32
, which will be cast from a BQN list of 32-bit ints to a C list of 64-bit ints that's half as long (with an error if the length wasn't even). **:i32
works similarly, assuming 64-bit pointers. The initial *
can be replaced with &
to get mutated values out, converting back to i32 on the way. Other compound cases have some complications and aren't supported in CBQN currently.
This section covers how FFI arguments and results are structured in BQN, and collects the ways to tweak it.
-The normal case is that 𝕩
is a list of the arguments. You can pass in 𝕨
if you really want, but is has to be an empty list. No arguments have been specified as coming from 𝕨
! You can control where a C argument comes from by sticking 𝕨
or 𝕩
to the front (𝕩
does nothing, it's already the default). So for example, arguments of "i32"‿"𝕨i32"‿"𝕩i32"
mean the function has to be called like ⟨2⟩ Fn 3‿4
. Then if a C argument is the only one included in its BQN argument, it can have a >
at the beginning (either before or after the 𝕨
/𝕩
) meaning that the BQN argument should be that value directly instead of the 1-element list.
If there are no mutable arguments, the result is what it is. Unless it isn't: a result type of ""
results in a result value of @
.
Mutable arguments (those with &
) all have to be returned as part of the result. If there are any of these, the result is a list consisting of the C result followed by each mutable argument, mutatis mutandis, in the order they appeared in the arguments. Again, the result type can be ""
to leave it out, so that the result just includes mutable values. It can also be "&"
if there's exactly one such value, which means it won't be returned as a list.
The normal case is that 𝕩
is a list of the arguments. You can pass in 𝕨
if you really want, but is has to be an empty list. No arguments have been specified as coming from 𝕨
! You can control where a C argument comes from by sticking 𝕨
or 𝕩
to the front (𝕩
does nothing, it's already the default). So for example, arguments of "i32"‿"𝕨i32"‿"𝕩i32"
mean the function has to be called like ⟨2⟩ Fn 1‿3
. Then if a C argument is the only one included in its BQN argument, it can have a >
at the beginning (either before or after the 𝕨
/𝕩
) meaning that the BQN argument should be that value directly instead of the 1-element list.
If there are no mutable arguments, the result is what it is. Unless it isn't: a void result type ""
results in a result value of @
.
Mutable arguments (those with &
) all have to be returned as part of the result. If there are any of these, the result is a list consisting of the C result followed by each mutable argument, mutatis mutandis, in the order they appeared in the arguments. Again, a result type of ""
assumes a void result and leaves it out, so that the result just includes mutable values. A void result can also be written "&"
if there's exactly one such value, which means it won't be returned as a list.
In the table of examples below, 5
, 1
, 0.5
, and 3
are used as values for i8
, u1
, f64
, and i64
respectively, and a
and b
are possible list arguments with am
and bm
indicating modified versions of these.