Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importc types: CT sizeof, $, {.sizeof.}, const CVAR {.importc.}: int #205

Open
timotheecour opened this issue Mar 30, 2020 · 12 comments
Open

Comments

@timotheecour
Copy link
Member

timotheecour commented Mar 30, 2020

proposal 1 would allow CT sizeof for specified importc types
proposal 2 is just cleanup (related to proposal 1)
proposal 4 would fix $ and repr for importc types
proposal 5 would avoid having to hardcode macro constants defined in C headers (which is brittle and not cross platform), and make them available at CT in a reliable way

proposal 3 is just sugar

proposal 1: introduce {.sizeof.} for importc types

  • introduce {.sizeof.} pragma for importc types, with following meaning:
type Foo1 {.importc.} = object
  x1: cint
const s1 = sizeof(Foo1) # CT error: Foo1 is importc and either incomplete or needs to be annotated with sizeof

type Foo2 {.importc, sizeof.} = object
  x1: cint # all fields are specicied
  c2: cstring
const s2 = sizeof(Foo2) # works

type Foo3 {.importc, sizeof: s2 .} = object
  x1: cint  # some fields may be missing but sizeof is specified
static: doAssert sizeof(Foo3) == s2
  • Note: there should be zero additional code to compute CT sizeof for importc types, it's just the regular CT sizeof except that we remove a CT error depending on presence of {.sizeof.}

  • we add a CT check in codegen to check for errors (eg if some header is updated and fields are added), using NIM_STATIC_ASSERT C/C++ macro introduced in fix some codegen bugs: NIM_BOOL, NIM_STATIC_ASSERT, --passc:-std=... (etc) Nim#13798:
    codegen inserts NIM_STATIC_ASSERT(sizeof(Foo1) == computedSizeofFoo1) right after the header where it's define (or right after its declaration if no header).

proposal 2: deprecate incompleteStruct

[EDIT: this needs to be double checked]
deprecate {.incompleteStruct.} (introduced in 2012), it's the wrong default, it's not observed (AFAIK) as evidenced by face only few structs are annotated with it. It's also un-safe as a lack of annotation would mean a struct is complete, which is often wrong.

proposal 3: {.importc.} for ref/ptr object types

introduce type Foo {.importc.} = ref object, with the meaning that the importc applies to the object, not the ref, to avoid the common boilerplate:

type TFoo {.importc.} = object
  x1: cint
  c2: cstring
type Foo = ref TFoo

where TFoo doesn't need to be declared.
ditto with ptr.

I don't think there's any use case otherwise for {.importc.} applying to the ref or ptr directly, since a pointer is just a pointer.

proposal 4: {.hidden.} for importc fields that shouldn't be referenced by name in codegen

(previous version of this proposal called it {.importc:"?".})
this is an alternative approach to nim-lang/Nim#13783 for cases where some field names (eg pad, reserved etc) are un-reliable because they're OS-specific, and not part of stable ABI. eg:

# /usr/include/x86_64-linux-gnu/bits/stat.h
struct stat
  {
...
#ifdef __x86_64__
    __syscall_slong_t __glibc_reserved[3];
#else
# ifndef __USE_FILE_OFFSET64
    unsigned long int __glibc_reserved4;
    unsigned long int __glibc_reserved5;
# else
    __ino64_t st_ino;                   /* File serial number.  */
# endif
#endif
  };
type stat {.importc: "stat", header = "<stat.h>".} =
  foo1: cint
  reserved {.hidden}: array[4, byte]
  foo2: cstring

proposal 5: const CVAR {.importc.}: int

the idea is to replace hardcoded values obtained from reading header files (which may be unreliable, not cross platform, and even wrong if user passes some other --passc option affecting some symbols) by simply calling preprocessor:

const
  DARWIN_MAXPATHLEN = 1024

by:

const
  DARWIN_MAXPATHLEN {.importc: "__DARWIN_MAXPATHLEN", header: "<dirent.h>".}: int
static: doAssert DARWIN_MAXPATHLEN == 1024 # this would pass

I've implemented this in a POC and it seems to work, but no PR yet.
It is efficient since I'm caching header files and preprocessor output.
EDIT: yet another example use case: nim-lang/Nim#16459 (comment)

Note: regardless of whether we use const CVAR {.importc.}: int, codegen could validate the value we get via NIM_STATIC_ASSERT

Note: this doesn't require libffi

@timotheecour timotheecour changed the title importc types: CT sizeof, $, {.sizeof.}, NIM_STATIC_ASSERT importc types: CT sizeof, $, {.sizeof.}, const CVAR {.importc.}: int Mar 30, 2020
@krux02
Copy link
Contributor

krux02 commented Mar 30, 2020

You propose 5 things to solve which problem?

@timotheecour
Copy link
Member Author

updated top section with a summary

@cooldome
Copy link
Member

cooldome commented Apr 3, 2020

Proposal 1: we already have pragma size which does this. Possibly you want to extend this pragma coverage.
Proposal 3: -1 from me, imo one line of boiler plate is better than one more convention to remember.

@krux02
Copy link
Contributor

krux02 commented Apr 3, 2020

The size of an object is pretty much worthless for sizeof computations, if you don't also know the alignment.

@Araq
Copy link
Member

Araq commented Apr 3, 2020

Proposal 6: Introduce a pragma .hidden for fields that are introduced for alignment purposes and should remain hidden. fieldPairs and friends should ignore these fields.

@timotheecour
Copy link
Member Author

timotheecour commented Apr 3, 2020

Proposal 6:

how is that different from proposal 4 besides the name change {importc:"?"} vs {.hidden.} ? I prefer the name you're suggesting {.hidden.} though, since it's meant to work in other contexts (objc,c++) and simpler to type (at the very minor cost of introducing a new builtin pragma)

fieldPairs and friends should ignore these fields.

what about repr which is meant for low-level debugging? I've suggested in top post to use offsetof to access value of hidden fields so that we can still access the value (eg useful for serialization and debugging purposes, accessing those reserved fields may be desirable)

@krux02
Copy link
Contributor

krux02 commented Apr 4, 2020

well currently sizeof computation does opt out of all types that are imported via {.importc.}, you should mention that you want to change that.

Then I don't like the reuse of the name sizeof, because this name doesn't make sense in this context. I propose to use completeStruct. In that respect, incompleteStruct is basically the default and attaching it to a type is redundant. But that doesn't justify to deprecate it, yet.

I don't like neither {.importc:"?".} nor {.hidden.}. After all you specify a field name that isn't really a field at all and the name shouldn't be used anywhere. I would suggest to not provide a name in the first place. The sink _ identifier would work for this. Example:

type stat {.importc: "stat", header = "<stat.h>", sizeof.} =
  foo1: cint
  _: array[4, byte]
  foo2: cstring
  _: array[16, byte]

what about repr which is meant for low-level debugging? I've suggested in top post to use offsetof to access value of hidden fields so that we can still access the value (eg useful for serialization and debugging purposes, accessing those reserved fields may be desirable)

No. If those fields should be inspectable, give them the correct name of the C declaration. Then it can be used properly. After all we already have an export marker, we don't need yet another level of exportedness.

we add a CT check in codegen to check for errors (eg if some header is updated and fields are added), using NIM_STATIC_ASSERT C/C++ macro introduced in nim-lang/Nim#13798:
codegen inserts NIM_STATIC_ASSERT(sizeof(Foo1) == computedSizeofFoo1) right after the header where it's define (or right after its declaration if no header).

You must add static asserts for all field offsets as well and the object alignment as well. Really you don't want a single size/alignment computation to be wrong and stay undetected.

Last but not least. The hidden fields should not be listed with the fields iterator, nor should they be part of the ast that is returned from getTypeImpl. You can't do anything with these fields from nim, So don't force macro writers to filter these incompatible fields out all the time.

type Foo3 {.importc, sizeof: s2 .} = object

That part is underspecified. What is s2? It doesn't exist in your code example. Also, what use case does it cover? After I don't think it should be supported.

@timotheecour
Copy link
Member Author

timotheecour commented Apr 4, 2020

/cc @cooldome

Proposal 1: we already have pragma size which does this. Possibly you want to extend this pragma coverage.

indeed, {.size:4.} is similar (and allows sizeof at CT even with importc, TIL), but the big difference though is to allow sizeof with 0 argument: {.importc, sizeof.} which implies to use foldSizeOf.

Also, .size:N has weird semantics conflating sizeof and alignof (only 1,2,4,8 is accepted, and it implies alignment = N which is wrong eg for type M1 {.importc, size:4.} = object: x: array[4, bool]; alignment should be 1, not 4. This RFC could possibly extend .size with this new meaning instead of introducing .sizeof, TBD. For clarity, I'm still using sizeof in this RFC until that point is resolved.

/cc @krux02

well currently sizeof computation does opt out of all types that are imported via {.importc.}, you should mention that you want to change that.

That's basically what proposal 1 is about; allow .importc types to use the foldSizeOf when .sizeof is specified.

The size of an object is pretty much worthless for sizeof computations, if you don't also know the alignment.

same exact same thing would apply with {.alignof.} (infer via foldAlignOf) and {.alignof: N.} (use N). And IMO, {.sizeof.} can imply {.alignof.} for simplicity.
All these would be statically enforced via NIM_STATIC_ASSERT at cgen compile time.

No. If those fields should be inspectable, give them the correct name of the C declaration.

you can't for ABI compatibility reasons. But you should still be able to access the values (implicitly casted to specified type) transparently: the cgen inserts something like *(FieldTypeInNimDecl*)(void*)(&offsetof(CStructType, fieldOffset)) every time a hidden field is accessed/set. So it should work transparently. Maybe a POC can show this.

The sink _ identifier would work for this.

_ would make it hard/impossible to serialize/inspect/debug the type, whereas reserved {.hidden.}: array[N, byte] would allow exactly that.

You must add static asserts for all field offsets as well and the object alignment as well

yes, and it's cheap to do.

type Foo3 {.importc, sizeof: s2 .} = object
That part is underspecified. What is s2? It doesn't exist in your code example. Also, what use case does it cover? After I don't think it should be supported.

s2 is any expression that can be computed at CT and is of type int.
Example:

const N = when defined(osx): 40 else: 48 # just an example; in practice we can automate
type cvMat {.importc, sizeof: N, alignof: cint.alignof.} = object
  rows*, cols*: cint
  data*: pointer
  # other fields that are skipped for various reasons (eg not needed, laziness, or ABI issues etc)

Now cvMat.{sizeof,alignof} can be used at CT, which in turns mean any type involving it can be used at CT.
Which means more code works at CT. Note that you can't emulate this with a reserved {.hidden.}: array[N , byte] field as N would be cumbersome to compute in user code and the other fields could appear anywhere, including before rows etc.

Note also that my original idea was to introduce {.completeStruct.} but I instead went for {.sizeof.} which naturally extends to {.sizeof: N.}

@krux02
Copy link
Contributor

krux02 commented Apr 4, 2020

_ would make it hard/impossible to serialize/inspect/debug the type, whereas reserved {.hidden.}: array[N, byte] would allow exactly that.

No, it would not. All fields with their true names will be visualized correctly in gdb and other debuggers, they use the debug information that comes from the C(++) compiler, with correct field names.

@timotheecour
Copy link
Member Author

timotheecour commented Apr 4, 2020

_ would make it hard/impossible to serialize/inspect/debug the type

No, it would not. All fields with their true names will be visualized correctly in gdb and other debuggers

I'm talking about getting/setting those fields from nim code (eg: serialization), not from a debugger. Of course a debugger has access to those with debug info.

@krux02
Copy link
Contributor

krux02 commented Apr 4, 2020

I'm talking about getting/setting those fields from nim code (eg: serialization), not from a debugger. Of course a debugger has access to those with debug info.

Give them the correct name of the C declaration, then you can access those members from Nim properly.

@timotheecour
Copy link
Member Author

see nim-lang/Nim#13926 which implements proposal 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants