Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #16752: threadvar now works with importcpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster #16750

Merged
merged 4 commits into from
Jan 27, 2021

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented Jan 19, 2021

links

The clang compiler included with Xcode 8 and later supports the C++11 thread_local keyword.

Regarding iOS, I found by experimentation that thread_local is supported for iOS 9 and later, but not for iOS 8.4 or earlier.

thread_local and __thread are, in fact, not the same thing. The main difference between them is precisely the one you stumbled upon - thread_local allows the variable to be non-POD.

AFAIK nim uses POD types in cgen code, even for types with destructors, and even for c++

freebsd, openbsd, netbsd

I didn't change the behavior here, but see: #15066 (comment) and #15493

note regarding cppNonPod

I used pod for simplicity in the pragma name even though __is_pod trait will be deprecated (see https://stackoverflow.com/a/48225882/1426932):

__is_pod (C++, GNU, Microsoft, Embarcadero): Note, the corresponding standard trait was deprecated in C++20.

this pragma is user defined anyways, and the compiler will error if that pragma is missing, resulting in __thread being used on a variable of a type that contains non-trivial destructor or if type has a constructor and cgen calls var a: Foo.

So in the end it works as intended.

note regarding thread_local

  • if needed, in future work, we can use __has_feature(cxx_thread_local) conditionally
  • C++11 thread_local is supported on OSX since xcode 8, released in 2016
  • ios supports thread_local since ios9 refs https://stackoverflow.com/a/29929949/1426932

future work

  • there are other use cases for {.cppNonPod.}, eg thread safe static initialization:
    instead of Error: a thread var cannot be initialized explicitly; this would only run for the main thread we could allow:
when true:
  proc main =
    var g0 {.threadvar.} = 12
  main()

move all ffi pragmas (including nodecl, incompleteStruct etc) into lib/ffi.rst;

@timotheecour timotheecour force-pushed the pr_threadlocal_native branch from d22a0bc to 40c8f57 Compare January 19, 2021 05:28
@timotheecour timotheecour changed the title osx now uses native TLS, which can be orders of magnitude faster osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster Jan 19, 2021
@timotheecour timotheecour force-pushed the pr_threadlocal_native branch from 40c8f57 to 567d2d8 Compare January 19, 2021 08:00
@timotheecour timotheecour changed the title osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster fix #16752; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster Jan 19, 2021
@timotheecour timotheecour marked this pull request as ready for review January 19, 2021 08:58
@timotheecour timotheecour changed the title fix #16752; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster fix #16752: threadvar works with cpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster Jan 19, 2021
@timotheecour timotheecour changed the title fix #16752: threadvar works with cpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster fix #16752: threadvar now works with cpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster Jan 19, 2021
@timotheecour timotheecour changed the title fix #16752: threadvar now works with cpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster fix #16752: threadvar now works with importcpp types; osx now uses native TLS (--tlsEmulation:off), which can be orders of magnitude faster Jan 19, 2021
@timotheecour timotheecour force-pushed the pr_threadlocal_native branch from 0e7a3d2 to 6303d3c Compare January 19, 2021 17:13
@timotheecour timotheecour requested review from ringabout, Clyybber and Araq and removed request for Araq January 19, 2021 22:21
@mratsim
Copy link
Collaborator

mratsim commented Jan 20, 2021

Linked to #12624 [Spec] {.threadvar.} and destructors

Also linked to Weave issue mratsim/weave#61 (comment)

Note that on Mac, tlsEmulation was on by default for iPhone support which lagged a bit behind MacOSX.

@mratsim
Copy link
Collaborator

mratsim commented Jan 20, 2021

Another issue with tlsEmulation on Windows, Ctrl-C will crash your program #4057.

@timotheecour timotheecour force-pushed the pr_threadlocal_native branch from 6303d3c to 7fc37e8 Compare January 20, 2021 17:38
Copy link
Member

@ringabout ringabout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@timotheecour
Copy link
Member Author

ping @Araq

@Araq
Copy link
Member

Araq commented Jan 27, 2021

Instead of yet another pragma, we should IMHO mark non-pod C++ types with an importc'ed destructor or similar.

@timotheecour
Copy link
Member Author

timotheecour commented Jan 27, 2021

{.cppNonPod.} is much simpler, both for end users code and nim compiler code.

importc'ed destructor

that won't work, Foo2 below has a default destructor but can't be used with __thread (otherwise gives: error: initializer for thread-local variable must be a constant expression)

struct Foo2 {
  Foo2() { }
};
int main() {
  static __thread Foo2 a2;
  return 0;
}

or similar

I don't see how this can work either.
If Derived is a subclass of some Base which has a non-default constructor (eg: case 2, case 3 below), you also can't use __thread.

# case 1: works
struct Base {
};

# case 2: doesn't work
struct Base {
  Base() { }
};

# case 3: doesn't work
struct Base {
  Base(int n) { }
};

# case 4: works
struct Base {
  Base() = default;
  Base(int n) { } # with or without this
};

# rest of code:
struct Derived : public Base {
};

void bar() {
  static __thread Derived a2;
}

int main (int argc, char *argv[]) {
  bar();
  return 0;
}

As you can see the determination of whether to use __thread or thread_local is complex and can't be made on sole basis of looking for non-trivial destructors or constructors, you'd have to also look at all base classes, this is impractical because a nim user may not wish to declare all (recursive) bases classes of the class they're interested in wrapping, nor their constructors/destructors; and nim shouldn't re-implement the complex logic already done in C++ compiler.

@Araq
Copy link
Member

Araq commented Jan 27, 2021

Ok, I'm convinced. But can we move the FFI pragmas out of the manual to an "appendix"? These things shouldn't be part of the language "spec".

@timotheecour
Copy link
Member Author

But can we move the FFI pragmas out of the manual to an "appendix"

I'm happy to do that in a followup PR, which would move all ffi pragmas (including nodecl, incompleteStruct etc) into lib/ffi.rst; this is out of scope for this PR though.

@Araq Araq merged commit e112974 into nim-lang:devel Jan 27, 2021
@timotheecour timotheecour deleted the pr_threadlocal_native branch January 27, 2021 21:47
@timotheecour timotheecour added the TODO: followup needed remove tag once fixed or tracked elsewhere label Mar 9, 2021
ardek66 pushed a commit to ardek66/Nim that referenced this pull request Mar 26, 2021
… uses native TLS (`--tlsEmulation:off`), which can be orders of magnitude faster (nim-lang#16750)

* osx now uses native TLS, which can be orders of magnitude faster

* add {.cppNonPod.}

* improve test

* changelog, docs, disable part of windows test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TODO: followup needed remove tag once fixed or tracked elsewhere
Projects
None yet
Development

Successfully merging this pull request may close these issues.

threadvar doesn't work for importcpp types
4 participants