Multithreaded evaluator #10938

Draft: wants to merge 71 commits into base: master

Commits (71):
6760b39
EvalState: Make the parse/eval caches thread-safe
edolstra Jan 4, 2024
d3854d1
LRUCache: Mark size() as const
edolstra Jan 4, 2024
945cd69
Sync: Add support for shared locks
edolstra Jan 4, 2024
5f3b1a3
WIP
edolstra Feb 4, 2024
9ddca98
WIP3
edolstra Mar 12, 2024
d133aca
WIP4
edolstra May 23, 2024
1a55754
Disable some blackhole tests for now
edolstra May 24, 2024
d623dfb
WIP working
edolstra May 28, 2024
a9e3594
Better hash
edolstra May 29, 2024
b63a132
Symbol table concurrency hack
edolstra May 29, 2024
76f822f
Hacks
edolstra May 29, 2024
6a85af7
Fix failures due to value reuse
edolstra May 31, 2024
6eafc52
Revive the Boehm GC alloc cache
edolstra May 31, 2024
f018a55
Make RegexCache thread-safe
edolstra May 31, 2024
ec8593d
Add some stats
edolstra Jun 3, 2024
105dea5
Cleanup
edolstra Jun 3, 2024
27fb652
Make EvalState::srcToStore thread-safe
edolstra Jun 4, 2024
d990974
PosixSourceAccessor: Use SharedSync
edolstra Jun 5, 2024
eba54f5
FileParseCache, FileEvalCache: Use read lock
edolstra Jun 5, 2024
a25a5b7
Add getOptional()
edolstra Jun 6, 2024
ca11328
EvalState: Add importResolutionCache
edolstra Jun 6, 2024
c2c01d8
Make fileEvalCache insertion more efficient
edolstra Jun 6, 2024
9b88021
Ensure that files are parsed/evaluated only once
edolstra Jun 6, 2024
708e0e8
Small optimization
edolstra Jun 6, 2024
cc38822
SymbolStr: Remove std::string conversion
edolstra Jun 6, 2024
424e01e
Use a contiguous arena for storing symbols
edolstra Jun 6, 2024
c663076
Executor: Randomize the work queue
edolstra Jun 6, 2024
adcc351
Provide std::hash<SourcePath>
edolstra Jun 6, 2024
3988faf
Provide std::hash<Symbol>
edolstra Jun 6, 2024
a70ec9e
Remove unused #include
edolstra Jun 6, 2024
0cd29fe
Split the symbol table into domains
edolstra Jun 6, 2024
0c87ead
Fix --disable-gc build
edolstra Jun 7, 2024
33f50ae
Don't use finishValue() for thunks
edolstra Jun 7, 2024
5e87cf4
Remove debug statement
edolstra Jun 7, 2024
400a670
Specify memory order
edolstra Jun 7, 2024
5c6eb1a
Split the PosixSourceAccessor lstat cache
edolstra Jun 7, 2024
3353f9a
nix search: Restore output
edolstra Jun 13, 2024
9b814c4
Make the max-call-depth check thread-local
edolstra Jun 13, 2024
fd5c32b
Move code
edolstra Jun 13, 2024
1bdf907
nix flake show: Make multi-threaded
edolstra Jun 14, 2024
3cc1319
Disable some failing tests for now
edolstra Jun 14, 2024
f6cbd6a
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jun 18, 2024
6103246
Cleanups
edolstra Jun 19, 2024
576a03e
Re-enable assertNoSymlinks()
edolstra Jun 19, 2024
52bd994
Formatting
edolstra Jun 19, 2024
b713591
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jun 19, 2024
fcbdc9f
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jul 3, 2024
997af66
Make the default GC_INITIAL_HEAP_SIZE a lot bigger
edolstra Jul 5, 2024
d3397d7
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jul 11, 2024
2b4c36f
Remove obsolete comment
edolstra Jul 11, 2024
d131a02
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jul 12, 2024
4482fe4
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Jul 26, 2024
8cf80c9
nix repl: Remove unnecessary call to evalString
edolstra Jul 26, 2024
67ff326
Remove FIXME
edolstra Jul 26, 2024
c8c9500
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Aug 5, 2024
dd44b26
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Aug 12, 2024
9102baf
Make AllowListSourceAccessor thread-safe
edolstra Aug 12, 2024
998a289
Make positionToDocComment thread-safe
edolstra Aug 12, 2024
5310b0f
callFunction(): Use correct environment in error messages
edolstra Aug 13, 2024
839aec2
callFunction(): Create the primop app chain safely
edolstra Aug 13, 2024
4f90786
Debug
edolstra Aug 13, 2024
6357885
Move perf counters into EvalState
edolstra Aug 13, 2024
4086c1c
Debug
edolstra Aug 13, 2024
a6d8217
Remove "SPURIOUS" message
edolstra Aug 13, 2024
ea4e981
Fix formatting
edolstra Aug 13, 2024
d36ea2e
Fix meson build
edolstra Aug 13, 2024
114d1a0
finishAll(): Propagate an arbitrary exception
edolstra Aug 14, 2024
f947b63
nix flake show: Make sure the visit() closure is still alive in case …
edolstra Aug 14, 2024
ceeb648
Introduce ValueType::nFailed
edolstra Aug 14, 2024
8b7d5b4
Make 'nix search --json' thread-safe
edolstra Aug 15, 2024
8020c0c
Merge remote-tracking branch 'origin/master' into multithreaded-eval
edolstra Aug 22, 2024
2 changes: 1 addition & 1 deletion configure.ac
@@ -346,7 +346,7 @@ AC_ARG_ENABLE(gc, AS_HELP_STRING([--enable-gc],[enable garbage collection in the
gc=$enableval, gc=yes)
if test "$gc" = yes; then
PKG_CHECK_MODULES([BDW_GC], [bdw-gc])
CXXFLAGS="$BDW_GC_CFLAGS $CXXFLAGS"
CXXFLAGS="$BDW_GC_CFLAGS -DGC_THREADS $CXXFLAGS"
AC_DEFINE(HAVE_BOEHMGC, 1, [Whether to use the Boehm garbage collector.])

# See `fixupBoehmStackPointer`, for the integration between Boehm GC
2 changes: 2 additions & 0 deletions src/libexpr-c/nix_api_value.cc
@@ -181,6 +181,8 @@ ValueType nix_get_type(nix_c_context * context, const nix_value * value)
switch (v.type()) {
case nThunk:
return NIX_TYPE_THUNK;
case nFailed:
return NIX_TYPE_FAILED;
case nInt:
return NIX_TYPE_INT;
case nFloat:
3 changes: 2 additions & 1 deletion src/libexpr-c/nix_api_value.h
@@ -31,7 +31,8 @@ typedef enum {
NIX_TYPE_ATTRS,
NIX_TYPE_LIST,
NIX_TYPE_FUNCTION,
NIX_TYPE_EXTERNAL
NIX_TYPE_EXTERNAL,
NIX_TYPE_FAILED,
} ValueType;

// forward declarations
51 changes: 42 additions & 9 deletions src/libexpr/eval-gc.cc
@@ -4,6 +4,7 @@
#include "config-global.hh"
#include "serialise.hh"
#include "eval-gc.hh"
#include "file-system.hh"

#if HAVE_BOEHMGC

@@ -32,6 +33,39 @@ static void * oomHandler(size_t requested)
throw std::bad_alloc();
}

static size_t getFreeMem()
{
/* On Linux, use the `MemAvailable` or `MemFree` fields from
/proc/meminfo. */
# if __linux__
{
std::unordered_map<std::string, std::string> fields;
for (auto & line : tokenizeString<std::vector<std::string>>(readFile("/proc/meminfo"), "\n")) {
auto colon = line.find(':');
if (colon == line.npos)
continue;
fields.emplace(line.substr(0, colon), trim(line.substr(colon + 1)));
}

auto i = fields.find("MemAvailable");
if (i == fields.end())
i = fields.find("MemFree");
if (i != fields.end()) {
auto kb = tokenizeString<std::vector<std::string>>(i->second, " ");
if (kb.size() == 2 && kb[1] == "kB")
return string2Int<size_t>(kb[0]).value_or(0) * 1024;
}
}
# endif

/* On non-Linux systems, conservatively assume that 25% of memory is free. */
long pageSize = sysconf(_SC_PAGESIZE);
long pages = sysconf(_SC_PHYS_PAGES);
if (pageSize != -1)
return (pageSize * pages) / 4;
return 0;
}

static inline void initGCReal()
{
/* Initialise the Boehm garbage collector. */
@@ -50,10 +84,12 @@ static inline void initGCReal()

GC_INIT();

GC_allow_register_threads();

GC_set_oom_fn(oomHandler);

/* Set the initial heap size to something fairly big (25% of
physical RAM, up to a maximum of 384 MiB) so that in most cases
/* Set the initial heap size to something fairly big (80% of
free RAM, up to a maximum of 8 GiB) so that in most cases
we don't need to garbage collect at all. (Collection has a
fairly significant overhead.) The heap size can be overridden
through libgc's GC_INITIAL_HEAP_SIZE environment variable. We
@@ -64,13 +100,10 @@
if (!getEnv("GC_INITIAL_HEAP_SIZE")) {
size_t size = 32 * 1024 * 1024;
# if HAVE_SYSCONF && defined(_SC_PAGESIZE) && defined(_SC_PHYS_PAGES)
size_t maxSize = 384 * 1024 * 1024;
long pageSize = sysconf(_SC_PAGESIZE);
long pages = sysconf(_SC_PHYS_PAGES);
if (pageSize != -1)
size = (pageSize * pages) / 4; // 25% of RAM
if (size > maxSize)
size = maxSize;
size_t maxSize = 8ULL * 1024 * 1024 * 1024;
auto free = getFreeMem();
debug("free memory is %d bytes", free);
size = std::min((size_t) (free * 0.8), maxSize);
# endif
debug("setting initial heap size to %1% bytes", size);
GC_expand_hp(size);
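The eval-gc.cc changes above pair a free-memory probe with a new sizing policy (80% of free RAM, capped at 8 GiB, instead of 25% of physical RAM capped at 384 MiB). A minimal standalone sketch of that logic, not the PR's exact code: the field parser takes /proc/meminfo-style text as a string so it runs anywhere, and `initialHeapSize` mirrors the cap arithmetic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Find a field such as "MemAvailable" in /proc/meminfo-style text and
// return its value in bytes (the kernel reports it in kB).
std::optional<uint64_t> memInfoField(const std::string & text, const std::string & field)
{
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        auto colon = line.find(':');
        if (colon == std::string::npos || line.substr(0, colon) != field)
            continue;
        std::istringstream rest(line.substr(colon + 1));
        uint64_t kb;
        std::string unit;
        if (rest >> kb >> unit && unit == "kB")
            return kb * 1024;
    }
    return std::nullopt;
}

// The sizing policy from the hunk above: 80% of free memory, capped at 8 GiB.
uint64_t initialHeapSize(uint64_t freeBytes)
{
    const uint64_t maxSize = 8ULL * 1024 * 1024 * 1024;
    return std::min(static_cast<uint64_t>(freeBytes * 0.8), maxSize);
}
```

The fallback in the diff (25% of physical RAM when /proc/meminfo is unavailable) is the old heuristic kept for non-Linux systems.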
61 changes: 57 additions & 4 deletions src/libexpr/eval-inline.hh
@@ -32,6 +32,8 @@ Value * EvalState::allocValue()
GC_malloc_many returns a linked list of objects of the given size, where the first word
of each object is also the pointer to the next object in the list. This also means that we
have to explicitly clear the first word of every object we take. */
thread_local static std::shared_ptr<void *> valueAllocCache{std::allocate_shared<void *>(traceable_allocator<void *>(), nullptr)};

if (!*valueAllocCache) {
*valueAllocCache = GC_malloc_many(sizeof(Value));
if (!*valueAllocCache) throw std::bad_alloc();
@@ -62,6 +64,8 @@ Env & EvalState::allocEnv(size_t size)
#if HAVE_BOEHMGC
if (size == 1) {
/* see allocValue for explanations. */
thread_local static std::shared_ptr<void *> env1AllocCache{std::allocate_shared<void *>(traceable_allocator<void *>(), nullptr)};

if (!*env1AllocCache) {
*env1AllocCache = GC_malloc_many(sizeof(Env) + sizeof(Value *));
if (!*env1AllocCache) throw std::bad_alloc();
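The `thread_local` caches in `allocValue` and `allocEnv` above follow Boehm's `GC_malloc_many` pattern: fetch a linked batch of same-size cells in one call, then serve them from a per-thread cursor with no locking. A rough standalone sketch of that pattern, using plain `malloc` to build the batch where the PR uses `GC_malloc_many`:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Each free cell's first word points at the next free cell, exactly as
// GC_malloc_many chains its results.
struct Cell
{
    void * next;       // link word while free; must be cleared on allocation
    char payload[24];  // rest of the object
};

// Build a linked batch of n cells (stand-in for one GC_malloc_many call).
static void * allocBatch(size_t n)
{
    void * head = nullptr;
    for (size_t i = 0; i < n; i++) {
        auto * c = static_cast<Cell *>(std::malloc(sizeof(Cell)));
        if (!c) break;
        c->next = head;
        head = c;
    }
    return head;
}

void * allocCell()
{
    // One cursor per thread, so the fast path needs no synchronization.
    thread_local static void * cache = nullptr;
    if (!cache) {
        cache = allocBatch(64);
        if (!cache) throw std::bad_alloc();
    }
    void * p = cache;
    cache = *static_cast<void **>(p);    // advance to the next free cell
    *static_cast<void **>(p) = nullptr;  // clear the link word before use
    return p;
}
```

This is why the PR wraps each cache in a `std::shared_ptr<void *>` allocated with `traceable_allocator`: the cursor itself must be visible to the collector so the unconsumed tail of the batch is not reclaimed.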
@@ -84,9 +88,33 @@
[[gnu::always_inline]]
void EvalState::forceValue(Value & v, const PosIdx pos)
{
auto type = v.internalType.load(std::memory_order_acquire);

if (isFinished(type))
goto done;

if (type == tThunk) {
try {
if (!v.internalType.compare_exchange_strong(type, tPending, std::memory_order_acquire, std::memory_order_acquire)) {
if (type == tPending || type == tAwaited) {
waitOnThunk(v, type == tAwaited);
Review comment (Member): Ideally the thread could pick up other work from a collection of thunks that are likely needed. There's latent parallelism in functions like foldl', toString and derivationStrict that we could exploit this way.

Review comment (Member): And there is parallelism to exploit in vector operations like ExprConcatStrings and all the binary operators, provided we relax the strictness assumptions of these operators.

Review comment (Member): Not all binary operators; for instance && is conditionally lazy in its second operand. We'd need speculative evaluation for those, which is absolutely feasible, but adds a cross-cutting concern, so I wouldn't go there first. String concatenation definitely belongs in that list, regardless of which part of the code does it, and we might want to move the actual concatenation out of ExprConcatStrings to make #6530 deterministic and pure.

goto done;
}
if (isFinished(type))
goto done;
printError("NO LONGER THUNK %x %d", this, type);
abort();
}
Env * env = v.payload.thunk.env;
Expr * expr = v.payload.thunk.expr;
expr->eval(*this, *env, v);
} catch (...) {
v.mkFailed();
throw;
}
}
#if 0
Review comment (Member): I might be a bit rusty on this, but GHC stores a thread id (thread pointer?) in its black holes. Same thread => definitely infinite recursion. Different thread => check that the other thread is not (transitively) waiting for us (iirc it might perform the same computation, which is why unsafeDupablePerformIO exists and is cheap). Nix could just block instead, after saving an edge in the thread waiting map that we used for checking.

if (v.isThunk()) {
Env * env = v.payload.thunk.env;
Expr * expr = v.payload.thunk.expr;
try {
v.mkBlackhole();
//checkInterrupt();
Expand All @@ -97,8 +125,33 @@ void EvalState::forceValue(Value & v, const PosIdx pos)
throw;
}
}
else if (v.isApp())
callFunction(*v.payload.app.left, *v.payload.app.right, v, pos);
#endif
else if (type == tApp) {
try {
if (!v.internalType.compare_exchange_strong(type, tPending, std::memory_order_acquire, std::memory_order_acquire)) {
if (type == tPending || type == tAwaited) {
waitOnThunk(v, type == tAwaited);
goto done;
}
if (isFinished(type))
goto done;
printError("NO LONGER APP %x %d", this, type);
abort();
}
callFunction(*v.payload.app.left, *v.payload.app.right, v, pos);
} catch (...) {
v.mkFailed();
throw;
}
}
else if (type == tPending || type == tAwaited)
type = waitOnThunk(v, type == tAwaited);
else
abort();

done:
if (type == tFailed)
std::rethrow_exception(v.payload.failed->ex);
}
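The control flow in the new `forceValue` is a lock-free claim on the thunk: exactly one thread wins the tThunk -> tPending compare-and-swap and evaluates; every other thread observes tPending/tAwaited and blocks in waitOnThunk until the winner publishes a result or a failure. A reduced sketch of that state machine, assuming a simplified three-state value where a single mutex/condvar pair stands in for the PR's per-value waiter bookkeeping and tAwaited/tFailed states:

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Reduced thunk state machine: tThunk -> tPending -> tDone.
enum State : int { tThunk, tPending, tDone };

struct Thunk
{
    std::atomic<int> state{tThunk};
    int result = 0;
    std::mutex m;
    std::condition_variable cv;

    int force()
    {
        int expected = tThunk;
        if (state.compare_exchange_strong(expected, tPending, std::memory_order_acquire)) {
            // We won the race: evaluate exactly once, publish, wake waiters.
            int r = 6 * 7;  // stand-in for expr->eval(*this, *env, v)
            {
                std::lock_guard<std::mutex> lock(m);
                result = r;
                state.store(tDone, std::memory_order_release);
            }
            cv.notify_all();
        } else if (expected == tPending) {
            // We lost the race: block until the winner publishes
            // (waitOnThunk in the PR).
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return state.load(std::memory_order_acquire) == tDone; });
        }
        // expected == tDone: already forced; the acquire load synchronized
        // with the winner's release store, so reading result is safe.
        return result;
    }
};
```

The catch-blocks in the real code that call `v.mkFailed()` serve the same publishing role on the error path: a thrown exception must still move the value out of tPending, otherwise waiters would block forever.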


Expand Down