Sharing Filesystem between multiple PHP instances #1027
adamziel added the [Type] Enhancement (New feature or request), [Feature] PHP.wasm, and [Aspect] Filesystem labels on Feb 11, 2024
adamziel changed the title from "Shared Filesystem between multiple PHP instances" to "Sharing Filesystem between multiple PHP instances" on Feb 11, 2024
This PR explores syncing and replaying FS operations:

Actually, Emscripten's native PROXYFS provides this exact feature 🎉 This means we can have two (or more!) Emscripten modules acting on the same filesystem.

```js
// Module 2 can use the path "/fs1" to access and modify Module 1's filesystem
module2.FS.mkdir("/fs1");
module2.FS.mount(module2.PROXYFS, {
    root: "/",
    fs: module1.FS
}, "/fs1");
```
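The same two calls can be wrapped in a small typed helper. This is only a sketch: the `EmscriptenFSLike` and `EmscriptenModuleLike` interfaces and the `mountParentFilesystem` name are hypothetical and describe just the calls used in the snippet above, not Emscripten's full typings.

```typescript
// Minimal structural types covering only the calls used below.
// They are assumptions made for this sketch, not Emscripten's published typings.
interface EmscriptenFSLike {
  mkdir(path: string): void;
  mount(
    type: unknown,
    opts: { root: string; fs: unknown },
    mountpoint: string
  ): void;
}

interface EmscriptenModuleLike {
  FS: EmscriptenFSLike;
  PROXYFS: unknown;
}

// Hypothetical helper: exposes `parent`'s filesystem inside `child` at
// `mountpoint`, mirroring the mkdir + mount calls from the snippet above.
function mountParentFilesystem(
  child: EmscriptenModuleLike,
  parent: EmscriptenModuleLike,
  mountpoint: string,
  parentRoot: string = "/"
): void {
  child.FS.mkdir(mountpoint);
  child.FS.mount(
    child.PROXYFS,
    { root: parentRoot, fs: parent.FS },
    mountpoint
  );
}
```

With two real Emscripten modules, `mountParentFilesystem(module2, module1, "/fs1")` would issue the same calls as the snippet above.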
adamziel added a commit that referenced this issue on Feb 28, 2024
Adds support for spawning PHP subprocesses via `<?php proc_open(['php', 'activate_theme.php']);`. The spawned subprocess affects the filesystem used by the parent process.

## Implementation details

This PR updates the default `spawnHandler` in `worker-thread.ts` that creates another WebPHP instance and mounts the parent filesystem using Emscripten's PROXYFS. [A shared filesystem didn't pan out. Synchronizing is the second best option.](#1027)

This code snippet illustrates the idea – note the actual implementation is more nuanced:

```ts
php.setSpawnHandler(
	createSpawnHandler(async function (args, processApi) {
		const childPHP = new WebPHP();
		const { exitCode, stdout, stderr } = await childPHP.run({
			scriptPath: args[1]
		});
		processApi.stdout(stdout);
		processApi.stderr(stderr);
		processApi.exit(exitCode);
	})
);
```

## Future work

* Stream `stdout` and `stderr` from `childPHP` to `processApi` instead of buffering the output and passing everything at once

## Example of how it works

<img width="500" src="https://github.com/WordPress/wordpress-playground/assets/205419/470d79b2-2f10-4f1a-806c-5f26463766da" />

#### /wordpress/spawn.php

```php
<?php
echo "<plaintext>";
echo "Spawning /wordpress/child.php\n";
$handle = proc_open('php /wordpress/child.php', [
	0 => ['pipe', 'r'],
	1 => ['pipe', 'w'],
	2 => ['pipe', 'w'],
], $pipes);

echo "stdout: " . stream_get_contents($pipes[1]) . "\n";
echo "stderr: " . stream_get_contents($pipes[2]) . "\n";
echo "Finished\n";
echo "Contents of the created file: " . file_get_contents("/wordpress/new.txt") . "\n";
```

#### /wordpress/child.php

```php
<?php
echo "<plaintext>";
echo "Spawned, running";
error_log("Here's a message logged to stderr! " . rand());
file_put_contents("/wordpress/new.txt", "Hello, world!" . rand() . "\n");
```

## Testing instructions

1. Update `worker-thread.ts` to create the two files listed above
2. In Playground, navigate to `/spawn.php`
3. Confirm the output is the same as on the screenshot above
Blueprints as a PHP library depends on sharing the filesystem between two PHP instances.
Many tasks must be delegated to a PHP sub-process, which makes no sense if that sub-process has no access to the WordPress files seen by the main process.
Imagine the following scenario:
It could be handled by PHP.wasm as follows:
If, however, `php2` acts on a separate filesystem, the `pendant` theme won't be activated from the perspective of `php1`.
This issue is about the web browser. In Node.js, NODEFS solves this problem
`wp-now` uses the filesystem of the device it runs on via the NODEFS Emscripten API.
Reusing the same filesystem
PHP.wasm uses MEMFS by default
A newly created PHP instance handles all filesystem operations using an in-memory filesystem implementation called MEMFS. MEMFS keeps track of the files using JavaScript objects. It also contains hardcoded references to the HEAP and FS of the PHP instance it lives in.
Reusing MEMFS seems like a non-starter
Sharing MEMFS between two PHP instances seems extremely difficult. The hardcoded heap and FS references make it difficult to bind the same MEMFS to two PHP instances. Perhaps it could be done with deep refactoring, but I worry we'd quickly run into the problem of reusing the heap, but not the static memory or stack, between PHP instances. It doesn't seem to be worth it.
IDBFS is too slow
It takes a few minutes to read the WordPress files from IDBFS into MEMFS. It's just too slow for this amount of I/O.
Conceptually, OPFS could work. Unfortunately, the Emscripten OPFS backend crashes
Ditching MEMFS and relying on OPFS would enable all the PHP instances to act on the same underlying files. Emscripten has new and undocumented support for OPFS. I explored it in this PR and, unfortunately, couldn't get it to work without crashing:
Synchronizing two filesystems
If we can't easily share the same filesystem, synchronizing two distinct filesystems is the next best approach.
Overwriting the entire filesystem
Whenever the child process yields to the event loop, we could overwrite the main process's filesystem with all the files from the child's filesystem.
I can only think of two issues with this approach:
Safety – could it affect the main FS in an unsafe way that would not happen with concurrent writes to a shared Filesystem?
I'm not sure, but intuitively, I want to say it's safe. There is no real concurrency involved – this could work in a single JavaScript worker on a single event loop:
Because everything happens in order, replaying the filesystem operations intuitively seems safe to me. Or at least no less safe than running both processes in parallel on my Mac.
Is there a flaw in this reasoning?
Speed – would overwriting the entire filesystem take ages?
My intuition says yes, but rotatedPHP performs exactly this kind of filesystem overwrite and it's barely noticeable. If, however, the speed turns out to be a problem after all, we could turn to the next approach on the list.
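To make the overwrite idea concrete, here is a minimal sketch that models each filesystem as a plain `Map` from path to contents. `ToyFS` and `overwriteFilesystem` are hypothetical names for illustration only; a real implementation would walk the Emscripten FS tree and also handle directories, permissions, and binary data.

```typescript
// Toy stand-in for a filesystem: path -> file contents.
// An assumption for illustration; MEMFS itself is structured differently.
type ToyFS = Map<string, string>;

// Makes `target` an exact copy of `source`: first removes files that no
// longer exist in `source`, then copies every file from `source` over.
function overwriteFilesystem(target: ToyFS, source: ToyFS): void {
  Array.from(target.keys()).forEach((path) => {
    if (!source.has(path)) {
      target.delete(path);
    }
  });
  source.forEach((contents, path) => {
    target.set(path, contents);
  });
}

const mainFS: ToyFS = new Map<string, string>([
  ["/wordpress/index.php", "<?php // main"],
  ["/wordpress/old.txt", "stale"],
]);

// The child starts from a copy of the main filesystem, then diverges.
const childFS: ToyFS = new Map<string, string>(mainFS);
childFS.delete("/wordpress/old.txt");
childFS.set("/wordpress/new.txt", "Hello, world!");

// Called whenever the child yields to the event loop.
overwriteFilesystem(mainFS, childFS);
```

The cost grows with the total number of files rather than with the number of changes, which is the speed concern raised above.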
Replaying the filesystem operations
Playground supports synchronizing two Playground instances by journaling and replaying the MEMFS operations – see the [demo](https://playground.wordpress.net/demos/sync.html).
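The journal-and-replay idea can be sketched roughly as follows. The `FsOp` entry shapes, the `JournaledFS` class, and the `replay` function are hypothetical and do not reflect Playground's actual journal format; the point is only that the child records each operation and the main filesystem applies them in the same order.

```typescript
// Hypothetical journal entries; the real journal format may differ.
type FsOp =
  | { type: "write"; path: string; contents: string }
  | { type: "delete"; path: string }
  | { type: "rename"; from: string; to: string };

// Applies a single operation to a toy filesystem (path -> contents).
function applyOp(files: Map<string, string>, op: FsOp): void {
  switch (op.type) {
    case "write":
      files.set(op.path, op.contents);
      break;
    case "delete":
      files.delete(op.path);
      break;
    case "rename": {
      const contents = files.get(op.from);
      if (contents !== undefined) {
        files.delete(op.from);
        files.set(op.to, contents);
      }
      break;
    }
  }
}

// A toy filesystem that records every mutation in a journal.
class JournaledFS {
  readonly files = new Map<string, string>();
  readonly journal: FsOp[] = [];

  write(path: string, contents: string): void {
    this.record({ type: "write", path, contents });
  }

  remove(path: string): void {
    this.record({ type: "delete", path });
  }

  rename(from: string, to: string): void {
    this.record({ type: "rename", from, to });
  }

  private record(op: FsOp): void {
    applyOp(this.files, op);
    this.journal.push(op);
  }
}

// Replays the recorded operations, in order, on another filesystem.
function replay(journal: readonly FsOp[], target: Map<string, string>): void {
  journal.forEach((op) => applyOp(target, op));
}

const childProcessFS = new JournaledFS();
childProcessFS.write("/wordpress/new.txt", "draft");
childProcessFS.rename("/wordpress/new.txt", "/wordpress/final.txt");
childProcessFS.write("/wordpress/tmp.txt", "scratch");
childProcessFS.remove("/wordpress/tmp.txt");

const mainProcessFiles = new Map<string, string>([
  ["/wordpress/index.php", "<?php"],
]);
replay(childProcessFS.journal, mainProcessFiles);
```

Unlike a full overwrite, the work here is proportional to the number of recorded operations, not to the size of the filesystem.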
This PR explores syncing and replaying FS operations:
Related points
Emscripten supports PThreads and WASM Workers. They seem tempting at first, but they aren't a silver bullet. The underlying implementation is just built with web workers and SharedArrayBuffer, neither of which solves our problem here.
We could potentially use the same SharedArrayBuffer in both PHP instances to keep track of MEMFS files. However, and I'm making a few guesses that could be wrong, we'd end up also sharing the heap, the stack, and the global variables and locks as well. That would make it the same as just calling `run()` on the current PHP instance, which we can't do. Please tell me if I'm wrong about anything here!