Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to ESM (and get dynamic await) #32

Open
CMCDragonkai opened this issue Oct 7, 2021 · 34 comments
Open

Migrate to ESM (and get dynamic await) #32

CMCDragonkai opened this issue Oct 7, 2021 · 34 comments
Labels
Epic r&d:polykey:supporting activity Supporting core activity research Requires research

Comments

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Oct 7, 2021

What is your research hypothesis/question?

There is a conjunction of 4 features:

  • dynamic imports
  • top level await
  • nodejs module type (es modules in nodejs)
  • explicit exports in package.json

That enables some nifty capabilities:

  • The ability to create modules that can do feature tests and import the right module for a given platform - cross platform compatibility (especially in terms of platform-native features vs importing specific modules)
  • The ability to move away from commonjs modules and use esmodules as a standard for all platforms - this can reduce the dependence on bundling if all platforms support esmodules
  • The ability to create explicit package exports, so one can create subdirectory imports import x from 'a/b/c'; - this can allow us to flexibly export things without having to go down the dist/... route all the time

These features are all coming online in various forms, however some initial experiments show that they involve alot of changes.

Review existing ideas, literature and prior work

  1. motivation from nativescript React Native and Mobile OS (iOS and Android) Compatibility Polykey#155 (comment)
  2. experiment in js-id ID standardisation for all projects and domains js-id#1 (comment)
  3. ts-node support ESM support: soliciting feedback TypeStrong/ts-node#1007
  4. Wait for TS 4.5 release

Research conclusion

See the notes in MatrixAI/js-logger#29

@CMCDragonkai CMCDragonkai added the research Requires research label Oct 7, 2021
@CMCDragonkai
Copy link
Member Author

It would be nice if changing to this didn't involve changing all of our import paths.

@CMCDragonkai CMCDragonkai changed the title Suport Dynamic Imports and Nodejs Module Type Support Dynamic Imports and Nodejs Module Type Oct 11, 2021
@CMCDragonkai
Copy link
Member Author

Solving this will enable MatrixAI/js-id#2

@CMCDragonkai
Copy link
Member Author

Pkg doesn't currently support ES modules. But it should soon: vercel/pkg#782.

Currently it may require us to use esbuild.

The esbuild could be trialed out to see if it's better than using webpack which we had removed before, and see if it can help also deal with non-transpiled files that are in ./src and automatically move to ./dist.

It may also help with doing some optimisation work like tree-shaking. But of course it doesn't make sense to put it all into one single file like we were doing in webpack. Library modules should be able to imported separately.

@CMCDragonkai
Copy link
Member Author

Note that dynamic imports already work. Right now they are just translated to require calls. However being able to use these for cross-platform implementation swapping will require top level awaits.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Apr 25, 2022

With #37 merged, we are one step closer.

Before working with ES modules and dynamic imports that use ES modules, it must be tested that it works in vercel pkg. Keep track of this issue: vercel/pkg#1603

Also important that it works with ts-node too.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Jun 1, 2022

Experimental support for ESM in ts-node is now available: TypeStrong/ts-node#1007 (comment)

@CMCDragonkai
Copy link
Member Author

Moving to ESM is a big task, there are other tools we need support for: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c

@CMCDragonkai
Copy link
Member Author

This would be necessary to support usage of WASM libraries: MatrixAI/Polykey#422 (comment)

We also have to decide whether to use the .js import extension, or do something experimental to avoid having to always choose the extension to import it (and force extensionless imports).

@CMCDragonkai
Copy link
Member Author

PKE is going to be on ESM. If we move the rest of PK ecosystem to ESM, we have to deal with:

  1. tsconfig-paths not working with ESM so we can't use ts-node to load things that use the alias paths
  2. possibly the fact that CJS cannot import/require things that are written in ESM, so we just publish ESM and hope for the best? Not sure.

@CMCDragonkai
Copy link
Member Author

Note that:

What is doable is vercel/pkg#1291 (comment) - which uses a bundler to convert ESM to CJS, then uses pkg on CJS.

Furthermore any bundler to convert ESM to CJS does not support top-level await. Which means you often need refactor any usages of top level await like in this library Offroaders123/NBTify#30 (comment).

This is quite annoying, other bundlers like webpack focus on more a web-like environment and so has features that isn't really needed.

Node is coming out with its own single file executables too, but currently only supports CJS.

Deno is probably the only one that cleanly creates a single file executable with full ESM support but involves bunch of migration that can be difficult to do with all the existing assumptions.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Jul 16, 2023

@CMCDragonkai
Copy link
Member Author

If top-level await is a necessary construct, then we will not be able to use esbuild as the bundler. If so, then we should go with webpack since it's well understood, and I've used it quite a bit. Suppose our entire codebase was ES module native. Then the webpack would eliminate all ES modules and turn it into a single file. There will still need to be a virtual filesystem to deal with all the shared objects and other data files, but this could be sorted some how with deeper understanding of pkg.

Subsequently we can then use pkg as a lower level tool that consumes the output of webpack which just basically puts it all into a single executable file.

@CMCDragonkai
Copy link
Member Author

This issue has become more important since PKE has gone ES native and right now importing our CJS modules is kind of broken. https://gitlab.com/MatrixAI/Engineering/Polykey/Polykey-Enterprise/-/merge_requests/2#note_1472206322

@CMCDragonkai
Copy link
Member Author

ESM support will be in js-async-cancellable, js-resources already and will be merged to staging and master soon.

I was waiting on them first before pushing js-logger and js-errors. But I think these will be passed too.

All of this means all our packages will be ESM native by default.

It's still possible to use regular CJS packages but you have to pattern match off from a default import.

The true test is going to be js-db where all the relevant packages are updated to ESM and then see how it goes.

After the main changes are good, I believe that can just be replicated. @amydevs can you start helping out on the other packages following js-errors changes.

@CMCDragonkai
Copy link
Member Author

Note that top-level await does work in esbuild now. It was a later comment. Especially since tsx uses esbuild it would have to support top-level await.

Should also investigate https://github.com/privatenumber/pkgroll for bundling as it uses esbuild too. This will be important for the pkg optimisation for Polykey-CLI.

@CMCDragonkai
Copy link
Member Author

So the new esm packages which I've released for a few upstream dependencies can lead to a better bundling situation for PK CLI
Atm we use PKG to bundle all the code and also put together the executable.
But it's a bad bundler
The migration to esm took a bunch of work in weird places - tsx replaces ts-node, jest required a special node option, a bunch of stuff in tsconfig and package.json
But one thing it revealed is that esbuild is back to being a possibility for bundling
One major improvement that was unexpected is that benches and tests now run against compiled code.
Meaning test results and benchmarking is far more accurate and would avoid regressions resulting from compilation bugs or mistakes
There are 2 issues that is yet to be tested. ESM when dealing with native modules - the .node binaries, ESM when dealing with our decorators (due to esbuild).
Esbuild could choke on some special features in TS if we are using them. Not sure.
Top level await is going to be really useful for platform conditional exports
Additionally we still use tsc for compilation which produces regular JS. Swc is now only used during testing. Esbuild is used during tsx execution. This can all lead to weird behaviour which is unfortunate. Hopefully in the future jest will use swc, tsx or something similar uses swc, and compilation uses swc... To maintain consistency. But for now the stitching works. (Funny how there are literally 3 different TS compilers being used in every project)
But the fact that tsc is used for the final compilation means that when using esbuild for bundling, esbuild doesn't do any TS compilation. This keeps it simple.
So we should see a far more lightweight executable with even compressed JS/minified JS inside the executable.
The final step of combining it with a node executable is then sort of simple. Additionally a virtual filesystem is provided by PKG to enable loading of shared objects. These 2 things we could potentially replace with our own executable bundling system.
The end result should be something that is far more secure, simpler, more customisable, and lighterweight
Also I believe if we stick with PKG, esbuild would have to produce a cjs bundle at the end

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Aug 13, 2023

Small blocker on MatrixAI/js-workers#12. The threadsjs types aren't properly exported and we are making use of 3 types.

import type { ModuleThread } from 'threads';
import type { ModuleMethods } from 'threads/dist/types/master';
import type { QueuedTask } from 'threads/dist/master/pool-types';

Currently none of the types work because they aren't properly exported by the upstream package.

Additionally a recent commit made QueuedTask exported, but again can't access it. It's also because by using ESM and also using the "exports" key, we can't just go straight to the dist directory to get it.

@CMCDragonkai
Copy link
Member Author

Can confirm that decorator usage is fine.

@CMCDragonkai
Copy link
Member Author

The js-workers is a problem:

  1. threads.js needs to be replaced because its getting unmaintained
  2. threadjs when using .ts files forces the usage of ts-node

This is blocking js-db migration since it has tests that rely on it.

We can try swapping out threadsjs for this https://github.com/piscinajs/piscina.

Maybe that can enable it, and we can then proceed with js-db.

It's possible we may have to change the way we create workers a bit.

@CMCDragonkai CMCDragonkai changed the title Support Dynamic Imports and Nodejs Module Type Migrate to ESM (and get dynamic await) Aug 14, 2023
@CMCDragonkai
Copy link
Member Author

Due to jestjs/jest#2549, vm isolation can introduce problems when testing and expecting errors to throw or certain types to be instances of something. Basically the globals inside nodejs isn't the same as globals in jest. Thus toThrow(TypeError) or toBeInstanceOf(Array) might fail if those objects are coming out of nodejs.

This doesn't just affect tests, but it could also affect internal code that has to do instanceof Array.

There's no good solutions, the workaround on tests is to do toHaveProperty('name', 'RangeError') instead of toThrow(RangeError). And instead of instanceof tests you'll need to find other ways of checking like Array.isArray instead of instanceof Array.

The good thing is that most of the native globals that come out of nodejs should have checking functions like Array.isArray.

And exceptions wise... we usually wrap it in our own error tree.

@amydevs
Copy link

amydevs commented Sep 15, 2023

I've been using toHaveProperty in js-ws so far

Copy link
Member Author

@amydevs let's make the top level issue for ESM migration into Polykey Core Library. We could also create a Project for it - as it will end up grouping multiple ESM migration related issues.

@CMCDragonkai
Copy link
Member Author

Convert this to a project, and then group all relevant ESM issues under this for all repos.

Make sure to link the background context of why you're patching the jest fastcheck problem, as well as then notes on how we are going to deal with the mocking issue - probably by not mocking at all where we can.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented Aug 23, 2024

Due to jestjs/jest#2549, vm isolation can introduce problems when testing and expecting errors to throw or certain types to be instances of something. Basically the globals inside nodejs isn't the same as globals in jest. Thus toThrow(TypeError) or toBeInstanceOf(Array) might fail if those objects are coming out of nodejs.

This doesn't just affect tests, but it could also affect internal code that has to do instanceof Array.

There's no good solutions, the workaround on tests is to do toHaveProperty('name', 'RangeError') instead of toThrow(RangeError). And instead of instanceof tests you'll need to find other ways of checking like Array.isArray instead of instanceof Array.

The good thing is that most of the native globals that come out of nodejs should have checking functions like Array.isArray.

And exceptions wise... we usually wrap it in our own error tree.

@tegefaulkes mentions that in MatrixAI/Polykey-CLI#257, there is issues with using instanceof is a problem for any situation where the npm dependencies may be duplicated when using npm link. I still don't see any documentation on this concept called "npm modules". So as far as I can tell, fixing this means eliminating any common dependencies entirely from the package.json.

However in relation to ESM migration, relying on things like instanceof is flaky in our tests. And relying on jest and fast-check has a problem with dubzzz/fast-check#4986.

Either way, it means for the benefit of testing, and fast-checking, we should avoid using instanceof in our tests, replacing it with toHaveProperty or preferably X.isX sort of functions that's native to nodejs. Additionally we have a patch workaround for the fast check bridge to jest here dubzzz/fast-check#4986 (comment). However we can also just avoid using the bridge, and just use fast-check directly within the test functions.

There are workarounds mentioned like using jest-environment-node-single-context as the testEnvironment but it's not ideal as it introduces other problems.

So...

Copy link
Member Author

@amydevs get your js-ws issue here too. And identify what is blocking here.

@amydevs amydevs mentioned this issue Sep 9, 2024
12 tasks
Copy link
Contributor

Someone needs to take this over now that Amy has left the company.

Copy link
Member

Should this just be backlogged for the time being as this is not really being worked upon by anyone, and isn't on anyone's recent priority list either.

Copy link
Contributor

I'm moving this to backlog for now.

Copy link
Member Author

PKE requires this to be worked on. This is general issue across all repos and we should move to ESM for js-rpc to simplify PKE. I created an issue for this there ENG-504

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Epic r&d:polykey:supporting activity Supporting core activity research Requires research
Development

No branches or pull requests

4 participants