diff --git a/how-to-implement-swc-jvm-binding.html b/how-to-implement-swc-jvm-binding.html index bbf9318..49685df 100644 --- a/how-to-implement-swc-jvm-binding.html +++ b/how-to-implement-swc-jvm-binding.html @@ -1,21 +1,21 @@ -

How to implement SWC JVM binding

+

How to implement SWC JVM binding

Translated with DeepL.com (free version)

-

Background

-

While using the Kotlin Jupyter Kernel, I found that there is no 3D drawing library, so data can only be drawn with JS, by writing JS inside the HTML(...) function, which is very inconvenient. So I wrote the kotlin-jupyter-js plugin to support %js line magics. The core problem for the kotlin-jupyter-js plugin is supporting the compilation of JS code into ASTs on the JVM.

-

My idea is to implement SWC's JVM binding to solve this problem, SWC itself provides Node binding, so JVM binding is not that difficult to implement. Moreover, SWC supports TS/JSX compilation, which allows kotlin-jupyter-js to support typescript and React.

-

Implementation Ideas

+

Background

+

While using the Kotlin Jupyter Kernel, I found that there is no 3D drawing library, so data can only be drawn with JS, by writing JS inside the HTML(...) function, which is very inconvenient. So I wrote the kotlin-jupyter-js plugin to support %js line magics. The core problem for the kotlin-jupyter-js plugin is supporting the compilation of JS code into ASTs on the JVM.

+

My idea is to implement SWC's JVM binding to solve this problem, SWC itself provides Node binding, so JVM binding is not that difficult to implement. Moreover, SWC supports TS/JSX compilation, which allows kotlin-jupyter-js to support typescript and React.

+

Implementation Ideas

The SWC JVM binding implementation is divided into two parts: 1) compiling the SWC Rust code into a JNI dynamic library, and 2) the JVM side, which implements the configuration classes and AST classes.

SWC targets JS and only provides a Node binding, so we need to use the Node binding as the reference for implementing the JVM binding.

The API exposed by the SWC Node binding takes and returns JSON strings. In Node, JSON strings can easily be converted to objects, but on the JVM you need to declare the corresponding classes.

SWC also provides a WASM binding, so SWC could be wrapped on top of WASM. The advantage is that no JNI binding is needed, but a WASM runtime has to be introduced, so we don't consider this option.

-

SWC binding

-

Rust JNI FFIs

+

SWC binding

+

Rust JNI FFIs

Compiling Rust into a JNI dynamic library requires the Rust JNI FFI, which is supported by using jni.

-

This library provides an easy way to bridge Rust and Java, see the official jni example.

+

This library provides an easy way to bridge Rust and Java, see the official jni example.

On the JVM side of the code.

-
class HelloWorld {
+
class HelloWorld {
     init {
         System.loadLibrary("mylib");
     }
@@ -23,7 +23,7 @@ 

jni_fn to generate the bridge function declaration automatically by macro to simplify the declaration.

-
#[jni_fn("HelloWorld")]
+
#[jni_fn("HelloWorld")]
 pub fn hello<'local>(...) -> jstring
-

With jni and jni_fn we can compile Rust code into JNI dynamic libraries.

-

Binding

+

With jni and jni_fn we can compile Rust code into JNI dynamic libraries.

+

Binding

SWC Node binding offers the following methods.

  • transform @@ -71,21 +71,21 @@

napi. However, the JVM's FFI, jni, doesn't support asynchrony, so we only implement the synchronous APIs: transformSync, transformFileSync, parseSync, parseFileSync, minifySync, printSync.

-

parse_sync

-

Below, parse_sync is used as an example to explain the implementation.

-

Dependencies

-

SWC itself only considers the Node binding. swc_core implements the logic of binding to Node and aggregates the other SWC sub-package dependencies. The npm package @swc/core also wraps swc_core. We can't use the swc_core library directly; we need to replace its calls with calls to the other SWC sub-packages.

-

For example, Compiler from swc_core:

-
use swc_core::{
+

SWC Node binding provides synchronous and asynchronous methods via napi. However, the JVM's FFI, jni, doesn't support asynchrony, so we only implement the synchronous APIs: transformSync, transformFileSync, parseSync, parseFileSync, minifySync, printSync.

+

parse_sync

+

Below, parse_sync is used as an example to explain the implementation.

+

Dependencies

+

SWC itself only considers the Node binding. swc_core implements the logic of binding to Node and aggregates the other SWC sub-package dependencies. The npm package @swc/core also wraps swc_core. We can't use the swc_core library directly; we need to replace its calls with calls to the other SWC sub-packages.

+

For example, Compiler from swc_core:

+
use swc_core::{
     base::{
         Compiler,
     },
 }

Needs to be changed to be introduced from swc.

-
use swc::Compiler;
-

All SWC-related dependencies after swc_core conversion.

-
[dependencies]
+
use swc::Compiler;
+

All SWC-related dependencies after swc_core conversion.

+
[dependencies]
 # ...
 swc = "0.270.25"
 swc_common = "0.33.9"
@@ -95,11 +95,11 @@ 

entry/exit parameter

-

Theoretically, what needs to be done is simple: replace all napi related logic with jni. We don't need to change how SWC implements the specific functionality.

-

See SWC's binding_core_node for the parse_sync implementation: [binding_core_node/src/parse.rs#L168](https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/bindings/binding_core_node/src/parse.rs#L168). parseSync copies most of that logic directly, but requires changes to the handling of incoming and outgoing parameters.

-

The parse_sync implementation of binding_core_node:

-
#[napi]
+

entry/exit parameter

+

Theoretically, what needs to be done is simple: replace all napi related logic with jni. We don't need to change how SWC implements the specific functionality.

+

See SWC's binding_core_node for the parse_sync implementation: [binding_core_node/src/parse.rs#L168](https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/bindings/binding_core_node/src/parse.rs#L168). parseSync copies most of that logic directly, but requires changes to the handling of incoming and outgoing parameters.

+

The parse_sync implementation of binding_core_node:

+
#[napi]
 pub fn parse_sync(src: String, opts: Buffer, filename: Option<String>) -> napi::Result<String> {
     // ...
 
@@ -107,7 +107,7 @@ 

Exception Handling

+

Getting a string passed by the JVM requires a call to get_string of JNIEnv.

+

Converting a Rust string to a Java string also requires a call to new_string of JNIEnv before converting to a jstring type.

+

Exception Handling

If SWC fails to process JS code (e.g. JS code has syntax errors), it needs to throw an exception to the JVM, which will be handled by the JVM side.

The code thrown by Rust is first caught and then converted into an exception thrown by the JVM.

-

The binding_core_node handler implements the MapErr<T> trait for Result, which converts the Rust exception to a napi exception via the convert_err method, and finally throws it in the Node.

+

The binding_core_node handler implements the MapErr<T> trait for Result, which converts the Rust exception to a napi exception via the convert_err method, and finally throws it in the Node.

Exception handling in SWC: [swc/bindings/binding_core_node/src/parse.rs#L179](https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/bindings/binding_core_node/src/parse.rs#L179)

-
let program = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
+
let program = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
     // ....
 }).convert_err()?;
-

We need to throw JVM exceptions, so implement the JVM's MapErr<T> trait to turn Rust exceptions into jni exceptions for jni to throw to the JVM.

-

Copy the SWC's MapErr<T> trait.

-
pub trait MapErr<T>: Into<Result<T, anyhow::Error>> {
+

We need to throw JVM exceptions, so implement the JVM's MapErr<T> trait to turn Rust exceptions into jni exceptions for jni to throw to the JVM.

+

Copy the SWC's MapErr<T> trait.

+
pub trait MapErr<T>: Into<Result<T, anyhow::Error>> {
     fn convert_err(self) -> SwcResult<T> {
         self.into().map_err(|err| SwcException::SwcAnyException {
             msg: format!("{:?}", err),
         })
     }
 }
-

Result implements MapErr<T>.

-
impl<T> MapErr<T> for Result<T, anyhow::Error> {}
-

Note that after jni throws an exception, the function still needs to return a value, usually an empty string. jni-rs#76 explains why.

+

Result implements MapErr<T>.

+
impl<T> MapErr<T> for Result<T, anyhow::Error> {}
+

Note that after jni throws an exception, the function still needs to return a value, usually an empty string. jni-rs#76 explains why.

You still have to return to the JVM, even if you've thrown an exception. Remember that unwinding across the ffi boundary is always undefined behavior, so any panics need to be caught and recovered from in your extern functions.

The final exception is handled like this

-
let result = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
+
let result = try_with(c.cm.clone(), false, ErrorFormat::Normal, |handler| {
     // ...
 }).convert_err();
 
@@ -175,9 +175,9 @@ 

SwcNative

+

SwcNative

With the Rust code compiled into a dynamic library, the next step is to implement the glue code on the JVM side. The following is the Kotlin implementation.

-
class SwcNative {
+
class SwcNative {
     init {
         System.loadLibrary("swc_jni")
     }
@@ -185,9 +185,9 @@ 

Load Native Library from Class path, if System.loadLibrary("swc_jni") fails to load, then copy the jar's dynamic library to a temporary directory and load it again.

-
    init {
+

When the JVM loads swc_jni, it normally looks for the dynamic library on the filesystem, not in the resources directory of the jar. So System.loadLibrary("swc_jni") fails if there is no swc_jni dynamic library locally, and a user who installs from Maven will certainly not have one.

+

The solution follows the answer "Load Native Library from Class path": if System.loadLibrary("swc_jni") fails, copy the dynamic library from the jar to a temporary directory and load it again.

+
    init {
         try {
             System.loadLibrary("swc_jni")
         } catch (e: UnsatisfiedLinkError) {
@@ -197,10 +197,10 @@ 

Summary

-

Like the other methods just implement them like parse_sync.

+

Summary

+

Like the other methods just implement them like parse_sync.

At this point we can compile JS in the JVM.

-
SwcNative().parseSync(
+
SwcNative().parseSync(
     "var foo = 'bar'", 
     """{"syntax": "ecmascript"}""",
     "test.js",
@@ -208,7 +208,7 @@ 

Kotlin AST DSL

+

Kotlin AST DSL

Now that we get the AST JSON string, it is still inconvenient if we want to manipulate the AST. We need the JSON string to convert it to a class so that traversing and modifying it will be easy.

-

Also, the second parameter of parseSync, options, is not typed and needs to be constrained to a configuration class.

+

Also, the second parameter of parseSync, options, is not typed and needs to be constrained to a configuration class.

So how do we implement type descriptions for SWC ASTs and configuration item parameters in Kotlin?

I tried an AI conversion from Rust to Kotlin and it works pretty well. The only problem is that it costs money, and I admit that lack of money is my problem.

Writing SWC class definitions from scratch? I'm afraid there's a lot of work to be done, SWC has 200+ ASTs and configuration item types.

The best solution is to generate Kotlin classes via scripts. As it happens, SWC provides the TS declaration file @swc/types.

-

@swc/types

-

When you open the declaration file for @swc/types, it is full of type and interface declarations with a very simple structure.

+

@swc/types

+

When you open the declaration file for @swc/types, it is full of type and interface declarations with a very simple structure.

It can be divided into the following cases.

  1. type alias
    1. literal union type: type T = 'foo' | 'bar'
    2. primary union type: type T = string | number
    3. type alias and object literal type: type T = S & { foo: string }
    4. type alias union type: type T = S | E
  2. interface

The case of Type alias is relatively complex, mainly because of the flexibility of JS.

-

type alias

+

type alias

For some special cases we need to reduce the dynamics of types to make it easier for us to work with them.

-

Like T | T[] we can convert to T[] to avoid not being able to define the type in Kotlin.

+

Like T | T[] we can convert to T[] to avoid not being able to define the type in Kotlin.

For example:

-
export interface Config {
+
export interface Config {
     test?: string | string[];
     // ...
 }

Just convert:

-
class Config {
+
class Config {
     var test: Array<String>? = null
 }
-

A literal union type like props: 'foo' | 'bar' should be converted directly to the base type: val props: String?.

-

A type T = S & { foo: string } requires that the object literal type be extracted as a separate type, with T inheriting from S and the extracted new type. Conversion to kotlin should look like this:

-
interface BaseT {
+

A literal union type like props: 'foo' | 'bar' should be converted directly to the base type: val props: String?.
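
The two reductions described above can be sketched as a small helper in the generator script. This is only an illustration under stated assumptions: the function names, the primitive mapping table, and the decision to reject every other union are hypothetical and not taken from the actual swc-binding generator.

```typescript
// Hypothetical mapping of TS primitives to Kotlin types (assumption: only these two are needed here).
const PRIMITIVES: { [ts: string]: string } = { string: 'String', boolean: 'Boolean' };

// Maps a single TS type name to a Kotlin type name; unknown names are kept as class names.
function toKotlinType(tsType: string): string {
    return Object.prototype.hasOwnProperty.call(PRIMITIVES, tsType) ? PRIMITIVES[tsType] : tsType;
}

// True for members written as string literals, e.g. 'foo'.
function isStringLiteral(member: string): boolean {
    return /^'[^']*'$/.test(member);
}

// Reduces the members of a TS union to one Kotlin type.
function reduceUnion(members: string[]): string {
    // 'foo' | 'bar'  ->  String
    if (members.length > 0 && members.every(isStringLiteral)) {
        return 'String';
    }
    // T | T[]  ->  Array<T>
    if (members.length === 2) {
        const [a, b] = members;
        if (b === `${a}[]`) return `Array<${toKotlinType(a)}>`;
        if (a === `${b}[]`) return `Array<${toKotlinType(b)}>`;
    }
    // Anything else needs its own rule (for example type T = S | E, handled through inheritance).
    throw new Error(`unsupported union: ${members.join(' | ')}`);
}
```

Rejecting unknown unions instead of guessing keeps the generated Kotlin from silently losing type information.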

+

A type T = S & { foo: string } requires that the object literal type be extracted as a separate type, with T inheriting from S and the extracted new type. Conversion to kotlin should look like this:

+
interface BaseT {
     val foo: String;
 }
 
 class T : S, BaseT {}
-

interface

-

For interface processing, it is divided into 3 parts: 1. TS interface to Kotlin class; 2. inheritance; 3. serialization.

-

TS interface to Kotlin class

-

Define a KotlinClass to represent the Kotlin class to be converted. Implement toString() to convert it to a Kotlin class.

-
export class KotlinClass {
+

interface

+

For interface processing, it is divided into 3 parts: 1. TS interface to Kotlin class; 2. inheritance; 3. serialization.

+

TS interface to Kotlin class

+

Define a KotlinClass to represent the Kotlin class to be converted. Implement toString() to convert it to a Kotlin class.

+
export class KotlinClass {
     klassName: string = '';
     headerComment: string = ''
     annotations: string[] = []
@@ -321,9 +317,9 @@ 

parents: string[] = [] properties: KotlinClassProperty[] = [] }

-

The KotlinClass is generated by traversing the AST of the TS interface.

-

When traversing interface properties, you need to recursively traverse the properties of the parent class. Properties inherited from the parent type need to set KotlinClassProperty.isOverride to true to facilitate the generation of kotlin classes with the override modifier.

-
class KotlinClassProperty {
+

The KotlinClass is generated by traversing the AST of the TS interface.

+

When traversing interface properties, you need to recursively traverse the properties of the parent class. Properties inherited from the parent type need to set KotlinClassProperty.isOverride to true to facilitate the generation of kotlin classes with the override modifier.
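
As a rough sketch of how isOverride could drive the generated output, the snippet below renders a Kotlin interface from a simplified class description. The shapes PropertySketch and ClassSketch and the function names are hypothetical; of the real KotlinClassProperty, only the isOverride field is known from the text, and the real KotlinClass has more fields than shown here.

```typescript
// Hypothetical, simplified stand-in for KotlinClassProperty.
interface PropertySketch {
    name: string;
    type: string;        // already a Kotlin type, e.g. "Expression?"
    isOverride: boolean; // true when the property is inherited from a parent type
}

// Hypothetical, simplified stand-in for KotlinClass.
interface ClassSketch {
    klassName: string;
    parents: string[];
    properties: PropertySketch[];
}

// Renders one property, adding the override modifier for inherited ones.
function renderProperty(p: PropertySketch): string {
    const modifier = p.isOverride ? 'override ' : '';
    return `    ${modifier}val ${p.name}: ${p.type}`;
}

// Renders the whole declaration as Kotlin source text.
function renderInterface(k: ClassSketch): string {
    const header = k.parents.length > 0
        ? `interface ${k.klassName} : ${k.parents.join(', ')}`
        : `interface ${k.klassName}`;
    const body = k.properties.map(renderProperty).join('\n');
    return `${header} {\n${body}\n}`;
}
```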

+
-

Inheritance

-

The parent interface from which the TS interface directly inherits is simply added to the KotlinClass.parents array.

-

However, type T = S | E needs to be handled separately.

+

Inheritance

+

The parent interface from which the TS interface directly inherits is simply added to the KotlinClass.parents array.

+

However, type T = S | E needs to be handled separately.

As an example

-
export interface VariableDeclarator extends Node, HasSpan {
+
export interface VariableDeclarator extends Node, HasSpan {
     init?: Expression;
 
     // other props...
@@ -350,9 +346,9 @@ 

Serialization

+

So, for type T = S | E, T is the parent of S and E, and T needs to be added to the KotlinClass.parents array of S,E.
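
A minimal sketch of this step, assuming the generator has already collected the classes by name and the union aliases with their member names. ClassNode and linkUnionParents are hypothetical names, and the real KotlinClass carries more fields than the two used here.

```typescript
// Hypothetical minimal shape of a generated class.
interface ClassNode {
    klassName: string;
    parents: string[];
}

// For every alias "type T = S | E", registers T as a parent of S and E.
// unions maps an alias name to its member names,
// e.g. { Expression: ['ThisExpression', 'ArrayExpression'] }.
function linkUnionParents(
    classes: { [name: string]: ClassNode },
    unions: { [alias: string]: string[] },
): void {
    for (const alias of Object.keys(unions)) {
        for (const member of unions[alias]) {
            const klass = classes[member];
            // Skip members that are not generated classes and avoid duplicate parents.
            if (klass !== undefined && klass.parents.indexOf(alias) === -1) {
                klass.parents.push(alias);
            }
        }
    }
}
```

The duplicate check makes the function safe to call more than once over the same declarations.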

+

Serialization

When serializing AST nodes, one encounters problems with polymorphic serialization.

-

For example, when serializing an Expression, which is an empty interface, toJson doesn't know how to handle the properties of ThisExpression and ArrayExpression. It can only throw an exception or output an empty object, neither of which meets our expectations.

-
val thisExpression: ThisExpression = ThisExpression()
+

For example, when serializing an Expression, which is an empty interface, toJson doesn't know how to handle the properties of ThisExpression and ArrayExpression. It can only throw an exception or output an empty object, neither of which meets our expectations.

+
val thisExpression: ThisExpression = ThisExpression()
 val arrayExpression: ArrayExpression = ArrayExpression()
 
 var expression: Expression = thisExpression
@@ -374,15 +370,15 @@ 

polymorphic serialization, requires transforming the kotlin code.

-

Annotate the class with JsonClassDiscriminator to indicate by which field the type is distinguished, and SerialName to indicate the name of the type after serialization. Deserialization can find the specific type based on this type name.

-
interface ArrayExpression : ExpressionBase, Expression {
+

Annotate the class with JsonClassDiscriminator to indicate by which field the type is distinguished, and SerialName to indicate the name of the type after serialization. Deserialization can find the specific type based on this type name.

+
interface ArrayExpression : ExpressionBase, Expression {
     // ....
 }
 
@@ -404,8 +400,8 @@ 

DSL

+

DSL

We have generated class definitions for ASTs and configuration items, but building configurations or ASTs directly from these classes is neither elegant nor convenient.

-
const foo = 'bar'
+
const foo = 'bar'
SWC compile output string -
    {
+
    {
       "type": "VariableDeclaration",
       "span": {
         "start": 0,
@@ -481,7 +477,7 @@ 

kotlin -- Type-safe builders

-

DSL extension function

-
interface VariableDeclarator : Node, HasSpan {
+

DSL extension function

+
interface VariableDeclarator : Node, HasSpan {
     val init: Expression?;
     // other props...
 }
-

For the VariableDeclarator interface, its init field type is Expression, meaning that its right value can be any subtype of arrayExpression, thisExpression, and so on.

-
variableDeclarator {
+

For the VariableDeclarator interface, its init field type is Expression, meaning that its right value can be any subtype of arrayExpression, thisExpression, and so on.

+
variableDeclarator {
     init = arrayExpression { ... }
     // or
     init = thisExpression { ... }
 }
-

So VariableDeclarator should have methods to create all Expression subclasses. These creation methods are added as extension functions.

-

When we parse @swc/types declaration file, we need to check the type of the attribute, if it is converted to Kotlin and is a class, then find out all its non-intermediate subclasses, and then generate extension functions for it.

-
fun VariableDeclarator.arrayExpression(block: ArrayExpression.() -> Unit): ArrayExpression {
+

So VariableDeclarator should have methods to create all Expression subclasses. These creation methods are added as extension functions.

+

When we parse @swc/types declaration file, we need to check the type of the attribute, if it is converted to Kotlin and is a class, then find out all its non-intermediate subclasses, and then generate extension functions for it.

+
fun VariableDeclarator.arrayExpression(block: ArrayExpression.() -> Unit): ArrayExpression {
     return ArrayExpressionImpl().apply(block)
 }
-

This allows the Expression class to be constructed from the arrayExpression {} function in variableDeclarator {}.

-

-TemplateLiteral vs TsTemplateLiteralType -

-

There's another special case to deal with here. TemplateLiteral conflicts with TsTemplateLiteralType, whose type is "TemplateLiteral". This makes DSL-built ASTs unserializable. See the definition of a structure in rust.

-
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/typescript.rs#L823
+

This allows the Expression class to be constructed from the arrayExpression {} function in variableDeclarator {}.
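
The generation step for these extension functions might look roughly like the sketch below. It assumes the generator already knows the receiver class and the list of non-intermediate subclasses of the property type. The function names are hypothetical, and the Impl suffix simply mirrors the ArrayExpressionImpl naming seen in the generated function above.

```typescript
// Lower-cases the first letter: "ArrayExpression" -> "arrayExpression".
function lowerFirst(name: string): string {
    return name.charAt(0).toLowerCase() + name.slice(1);
}

// Emits one Kotlin extension function per concrete subtype.
// receiver: the class whose DSL block gains the builders, e.g. "VariableDeclarator".
// leafTypes: the non-intermediate subclasses of the property type, e.g. ["ArrayExpression"].
function renderBuilderExtensions(receiver: string, leafTypes: string[]): string {
    return leafTypes
        .map((t) => [
            `fun ${receiver}.${lowerFirst(t)}(block: ${t}.() -> Unit): ${t} {`,
            `    return ${t}Impl().apply(block)`,
            `}`,
        ].join('\n'))
        .join('\n\n');
}
```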

+

TemplateLiteral vs TsTemplateLiteralType

+

There's another special case to deal with here. TemplateLiteral conflicts with TsTemplateLiteralType, whose type is "TemplateLiteral". This makes DSL-built ASTs unserializable. See the definition of a structure in rust.

+
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/typescript.rs#L823
 #[ast_node("TemplateLiteral")]
 #[derive(Eq, Hash, EqIgnoreSpan)]
 #[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]
 pub struct TsTplLitType {
     // ...
 }
-
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/expr.rs#L1060
+
// https://github.com/swc-project/swc/blob/828190c035d61e6521280e2260c511bc02b81327/crates/swc_ecma_ast/src/expr.rs#L1060
 #[ast_node("TemplateLiteral")]
 #[derive(Eq, Hash, EqIgnoreSpan)]
 #[cfg_attr(feature = "arbitrary", derive(arbitrary::Arbitrary))]
@@ -575,8 +569,8 @@ 

pub quasis: Vec<TplElement>, }

These two types need to be handled separately and not generated by a script.

-

Implement both TemplateLiteral, TsTemplateLiteralType by one class. When used, it is then up-converted to TemplateLiteral, TsTemplateLiteralType.

-
// ignore annotation
+

Implement both TemplateLiteral, TsTemplateLiteralType by one class. When used, it is then up-converted to TemplateLiteral, TsTemplateLiteralType.

+
-

New parseSync -

-

Now we can upgrade the parseSync signature.

-
@Throws(RuntimeException::class)
+

New parseSync

+

Now we can upgrade the parseSync signature.

+
@Throws(RuntimeException::class)
 fun parseSync(code: String, options: ParserConfig, filename: String?): Program 

Type safety and type hints are now guaranteed when used.

-
val program = SwcNative().parseSync(
+
val program = SwcNative().parseSync(
     """
     function App() {
        return <div>App</div>
@@ -620,7 +613,7 @@ 

Conclusion

+

Conclusion

Here, we have explained the idea and core implementation points of SWC JVM binding: 1. SWC supports JNI; 2. AST JSON is serialized into Kotlin classes; 3. ASTs and configurations are described through DSL.

Some details are not covered, such as the handling of edge cases in the Kotlin generation scripts, Rust cross-compilation, etc. If you are interested in the details, you can read the source code at yidafu/swc-binding.

If you need to compile JS in the JVM, SWC JVM binding has been released to the Maven Central repository; use dev.yidafu.swc:swc-binding:0.5.0.

@@ -628,4 +621,3 @@