Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate performance issues for experimental plugin #3598

Closed
kwonoj opened this issue Feb 16, 2022 · 38 comments
Closed

Investigate performance issues for experimental plugin #3598

kwonoj opened this issue Feb 16, 2022 · 38 comments
Assignees

Comments

@kwonoj
Copy link
Member

kwonoj commented Feb 16, 2022

I wrote a fairly simple plugin to extract the first string literal argument from calls to i18n.get, in order to collect translation strings that are used.

My goal was to transform an input file into a JSON array of strings. Please disregard that it currently will emit a semicolon at the end of that array :-)

use std::vec::Vec;
use swc_plugin::{ast::*, plugin_transform, syntax_pos::DUMMY_SP};

pub struct I18NVisitor {
    strings: Vec<String>,
}

impl Default for I18NVisitor {
    fn default() -> I18NVisitor {
        I18NVisitor {
            strings: Vec::new(),
        }
    }
}

impl Visit for I18NVisitor {
    noop_visit_type!();

    fn visit_call_expr(&mut self, call_expr: &CallExpr) {
        if let Callee::Expr(expr) = &call_expr.callee {
            if let Expr::Member(member_expr) = &**expr {
                if let (Expr::Ident(obj_id), MemberProp::Ident(prop_id)) =
                    (&*member_expr.obj, &member_expr.prop)
                {
                    if &*obj_id.sym == "i18n" && &*prop_id.sym == "get" {
                        let args = &call_expr.args;
                        let arg0 = &args[0];
                        let expr = &arg0.expr;
                        if let Expr::Lit(lit_expr) = &**expr {
                            if let Lit::Str(str_lit) = lit_expr {
                                self.strings.push(str_lit.value.to_string());
                            }
                        }
                    }
                }
            }
        }
    }
}

#[plugin_transform]
pub fn process_transform(program: Program, _config: String, _context: String) -> Program {
    let mut visitor = I18NVisitor::default();

    program.visit_with(&mut visitor);

    let strs = visitor.strings.into_iter().map(|s| {
        Some(ExprOrSpread {
            spread: None,
            expr: Box::new(Expr::Lit(Lit::Str(Str {
                span: DUMMY_SP,
                has_escape: false,
                kind: StrKind::default(),
                value: JsWord::from(s),
            }))),
        })
    });
    let expr = Box::new(Expr::Array(ArrayLit {
        span: DUMMY_SP,
        elems: strs.collect(),
    }));
    let stmt = Stmt::Expr(ExprStmt {
        span: DUMMY_SP,
        expr,
    });
    let body = Vec::from([ModuleItem::Stmt(stmt)]);

    return Program::Module(Module {
        span: DUMMY_SP,
        body,
        shebang: None,
    });
}

Now when trying this WASM plugin, it is a lot slower than the equivalent JavaScript version while processing about 4500 files individually. The JavaScript version takes about 5 seconds (using parseFile + visitor.visitModule), while the Rust WASM plugin version takes about a minute (using a lot more CPU as well!) and sometimes actually crashes with (likely caused by my lacking knowledge of writing code in Rust):

✘ [ERROR] [plugin i18n-plugin] failed to handle: failed to invoke plugin: failed to invoke `./swc_i18n.wasm` as js transform plugin at swc_i18n.wasm

Caused by:
    0: RuntimeError: out of bounds memory access
           at dlfree (<module>[538]:0xce51b)
           at free (<module>[537]:0xce131)
           at __plugin_process_impl (<module>[34]:0x7534)
           at __plugin_process_impl.command_export (<module>[566]:0xd0534)
    1: heap_get_oob

It is invoked via transformFile inside an esbuild build script as follows:

await transformFile(filePath, {
    syntax: 'typescript',
    tsx: true,
    jsc: {
    parser: {
        syntax: 'typescript',
        tsx: true,
    },
    experimental: {
        plugins: [['./swc_i18n.wasm', {}]],
    },
    },
});

Using @swc/core version 1.2.140 (JS) & swc_plugin version 0.27.0 (Rust)

I hoped I could speed things up even further using a native plugin, but got the opposite. What am I doing wrong? :)

Originally posted by @fxb in #3540 (comment)

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

  1. Which platform do you use? (Windows / Mac / Linux): usually this doesn't matter but just in case.
  2. Can you share full script invokes transformFile?
  3. Do you possibly have a repro repo?

nit:

inside an esbuild build script as follows:

Is this esbuild typo?

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

Current possible guess is the way we init wasm module - we instantiate it per each transform, which increases linear per input number grows. May try some attempt to reuse modules.

@kwonoj kwonoj added this to the v2.0.0-alpha.1 milestone Feb 16, 2022
@fxb
Copy link

fxb commented Feb 16, 2022

  1. macOS 12.1, MacBook Pro (15-inch, 2019, 2.4 GHz 8-Core Intel Core i9, 32 GB RAM)
  2. I'll post a simplified version of the esbuild plugin that calls SWC's transformFile below.
  3. I'm trying this out in an internal repo, and I can't make a public version of it that has the same amount of files (~4500) easily.

Also adding the last comment I posted in the discussion thread (#3540):

Simplifying it even further to exhibits the same problems:

use swc_plugin::{ast::*, plugin_transform};

#[plugin_transform]
pub fn process_transform(program: Program, _config: String, _context: String) -> Program {
    program
}

It must be that the WASM runtime overhead that hits 4500 times as it needs to instantiate the plugin for each file?

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

I'll post a simplified version of the esbuild plugin that calls SWC's transformFile below.

This is part I'm bit confused, are you saying you call swc inside of esbuild and with plugins?

@fxb
Copy link

fxb commented Feb 16, 2022

I'll post a simplified version of the esbuild plugin that calls SWC's transformFile below.

This is part I'm bit confused, are you saying you call swc inside of esbuild and with plugins?

Yes, I do have an esbuild based build script, which has a custom esbuild plugin, which in turn invokes SWC.

@fxb
Copy link

fxb commented Feb 16, 2022

This is a simplified version of the esbuild plugin (https://esbuild.github.io/plugins/) I wrote.

As esbuild itself offers no way of visiting the AST in a plugin I investigated using SWC for that part (instead of using Babel, which is slower).

Instead of using esbuild for all of this, it should be simple to write a script that just runs SWC's transformFile on a bunch of files individually.

const { parseFile, transformFile } = require('@swc/core');
const { Visitor } = require('@swc/core/Visitor.js');

class I18nVisitor extends Visitor {
  constructor() {
    super();

    this.strings = [];
  }

  visitCallExpression(node) {
    if (node.callee.type !== 'MemberExpression') {
      return super.visitCallExpression(node);
    }

    const isI18nObject =
      node.callee.object.type === 'Identifier' &&
      node.callee.object.value === 'i18n';
    const isGetProperty =
      node.callee.property.type === 'Identifier' &&
      node.callee.property.value === 'get';

    if (!isI18nObject || !isGetProperty) {
      return super.visitCallExpression(node);
    }

    if (node.arguments.length === 0) {
      throw new Error(
        `Invalid number of arguments to 'i18n.get' call! One argument is required.`,
      );
    }

    const arg = node.arguments[0].expression;
    if (arg.type !== 'StringLiteral') {
      throw new Error(
        `Invalid argument for 'i18n.get' call! Only string literals are allowed.`,
      );
    }

    this.strings.push(arg.value);

    return super.visitCallExpression(node);
  }

  visitTsType(node) {
    return node;
  }
}

const i18nPlugin = () => ({
  name: 'i18n-plugin',
  setup(build) {
    const strings = [];

    // /^(?!.*\.d\.tsx?$).*\.[tj]sx?$/
    build.onLoad(
      { filter: /\.(js|jsx|ts|tsx)$/ },
      async ({ path: filePath }) => {
        const visitor = new I18nVisitor();

        // Using 'parseFile' API (fastest)
        // const module = await parseFile(filePath, {
        //   syntax: 'typescript',
        //   tsx: true,
        // });
        // visitor.visitModule(module);

        // Using 'transformFile' API
        // await transformFile(filePath, {
        //   syntax: 'typescript',
        //   tsx: true,
        //   plugin: module => visitor.visitProgram(module),
        // });

        // Using 'transformFile' API with Rust (WASM) plugin (slowest)
        await transformFile(filePath, {
          jsc: {
            parser: {
              syntax: 'typescript',
              tsx: true,
            },
            experimental: {
              plugins: [['./swc_i18n.wasm', {}]],
            },
          },
        });

        strings.push(...visitor.strings);

        return undefined;
      },
    );

    build.onEnd(async () => {
      // eslint-disable-next-line no-console
      console.log(
        `[i18n-plugin] Found ${strings.length} strings!`,
      );
    });
  },
});

module.exports = i18nPlugin;

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

What happens if you isolate to invoke swc only? I doubt it makes huge difference but for better isolation between multiple processes.

@fxb
Copy link

fxb commented Feb 16, 2022

I've made this simple script to take esbuild out of the equation and just run SWC on the same file over and over again, once with the WASM plugin (no-op) and once without:

const { transformFile } = require('@swc/core');

const filePath = './path/to/file.tsx';

async function main() {
  for (let i = 0; i < 1000; ++i) {
    await transformFile(filePath, {
      jsc: {
        parser: {
          syntax: 'typescript',
          tsx: true,
        },
        // Commented this out for the second run
        experimental: {
          plugins: [['./swc_i18n.wasm', {}]],
        },
      },
    });
  }
}

main();

The first run (with WASM plugin enabled):

node ./transformFile.js  7.76s user 3.24s system 122% cpu 8.963 total

The second run (with WASM plugin commented out):

node ./transformFile.js  2.21s user 1.40s system 218% cpu 1.651 total

@RiESAEX
Copy link
Contributor

RiESAEX commented Feb 16, 2022

Does our in-memory bytecode cache work with js calls?
If we call transformFile 1000 times in js, how many times does swc try to read the file in .swc?

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

Theoretically, bytecode cache itself is being shared across transform* calls. If we see overhead, probably it's coming from actual module instantiation.

@kwonoj
Copy link
Member Author

kwonoj commented Feb 16, 2022

OK, this is baseline script I'll try to dig further.

const { transform } = require('.');
const { PerformanceObserver, performance } = require('perf_hooks');

const EXECUTE_COUNT = 5000;

const obs = new PerformanceObserver((items) => {
  console.log(items.getEntries()[0]);
  performance.clearMarks();
});

const arr = [...Array(EXECUTE_COUNT).keys()];

obs.observe({ entryTypes: ['measure'], buffered: true });

performance.mark("scheduleTask-start");
let tasks = arr.map((value) => transform(`console.log(${value})`, {
  jsc: {
    experimental: {
      plugins: [[`${pluginPath}`, {}]]
    }
  }
}));
performance.mark("scheduleTask-complete");
performance.measure("scheduleTask", "scheduleTask-start", "scheduleTask-complete");

performance.mark("executeTask-start");
Promise.all(tasks).then(() => {
  performance.mark("executeTask-complete");
  performance.measure("executeTask", "executeTask-start", "executeTask-complete");
});
PerformanceEntry {
  name: 'executeTask',
  entryType: 'measure',
  startTime: 0,
  duration: 23907.4127
}

@kwonoj
Copy link
Member Author

kwonoj commented Feb 17, 2022

Ok, could confirm module init caching is the first notable offender. Will work on revising cache logic.

@kwonoj
Copy link
Member Author

kwonoj commented Feb 18, 2022

@fxb let's try #3616 this once new version's published.

@joshcartme
Copy link

joshcartme commented Feb 23, 2022

👋 hey @kwonoj, am I correct in thinking that this landed in @swc/core 1.2.142?

I'm using swc-loader and on @swc/core 1.2.144 and getting RuntimeError: Out of bounds memory access when trying to use a plugin that is just an unmodified

$ swc plugin new test_plugin --target-type=wasm32-wasi

@kwonoj
Copy link
Member Author

kwonoj commented Feb 23, 2022

@joshcartme It sounds like bit different to the perf issue itself. Those specific path is currently under basic test coverage (https://github.com/swc-project/swc/blob/main/crates/swc_plugin_runner/tests/integration.rs#L91) which we haven't observed since.

I know you just mentioned it's from barebone plugin, but can you try to create reproducible repo to share?

@joshcartme
Copy link

@kwonoj yeah let me see if I can come up with a repro. Our build has a huge number of files and I'm wondering if it has to do with that.

@kwonoj
Copy link
Member Author

kwonoj commented Feb 23, 2022

@joshcartme I recall we used to use thread-loader for the webpack. Is it gone with swc-loader? and also are there some parallel thread / processes pushes memory pressure?

@joshcartme
Copy link

@kwonoj I did get rid of thread-loader to test this out, the only thing be applied to javascript/typescript files is swc-loader. That part of the webpack config looks like:

{
    test: /\.(j|t)sx?$/,
    exclude: excludeCertainFiles,
    use: [
        {
            loader: 'swc-loader',
            options: {
                jsc: {
                    parser: {
                        syntax: 'typescript',
                        tsx: true,
                    },
                    experimental: {
                        plugins: [
                            ['/path/to/test_plugin/target/wasm32-wasi/debug/test_plugin.wasm', {}],
                        ]
                    },
                },
            },
        },
    ],
},

And the error I'm getting is:

thread '<unnamed>' panicked at 'failed to invoke plugin: failed to invoke `/path/to/test_plugin/target/wasm32-wasi/debug/test_plugin.wasm` as js transform plugin at /path/to/test_plugin/target/wasm32-wasi/debug/test_plugin.wasm

Caused by:
    0: RuntimeError: out of bounds memory access
           at dlfree (<module>[11762]:0x381167)
           at free (<module>[11761]:0x380fcd)
           at __rdl_dealloc (<module>[11728]:0x37de6b)
           at __rust_dealloc (<module>[6818]:0x2a228c)
           at alloc::alloc::dealloc::hb3519a7bad02d435 (<module>[8949]:0x305c94)
           at <rkyv::util::aligned_vec::AlignedVec as core::ops::drop::Drop>::drop::h4cad9fb03489331b (<module>[9293]:0x311916)
           at core::ptr::drop_in_place<rkyv::util::aligned_vec::AlignedVec>::hb56ec81820140e26 (<module>[9079]:0x30b8d6)
           at core::ptr::drop_in_place<swc_common::plugin::Serialized>::hbe53c5a5af5d8bc2 (<module>[9109]:0x30c16f)
           at __plugin_process_impl (<module>[1010]:0x2f9d1)
           at __plugin_process_impl.command_export (<module>[11886]:0x38841a)
    1: heap_get_oob

Stack backtrace:
   0: _rust_eh_personality
   1: _wasmer_vm_f64_nearest
   2: _rust_eh_personality
   3: _rust_eh_personality
   4: _rust_eh_personality
   5: _rust_eh_personality
   6: _rust_eh_personality
   7: _rust_eh_personality
   8: <unknown>
   9: _worker
  10: __pthread_start', crates/swc/src/plugin.rs:130:14

@kwonoj
Copy link
Member Author

kwonoj commented Feb 23, 2022

I see. Please share repo once possible. Given stack it looks like it's more close to incorrectly deallocating memory instead of oom, but might need to peek.

@joshcartme
Copy link

joshcartme commented Feb 23, 2022

@kwonoj ok this is a very contrived example. It's not even really valid, but it fails in the same way:
swc-repro.js

import { blamo } from 'it-doesnt-exist';
import { wamo } from 'it-doesnt-exist';

// uncomment this and it works
// const factory = () => {};

export const blub = factory(
	() => {
		// or leave the definition of factory commented out and comment out any of these lines and it works
		if (blamo()) return;
		foo(updatedState);

		wamo();
		bat();

		if (true) {
			const bam = baz();
		}
	}
);

.swcrc (the plugin is unmodified output from creating a new plugin):

{
    "jsc": {
        "experimental": {
            "plugins": [
                ["/path/to/test_plugin/target/wasm32-wasi/debug/test_plugin.wasm", {}]
            ]
        }
    }
}

Then:

npx swc ./swc-repro.js

fails with the same error.

If you uncomment the line that defines factory, it works. Or if you don't do that and comment out any line inside the arrow function it works.

@joshcartme
Copy link

I just discovered that if swc-repro is at the top level, this repo doesn't work. But moving it to new directories, for example fun/stuff and then doing npx swc ./fun/stuff/swc-repro.js does make it fail. Let me attempt to make a completely clean repo which illustrates the issue

@joshcartme
Copy link

👋 ok @kwonoj here's a repro https://github.com/joshcartme/swc-oob-repro. I put a README with some instructions but happy to explain more as is helpful

@kwonoj
Copy link
Member Author

kwonoj commented Feb 24, 2022

@joshcartme it.. works on my machine? :lolsob:

image

@joshcartme
Copy link

oh boy! Did you build the plugin with

cargo build

or

cargo build --release

?

@kwonoj
Copy link
Member Author

kwonoj commented Feb 24, 2022

I did try both just to be sure.

@joshcartme
Copy link

joshcartme commented Feb 24, 2022

Is there a way I could share the built wasm file I have with you to see if that makes a difference?

edit: ooh and I wonder, I'm on os x, maybe it's an operating system specific thing

@kwonoj
Copy link
Member Author

kwonoj commented Feb 24, 2022

I think this #3732 might fix issue. Let's try once this goes in and new version published.

@joshcartme
Copy link

Sounds good, thanks!

fwiw, I've tried this on my work computer macOS 10.15.7 and my personal computer macOS 11.6 and they both fail in the same way

@kdy1
Copy link
Member

kdy1 commented Feb 25, 2022

As #3732 is a change of plugin macro, it can be tested without a new version of @swc/core

@joshcartme
Copy link

@kdy1 can I test it now, and if so how?

@kdy1
Copy link
Member

kdy1 commented Feb 25, 2022

Yes, and by upgrading the macro crate

@joshcartme
Copy link

Sorry to be dense, I'm pretty new to rust and SWC, is that just:

cargo install swc_cli

?

@joshcartme
Copy link

Oh so I think the answer to my question was that no, that's not the way. I added

swc_plugin_macro = "0.3.1"

to my cargo.toml

@kwonoj
Copy link
Member Author

kwonoj commented Feb 25, 2022

@joshcartme try cargo update .

@joshcartme
Copy link

Ok, with the update I can no longer repro on https://github.com/joshcartme/swc-oob-repro 🎉

I'm going to test things out in my bigger codebase and will report back. Thanks @kwonoj and @kdy1!

@joshcartme
Copy link

I'm not getting out of bounds memory access in my actual build anymore, thank you!

@kwonoj
Copy link
Member Author

kwonoj commented Mar 1, 2022

With recent updates, I think this is reasonably resolved. We don't want to track all of performance issues in a single issue, so let me close this. If this is still a problem please share repro. Otherwise, please create a new issue.

@kwonoj kwonoj closed this as completed Mar 1, 2022
@swc-bot
Copy link
Collaborator

swc-bot commented Oct 18, 2022

This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@swc-project swc-project locked as resolved and limited conversation to collaborators Oct 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

6 participants