Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalized filename in cache #997

Merged
merged 4 commits into from
Apr 20, 2020
Merged

Conversation

sylc
Copy link
Contributor

@sylc sylc commented Apr 5, 2020

Fix #754 (comment)

These changes make the filename used to index the cache normalized.

Tested on windows only on https://github.com/sylc/ts-node-debug

@sylc sylc changed the title Normalized filename Normalized filename in cache Apr 5, 2020
@cspotcode
Copy link
Collaborator

cspotcode commented Apr 8, 2020

@sylc! Thanks. I did some testing locally on this very large sample project: https://github.com/TypeStrong/ts-node-repros/tree/754/src
https://github.com/TypeStrong/ts-node-repros/blob/754/run.ts

Turns out we need to normalize paths at the very top of getOutput (or even pass a normalized path into getOutput)

The problem is, even if we keep normalized paths in our cache, we still end up passing un-normalized paths to the compiler, for example to getEmitOutput() and getSemanticDiagnostics().

Node's require mechanism passes us Windows \ paths. But the TypeScript compiler always deals in / paths, based on what we get back from config file parsing with --files enabled. To keep things simple, we should normalize paths as soon as they get to us (in _compile or at the top of getOutput)

When I made that change, and with --files enabled, ts-node was able to load 1000 files, typecheck them all at the start, then never again. This made the code run super-fast.

It was still terrible without --files and with dynamic require() calls, because each require() changed the project and required a full typecheck. This is just a limitation of typechecking, and people who want to avoid it will have to change how they do things.

…t of ts-node's codebase only deals with / paths
@cspotcode
Copy link
Collaborator

@sylc I went ahead and pushed an implementation of what I described. We can always revert back to yours if it turns out I was wrong.

@sylc
Copy link
Contributor Author

sylc commented Apr 8, 2020

@cspotcode thanks for looking into it. I tested your changes on my project and it is working. (timings are the same as with tsc 3.7.5) 👍

Altough I have a couple of question:

  • I ran your test repo with your ts-node changes and it says that ts-node with --files is faster than tsc enabled. I found that surprising. good but surprising. Do you know why? Am I miss reading the test?
+ ./node_modules/.bin/tsc --project .

real    0m3.332s
user    0m0.030s
sys     0m0.107s
+ ./node_modules/.bin/ts-node ./run.ts

real    0m41.985s
user    0m0.061s
sys     0m0.136s
+ ./node_modules/.bin/ts-node --files ./run.ts

real    0m3.070s
user    0m0.015s
sys     0m0.106s
  • Regarding --files: I use ts-node to run a dev server. I do not use --files because the project has a lot of test files that are not necessary to run the dev server. My understanding was that not using --files was an easy way to separate the test files from the necessary files by only providing 1 entry point. So I would assume that ts-node go through the file tree and load the necessary files then run the type checking, like if --files was used but only with a subset fo files, which should make it faster than with --files .
    If it is not the case then for my dev server tsconfig.server.json, I should maintain a regex file list of only the files required for it and add the --files option. (But this is not as developer friendly). Do you agree?

thanks

@cspotcode
Copy link
Collaborator

cspotcode commented Apr 8, 2020

@sylc

@cspotcode thanks for looking into it. I tested your changes on my project and it is working. (timings are the same as with tsc 3.7.5) 👍

Awesome, thanks!

  • I ran your test repo with your ts-node changes and it says that ts-node with --files is faster than tsc enabled. I found that surprising. good but surprising. Do you know why? Am I miss reading the test?

The test is intentionally written to dynamically require 1000 files as opposed to statically importing them. This mimics the behavior of a test runner like mocha, which is programmatically discovering and require()ing all the test files.

In this situation, without --files, TypeScript does not see these files during the first typecheck. There are no import statements telling it to look at these files. When one is require()d, we have to tell the compiler to look at it by adding it to the files array. This means the Program has changed, and TypeScript must re-do a full typecheck.

With --files, TypeScript is told to look at all these files up-front. It parses all 1000 files and does a typecheck. When we require() them dynamically, they have already been included in the Program, so typechecking is not repeated.

The difference is typechecking the project 1000 times vs only once.

  • Regarding --files: I use ts-node to run a dev server. I do not use --files because the project has a lot of test files that are not necessary to run the dev server. My understanding was that not using --files was an easy way to separate the test files from the necessary files by only providing 1 entry point. So I would assume that ts-node go through the file tree and load the necessary files then run the type checking, like if --files was used but only with a subset fo files, which should make it faster than with --files .
    If it is not the case then for my dev server tsconfig.server.json, I should maintain a regex file list of only the files required for it and add the --files option. (But this is not as developer friendly). Do you agree?

@blakeembrey might have a better answer, but here's what I know.

If your codebase uses import statements, those files are immediately discovered and parsed during type-checking, because the typechecker understands imports. The imports are compiled into require() calls which are executed by node. Unfortunately, typechecking might still be re-done when the require calls execute.

The compiler has a distinction between "files discovered via imports" and "files explicitly included in the files array." Due to #999, we have to change how we deal with the latter. When require() executes, we must add that file to the explicit files array. Even if the typechecker has already seen and typechecked the file, it has now moved from "files discovered" into "files explicitly included." This means the Program changed, and typechecking is re-done.

I have another PR to avoid modifying the "files explicitly included" array unless absolutely necessary, which should help.

EDIT: specifically to answer your question about which option is faster:
It might actually be faster to take the nuclear approach: use --files and include everything, even your test files, in the typecheck. You can also make several tsconfig files for different situations, each globbing a different subset of your source code. Finally, I personally use --transpile-only to avoid typechecking in ts-node, and use tsc --watch or VSCode's tooltips for typechecking while I write code.

@cspotcode cspotcode merged commit 5e192d3 into TypeStrong:master Apr 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compilation is unbelievably slow
2 participants