Significantly speed up file handling error paths #17920

Merged: 1 commit merged into python:master on Oct 14, 2024

Conversation

@hauntsaninja (Collaborator) commented on Oct 11, 2024

It's late at night, so I'm probably measuring wrong, but I think I got a 1.5x speed up on the following when running at compile level 3, going from 50s to 33s on a Linux box:

```
mypy -c 'import torch' --no-incremental
```

Not really a measurable change when run interpreted.

See #17919

I'm probably measuring wrong, but I think I got a 1.5x speed up on
the following when running compiled:
```
mypy -c 'import torch' --no-incremental
```
@hauntsaninja changed the title from "Significantly speed up file handling paths???" to "Significantly speed up file handling paths" on Oct 11, 2024
@hauntsaninja marked this pull request as a draft on October 11, 2024 at 05:42
A contributor commented:

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@hauntsaninja changed the title from "Significantly speed up file handling paths" to "Significantly speed up file handling paths (when compiled)" on Oct 11, 2024
@hauntsaninja marked this pull request as ready for review on October 11, 2024 at 07:42
@hauntsaninja (Collaborator, Author) commented on Oct 11, 2024

Okay, I think I am benchmarking this right, but I only see the win on -c "import torch", not on anything in the mypy_primer corpus (this is an artifact of primer not having long search paths). I'll double-check some other use cases during the day tomorrow.

If you have a benchmarking setup, I'd be curious if you see an effect!

@JukkaL (Collaborator) commented on Oct 11, 2024

I'm not seeing much of a difference on my Linux desktop:

Interpreted, master:

```
(torch) jukka mypy $ time mypy --no-incremental -c 'import torch'
Success: no issues found in 1 source file

real    0m21.448s
user    0m20.794s
sys     0m0.586s
```

Interpreted, PR branch:

```
(torch) jukka mypy $ time mypy --no-incremental -c 'import torch'
Success: no issues found in 1 source file

real    0m21.464s
user    0m20.738s
sys     0m0.666s
```

Compiled, master:

```
(torch) jukka mypy $ time mypy --no-incremental -c 'import torch'
Success: no issues found in 1 source file

real    0m7.097s
user    0m6.720s
sys     0m0.357s
```

Compiled, PR branch:

```
(torch) jukka mypy $ time mypy --no-incremental -c 'import torch'
Success: no issues found in 1 source file

real    0m7.045s
user    0m6.672s
sys     0m0.354s
```

Compiled, PR branch, incremental:

```
(torch) jukka mypy $ time mypy -c 'import torch'
Success: no issues found in 1 source file

real    0m1.093s
user    0m1.008s
sys     0m0.085s
```

Some hypotheses that could explain the differences:

  • You have some (security) software running which slows down file access. I've seen this happen.
    • I didn't have anything that hooks into file system access running on the system where I measured the above.
    • Maybe run top while mypy is running and check if anything other than mypy is using significant CPU.
  • Using a network disk that is slow (EBS?). I was using a local NVMe SSD.
  • Didn't measure using the correct configuration (e.g. confusion between compiled/interpreted).
  • Different versions of software.
  • Different hardware.
  • Use of virtualization (I didn't use any).
  • Not enough RAM (seems unlikely though).

My config:

  • Python 3.12.3, compiled by myself, Ubuntu package
  • Ubuntu 24.04
  • AMD Ryzen 9950X CPU
  • 64 GB RAM
  • Torch 2.4.1
  • Clang 18.1.3 with -O2 (when compiling mypy)
  • Mypy master commit: 46c108e

@JukkaL (Collaborator) commented on Oct 11, 2024

If the differences are due to the speed of file system access (which seems likely), this PR looks great! Fast file system access is far from universal, and we shouldn't assume a fast local disk.

@JukkaL (Collaborator) commented on Oct 11, 2024

Another hypothesis is that the module search path has an impact:

```
>>> sys.path
['', '/usr/lib/python312.zip', '/usr/lib/python3.12', '/usr/lib/python3.12/lib-dynload', '/home/jukka/venv/torch/lib/python3.12/site-packages']
```

@hauntsaninja (Collaborator, Author) commented on Oct 11, 2024

Hm, this PR doesn't change the amount of file system access. The saving should just be CPU time from cloning and raising and catching exceptions. My environment does have some differences from yours; I'll try to narrow it down tomorrow morning. I do have a very long module search path in this environment, which is my best guess for what's going on.
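
For illustration, a minimal sketch of the pattern under discussion (hypothetical names; the actual diff is not shown in this thread): an exception-based cache lookup pays for copying, raising, and catching an OSError on every miss, which is comparatively expensive in mypyc-compiled code, while an exception-free variant can return None on the hot error path.

```python
import copy
import os
from typing import Optional


class HypotheticalStatCache:
    """Illustrative only; not mypy's actual fscache implementation."""

    def __init__(self) -> None:
        self.stat_cache: dict[str, os.stat_result] = {}
        self.error_cache: dict[str, OSError] = {}

    def stat(self, path: str) -> os.stat_result:
        # Exception-based lookup: every miss copies, raises, and catches an OSError.
        if path in self.stat_cache:
            return self.stat_cache[path]
        if path in self.error_cache:
            raise copy.copy(self.error_cache[path])
        try:
            st = os.stat(path)
        except OSError as err:
            self.error_cache[path] = err
            raise
        self.stat_cache[path] = st
        return st

    def stat_or_none(self, path: str) -> Optional[os.stat_result]:
        # Exception-free lookup: misses just return None, so callers that only
        # probe for existence never pay for raising and catching.
        if path in self.stat_cache:
            return self.stat_cache[path]
        if path in self.error_cache:
            return None
        try:
            st = os.stat(path)
        except OSError as err:
            self.error_cache[path] = err
            return None
        self.stat_cache[path] = st
        return st
```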

@JukkaL (Collaborator) commented on Oct 11, 2024

> The saving should just be CPU time from cloning and raising and catching exceptions.

Would it help if we tried to avoid some of these exceptions by calling os.path.exists before stat, for example?
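
A hedged sketch of that suggestion (hypothetical helper, not mypy's API). Note that os.path.exists still performs a stat under the hood, so this would mainly move the raise/catch out of compiled code rather than eliminate a syscall:

```python
import os
from typing import Optional


def stat_if_exists(path: str) -> Optional[os.stat_result]:
    # os.path.exists swallows the OSError internally, so the compiled caller
    # never raises or catches anything on the common "file missing" path.
    if not os.path.exists(path):
        return None
    try:
        # The file can still disappear between the two calls, so keep a guard.
        return os.stat(path)
    except OSError:
        return None
```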

@JukkaL (Collaborator) commented on Oct 11, 2024

> The saving should just be CPU time from cloning and raising and catching exceptions.

Another random idea would be to cache the results of listdir(...) as sets for the various directories where we perform stat calls. Then we could use the sets to quickly check whether a file exists, without a syscall. This assumes we are calling stat unsuccessfully on many files in the same directories. Alternatively, we could cache whether various directories exist; if a directory doesn't exist, there's no point calling stat on a file in that directory, as we know it will fail.
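
A rough sketch of the caching idea, with hypothetical helpers (not what mypy does today): each directory's contents are remembered as a set, so repeated existence checks against the same directory become set lookups instead of stat calls that mostly fail. The cache is never invalidated, which is fine for a single batch run:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def dir_entries(directory: str) -> frozenset[str]:
    # One listdir per directory; a missing or unreadable directory is cached
    # as an empty set, so nothing inside it can "exist" either.
    try:
        return frozenset(os.listdir(directory))
    except OSError:
        return frozenset()


def exists_fast(path: str) -> bool:
    head, tail = os.path.split(path)
    # Fall back to "." for bare filenames so listdir gets a valid directory.
    return tail in dir_entries(head or ".")
```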

@JukkaL (Collaborator) commented on Oct 11, 2024

I added 100 empty directories to PYTHONPATH. This slowed things down a bit, but now this PR gives me a 20% performance gain (13.1s vs 10.9s with a compiled mypy).
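
For reference, one way to reproduce this setup from Python (the mypy invocation mirrors the benchmark above and assumes mypy and torch are installed; directory names are arbitrary):

```python
import os
import subprocess
import tempfile
import time

# Create 100 empty directories and put them all on PYTHONPATH to mimic a long
# module search path, then time the same check as in the benchmarks above.
base = tempfile.mkdtemp(prefix="empty-dirs-")
dirs = [os.path.join(base, str(i)) for i in range(100)]
for d in dirs:
    os.makedirs(d)

env = dict(os.environ, PYTHONPATH=os.pathsep.join(dirs))
start = time.perf_counter()
subprocess.run(
    ["mypy", "--no-incremental", "-c", "import torch"],
    env=env,
    check=False,
)
print(f"elapsed: {time.perf_counter() - start:.1f}s")
```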

@hauntsaninja changed the title from "Significantly speed up file handling paths (when compiled)" to "Significantly speed up file handling error paths (when compiled)" on Oct 11, 2024
@hauntsaninja (Collaborator, Author) commented on Oct 11, 2024

Yep, I can confirm that a longer search path is what makes the error handling code here so hot.

In my main work environment, we install all first party packages editably, so it's common to have 100s of entries in the search path.

I tried this script:

```
rm -rf v1
python -m venv v1
uv pip install torch --python v1/bin/python
hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental' '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental'

rm -rf v2
python -m venv v2
uv pip install torch --python v2/bin/python
for i in $(seq 1 200); do
    dir=$(pwd)/repo/$i
    mkdir -p $dir
    echo $dir >> $(v2/bin/python -c "import site; print(site.getsitepackages()[0])")/repo.pth
done
hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental' '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental'
```

Normal venv (v1):

```
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental
  Time (mean ± σ):     18.829 s ±  0.262 s    [User: 16.568 s, System: 2.219 s]
  Range (min … max):   18.648 s … 19.129 s    3 runs

Benchmark 2: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental
  Time (mean ± σ):     19.311 s ±  0.292 s    [User: 17.092 s, System: 2.176 s]
  Range (min … max):   18.981 s … 19.538 s    3 runs

Summary
  /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental ran
    1.03 ± 0.02 times faster than /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v1/bin/python --no-incremental
```

Many search paths venv (v2):

```
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental
  Time (mean ± σ):     24.896 s ±  0.126 s    [User: 22.181 s, System: 2.680 s]
  Range (min … max):   24.798 s … 25.039 s    3 runs

Benchmark 2: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental
  Time (mean ± σ):     34.830 s ±  0.145 s    [User: 32.004 s, System: 2.798 s]
  Range (min … max):   34.737 s … 34.996 s    3 runs

Summary
  /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental ran
    1.40 ± 0.01 times faster than /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=v2/bin/python --no-incremental
```

When run interpreted, the gap is a little smaller in absolute terms (maybe mypyc exceptions are more expensive?), and obviously much smaller in relative terms.

There's still some more gap between this and the environment I was running in earlier, I'll look into it further.

@hauntsaninja (Collaborator, Author) commented on Oct 11, 2024

> Another random idea would be to cache results of listdir(...) as sets

Yeah, I do this in some other module resolution code I wrote a while back. It works pretty well, especially since scandir might let us avoid separate stats in some cases, so it could conceivably be a win on short search paths as well.
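
A sketch of the scandir variant mentioned here (hypothetical helper): a single scandir per directory yields names together with file-type information, so both "does it exist?" and "is it a regular file?" can usually be answered without a separate stat per candidate path:

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def files_in(directory: str) -> frozenset[str]:
    # DirEntry.is_file() can usually reuse type information gathered during
    # the directory scan, avoiding a separate stat for most entries.
    try:
        with os.scandir(directory) as it:
            return frozenset(entry.name for entry in it if entry.is_file())
    except OSError:
        return frozenset()


def is_file_cached(path: str) -> bool:
    head, tail = os.path.split(path)
    return tail in files_in(head or ".")
```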

@hauntsaninja changed the title from "Significantly speed up file handling error paths (when compiled)" to "Significantly speed up file handling error paths" on Oct 11, 2024
@hauntsaninja merged commit c32d11e into python:master on Oct 14, 2024
18 checks passed
@hauntsaninja deleted the fast branch on October 14, 2024 at 00:50
hauntsaninja added a commit that referenced this pull request Oct 20, 2024
This can have a huge overall impact on mypy performance when search paths are long