refactor: use the faster get_device_type
#763
Merged
avik-pal force-pushed the ap/get_device_type branch 2 times, most recently from 4f3ea76 to de6b551 (July 13, 2024 03:31)
Benchmark Results
Benchmark suite | Current: 5356d1b | Previous: 6a9ef65 | Ratio |
---|---|---|---|
Dense(2 => 2)/cpu/reverse/ReverseDiff (compiled)/(2, 128) | 3681.875 ns | 3705.75 ns | 0.99 |
Dense(2 => 2)/cpu/reverse/Zygote/(2, 128) | 7241.833333333333 ns | 7107.4 ns | 1.02 |
Dense(2 => 2)/cpu/reverse/Tracker/(2, 128) | 20473.5 ns | 20799 ns | 0.98 |
Dense(2 => 2)/cpu/reverse/ReverseDiff/(2, 128) | 9846.4 ns | 9710.3 ns | 1.01 |
Dense(2 => 2)/cpu/reverse/Flux/(2, 128) | 9045 ns | 9047 ns | 1.00 |
Dense(2 => 2)/cpu/reverse/SimpleChains/(2, 128) | 4557.25 ns | 4463.375 ns | 1.02 |
Dense(2 => 2)/cpu/reverse/Enzyme/(2, 128) | 1175.327205882353 ns | 1160.731884057971 ns | 1.01 |
Dense(2 => 2)/cpu/forward/NamedTuple/(2, 128) | 1178.4166666666667 ns | 1111.9610389610389 ns | 1.06 |
Dense(2 => 2)/cpu/forward/ComponentArray/(2, 128) | 1172.0551724137931 ns | 1187.8854961832062 ns | 0.99 |
Dense(2 => 2)/cpu/forward/Flux/(2, 128) | 1780 ns | 1791.5333333333333 ns | 0.99 |
Dense(2 => 2)/cpu/forward/SimpleChains/(2, 128) | 179.6008344923505 ns | 179.46262341325811 ns | 1.00 |
Dense(20 => 20)/cpu/reverse/ReverseDiff (compiled)/(20, 128) | 17352 ns | 17282 ns | 1.00 |
Dense(20 => 20)/cpu/reverse/Zygote/(20, 128) | 16792 ns | 16862 ns | 1.00 |
Dense(20 => 20)/cpu/reverse/Tracker/(20, 128) | 37310 ns | 37170 ns | 1.00 |
Dense(20 => 20)/cpu/reverse/ReverseDiff/(20, 128) | 29325 ns | 29185 ns | 1.00 |
Dense(20 => 20)/cpu/reverse/Flux/(20, 128) | 20118 ns | 20228 ns | 0.99 |
Dense(20 => 20)/cpu/reverse/SimpleChains/(20, 128) | 17192 ns | 17303 ns | 0.99 |
Dense(20 => 20)/cpu/reverse/Enzyme/(20, 128) | 4379.714285714285 ns | 4306.714285714285 ns | 1.02 |
Dense(20 => 20)/cpu/forward/NamedTuple/(20, 128) | 3908.5 ns | 3868.5 ns | 1.01 |
Dense(20 => 20)/cpu/forward/ComponentArray/(20, 128) | 3967.375 ns | 3936.25 ns | 1.01 |
Dense(20 => 20)/cpu/forward/Flux/(20, 128) | 4990.857142857143 ns | 4992.142857142857 ns | 1.00 |
Dense(20 => 20)/cpu/forward/SimpleChains/(20, 128) | 1660.1 ns | 1654.1 ns | 1.00 |
Conv((3, 3), 3 => 3)/cpu/reverse/ReverseDiff (compiled)/(64, 64, 3, 128) | 38961143 ns | 48611821 ns | 0.80 |
Conv((3, 3), 3 => 3)/cpu/reverse/Zygote/(64, 64, 3, 128) | 57972611.5 ns | 57919354 ns | 1.00 |
Conv((3, 3), 3 => 3)/cpu/reverse/Tracker/(64, 64, 3, 128) | 76624136 ns | 109926255 ns | 0.70 |
Conv((3, 3), 3 => 3)/cpu/reverse/ReverseDiff/(64, 64, 3, 128) | 89077765 ns | 107281465 ns | 0.83 |
Conv((3, 3), 3 => 3)/cpu/reverse/Flux/(64, 64, 3, 128) | 72966465.5 ns | 91622516 ns | 0.80 |
Conv((3, 3), 3 => 3)/cpu/reverse/SimpleChains/(64, 64, 3, 128) | 11815804 ns | 11726802 ns | 1.01 |
Conv((3, 3), 3 => 3)/cpu/reverse/Enzyme/(64, 64, 3, 128) | 7058451.5 ns | 6972718 ns | 1.01 |
Conv((3, 3), 3 => 3)/cpu/forward/NamedTuple/(64, 64, 3, 128) | 7157573 ns | 7145941 ns | 1.00 |
Conv((3, 3), 3 => 3)/cpu/forward/ComponentArray/(64, 64, 3, 128) | 7103084.5 ns | 7086834.5 ns | 1.00 |
Conv((3, 3), 3 => 3)/cpu/forward/Flux/(64, 64, 3, 128) | 10121933 ns | 18155782 ns | 0.56 |
Conv((3, 3), 3 => 3)/cpu/forward/SimpleChains/(64, 64, 3, 128) | 6431712 ns | 6400164 ns | 1.00 |
vgg16/cpu/reverse/Zygote/(32, 32, 3, 16) | 699918077 ns | 695663529 ns | 1.01 |
vgg16/cpu/reverse/Zygote/(32, 32, 3, 64) | 2595036967 ns | 2551834000 ns | 1.02 |
vgg16/cpu/reverse/Zygote/(32, 32, 3, 2) | 133155770 ns | 144828075.5 ns | 0.92 |
vgg16/cpu/reverse/Tracker/(32, 32, 3, 16) | 852084721 ns | 937571020 ns | 0.91 |
vgg16/cpu/reverse/Tracker/(32, 32, 3, 64) | 3006347452 ns | 3760657273 ns | 0.80 |
vgg16/cpu/reverse/Tracker/(32, 32, 3, 2) | 190807268.5 ns | 218456733 ns | 0.87 |
vgg16/cpu/reverse/Flux/(32, 32, 3, 16) | 665464895.5 ns | 1023397506 ns | 0.65 |
vgg16/cpu/reverse/Flux/(32, 32, 3, 64) | 2643466986 ns | 2833424793 ns | 0.93 |
vgg16/cpu/reverse/Flux/(32, 32, 3, 2) | 125579236 ns | 135152975 ns | 0.93 |
vgg16/cpu/forward/NamedTuple/(32, 32, 3, 16) | 175649989.5 ns | 173677436.5 ns | 1.01 |
vgg16/cpu/forward/NamedTuple/(32, 32, 3, 64) | 657092817.5 ns | 667093038.5 ns | 0.99 |
vgg16/cpu/forward/NamedTuple/(32, 32, 3, 2) | 35192809 ns | 35692896 ns | 0.99 |
vgg16/cpu/forward/ComponentArray/(32, 32, 3, 16) | 168010917.5 ns | 165432458 ns | 1.02 |
vgg16/cpu/forward/ComponentArray/(32, 32, 3, 64) | 637595858 ns | 638281808 ns | 1.00 |
vgg16/cpu/forward/ComponentArray/(32, 32, 3, 2) | 30409000 ns | 30282180 ns | 1.00 |
vgg16/cpu/forward/Flux/(32, 32, 3, 16) | 186296351 ns | 228436072 ns | 0.82 |
vgg16/cpu/forward/Flux/(32, 32, 3, 64) | 721752628 ns | 907887712 ns | 0.79 |
vgg16/cpu/forward/Flux/(32, 32, 3, 2) | 40378123 ns | 37665017.5 ns | 1.07 |
Conv((3, 3), 64 => 64)/cpu/reverse/ReverseDiff (compiled)/(64, 64, 64, 128) | 1282290559 ns | 1202743851 ns | 1.07 |
Conv((3, 3), 64 => 64)/cpu/reverse/Zygote/(64, 64, 64, 128) | 1898061333 ns | 1871595149.5 ns | 1.01 |
Conv((3, 3), 64 => 64)/cpu/reverse/Tracker/(64, 64, 64, 128) | 2475681575 ns | 2375315336 ns | 1.04 |
Conv((3, 3), 64 => 64)/cpu/reverse/ReverseDiff/(64, 64, 64, 128) | 2586274658 ns | 2568230230 ns | 1.01 |
Conv((3, 3), 64 => 64)/cpu/reverse/Flux/(64, 64, 64, 128) | 1885575995.5 ns | 1850936169 ns | 1.02 |
Conv((3, 3), 64 => 64)/cpu/reverse/Enzyme/(64, 64, 64, 128) | 332640402.5 ns | 325578942 ns | 1.02 |
Conv((3, 3), 64 => 64)/cpu/forward/NamedTuple/(64, 64, 64, 128) | 331002918 ns | 325502546 ns | 1.02 |
Conv((3, 3), 64 => 64)/cpu/forward/ComponentArray/(64, 64, 64, 128) | 332588478 ns | 327773885 ns | 1.01 |
Conv((3, 3), 64 => 64)/cpu/forward/Flux/(64, 64, 64, 128) | 448113081 ns | 362130805 ns | 1.24 |
Conv((3, 3), 1 => 1)/cpu/reverse/ReverseDiff (compiled)/(64, 64, 1, 128) | 11972300 ns | 11798402 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/reverse/Zygote/(64, 64, 1, 128) | 18045592.5 ns | 17952895 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/reverse/Tracker/(64, 64, 1, 128) | 19361053.5 ns | 19111937 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/reverse/ReverseDiff/(64, 64, 1, 128) | 23966453.5 ns | 23813592 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/reverse/Flux/(64, 64, 1, 128) | 17999909 ns | 17994614 ns | 1.00 |
Conv((3, 3), 1 => 1)/cpu/reverse/SimpleChains/(64, 64, 1, 128) | 1176247 ns | 1161846 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/reverse/Enzyme/(64, 64, 1, 128) | 2073893 ns | 2071410 ns | 1.00 |
Conv((3, 3), 1 => 1)/cpu/forward/NamedTuple/(64, 64, 1, 128) | 2088462 ns | 2088598.5 ns | 1.00 |
Conv((3, 3), 1 => 1)/cpu/forward/ComponentArray/(64, 64, 1, 128) | 2101436 ns | 2095050 ns | 1.00 |
Conv((3, 3), 1 => 1)/cpu/forward/Flux/(64, 64, 1, 128) | 2101003.5 ns | 2087753 ns | 1.01 |
Conv((3, 3), 1 => 1)/cpu/forward/SimpleChains/(64, 64, 1, 128) | 207677 ns | 204526 ns | 1.02 |
Dense(200 => 200)/cpu/reverse/ReverseDiff (compiled)/(200, 128) | 295128 ns | 294988 ns | 1.00 |
Dense(200 => 200)/cpu/reverse/Zygote/(200, 128) | 265821 ns | 266894 ns | 1.00 |
Dense(200 => 200)/cpu/reverse/Tracker/(200, 128) | 368565 ns | 370339.5 ns | 1.00 |
Dense(200 => 200)/cpu/reverse/ReverseDiff/(200, 128) | 409452 ns | 410705 ns | 1.00 |
Dense(200 => 200)/cpu/reverse/Flux/(200, 128) | 274032 ns | 277925 ns | 0.99 |
Dense(200 => 200)/cpu/reverse/SimpleChains/(200, 128) | 410223 ns | 409543 ns | 1.00 |
Dense(200 => 200)/cpu/reverse/Enzyme/(200, 128) | 83577 ns | 83628 ns | 1.00 |
Dense(200 => 200)/cpu/forward/NamedTuple/(200, 128) | 81282 ns | 82105 ns | 0.99 |
Dense(200 => 200)/cpu/forward/ComponentArray/(200, 128) | 81474 ns | 83929 ns | 0.97 |
Dense(200 => 200)/cpu/forward/Flux/(200, 128) | 87365 ns | 87756 ns | 1.00 |
Dense(200 => 200)/cpu/forward/SimpleChains/(200, 128) | 104537 ns | 104467 ns | 1.00 |
Conv((3, 3), 16 => 16)/cpu/reverse/ReverseDiff (compiled)/(64, 64, 16, 128) | 192440856.5 ns | 190419837 ns | 1.01 |
Conv((3, 3), 16 => 16)/cpu/reverse/Zygote/(64, 64, 16, 128) | 327419012 ns | 328138588.5 ns | 1.00 |
Conv((3, 3), 16 => 16)/cpu/reverse/Tracker/(64, 64, 16, 128) | 388732378 ns | 383045958 ns | 1.01 |
Conv((3, 3), 16 => 16)/cpu/reverse/ReverseDiff/(64, 64, 16, 128) | 437058555.5 ns | 460735044.5 ns | 0.95 |
Conv((3, 3), 16 => 16)/cpu/reverse/Flux/(64, 64, 16, 128) | 375210493 ns | 380286964 ns | 0.99 |
Conv((3, 3), 16 => 16)/cpu/reverse/SimpleChains/(64, 64, 16, 128) | 340892694 ns | 334237556 ns | 1.02 |
Conv((3, 3), 16 => 16)/cpu/reverse/Enzyme/(64, 64, 16, 128) | 44228683 ns | 44410897 ns | 1.00 |
Conv((3, 3), 16 => 16)/cpu/forward/NamedTuple/(64, 64, 16, 128) | 44336900 ns | 44630099.5 ns | 0.99 |
Conv((3, 3), 16 => 16)/cpu/forward/ComponentArray/(64, 64, 16, 128) | 44126115 ns | 44306468 ns | 1.00 |
Conv((3, 3), 16 => 16)/cpu/forward/Flux/(64, 64, 16, 128) | 60202256 ns | 50448519 ns | 1.19 |
Conv((3, 3), 16 => 16)/cpu/forward/SimpleChains/(64, 64, 16, 128) | 28376001 ns | 28143904 ns | 1.01 |
Dense(2000 => 2000)/cpu/reverse/ReverseDiff (compiled)/(2000, 128) | 19329164 ns | 19485606 ns | 0.99 |
Dense(2000 => 2000)/cpu/reverse/Zygote/(2000, 128) | 19767186.5 ns | 19747168 ns | 1.00 |
Dense(2000 => 2000)/cpu/reverse/Tracker/(2000, 128) | 23755998 ns | 23603854 ns | 1.01 |
Dense(2000 => 2000)/cpu/reverse/ReverseDiff/(2000, 128) | 24325973 ns | 24284477.5 ns | 1.00 |
Dense(2000 => 2000)/cpu/reverse/Flux/(2000, 128) | 19745972 ns | 19844857.5 ns | 1.00 |
Dense(2000 => 2000)/cpu/reverse/Enzyme/(2000, 128) | 6612217 ns | 6534898 ns | 1.01 |
Dense(2000 => 2000)/cpu/forward/NamedTuple/(2000, 128) | 6569116 ns | 6565242 ns | 1.00 |
Dense(2000 => 2000)/cpu/forward/ComponentArray/(2000, 128) | 6517957 ns | 6551615 ns | 0.99 |
Dense(2000 => 2000)/cpu/forward/Flux/(2000, 128) | 6517358 ns | 6522759 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
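For anyone reading the table by hand: the Ratio column appears to be the current time divided by the previous time, rounded to two decimals, so values below 1.00 mean the current commit ran faster. The sketch below reproduces that arithmetic in Python on a few rows copied from the table. The `classify` helper and its 10% `TOLERANCE` are hypothetical, added only for illustration; they are not settings taken from this workflow or from github-action-benchmark.

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Current time divided by previous time, rounded to two decimals."""
    return round(current_ns / previous_ns, 2)


# Hypothetical tolerance for illustration only; not read from the workflow config.
TOLERANCE = 0.10


def classify(current_ns: float, previous_ns: float, tolerance: float = TOLERANCE) -> str:
    """Label a row as 'faster', 'slower', or 'unchanged' relative to the tolerance."""
    r = ratio(current_ns, previous_ns)
    if r > 1 + tolerance:
        return "slower"
    if r < 1 - tolerance:
        return "faster"
    return "unchanged"


# Rows copied from the table: (benchmark name, current ns, previous ns).
rows = [
    ("Dense(2 => 2)/cpu/reverse/ReverseDiff (compiled)/(2, 128)", 3681.875, 3705.75),
    ("Conv((3, 3), 3 => 3)/cpu/forward/Flux/(64, 64, 3, 128)", 10121933, 18155782),
    ("Conv((3, 3), 64 => 64)/cpu/forward/Flux/(64, 64, 64, 128)", 448113081, 362130805),
]

for name, current, previous in rows:
    print(f"{name}: ratio={ratio(current, previous):.2f} ({classify(current, previous)})")
```

Single-run timings on shared CI machines are noisy, so a ratio outside the tolerance is a prompt to re-run the benchmark rather than proof of a real change.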
avik-pal force-pushed the ap/get_device_type branch 3 times, most recently from 4f87e2f to a431979 (July 13, 2024 05:34)
avik-pal force-pushed the ap/get_device_type branch 2 times, most recently from 8b0f87f to 5356d1b (July 13, 2024 05:36)
Needs get_device_type MLDataDevices.jl#54