feat: implement faster get_device_type #54

Merged: avik-pal merged 8 commits into main from ap/simple_getdevice on Jul 13, 2024

avik-pal (Member) commented Jul 12, 2024

julia> using LuxCUDA, LuxDeviceUtils, BenchmarkTools

julia> x = rand(10, 10) |> LuxCUDADevice();

julia> @code_typed get_device_type((x, x, x))
CodeInfo(
1 ─     return LuxCUDADevice
) => Type{LuxCUDADevice}

julia> @code_typed get_device((x, x, x))
CodeInfo(
1 ─      nothing::Nothing
│   %2 = invoke LuxDeviceUtils._get_device(x::Tuple{CuArray{Float32, 2, CUDA.DeviceMemory}, CuArray{Float32, 2, CUDA.DeviceMemory}, CuArray{Float32, 2, CUDA.DeviceMemory}})::LuxCUDADevice{CuDevice}
└──      return %2
) => LuxCUDADevice{CuDevice}

julia> @benchmark get_device_type(($x, $x, ($x, $x)))
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.396 ns … 39.530 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.444 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.528 ns ±  0.640 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              ▃█ ▅         ▁ ▅                
  ▃▁▃▁▁▁▁▁▁▁▁▁▂▁▅▁▄▁▁▁▁▁▁▁▂▁▇▁██▁█▁▁▁▁▁▁▁▂▁█▁█▁▆▁▁▁▁▁▁▁▂▁▃▁▂ ▃
  1.4 ns         Histogram: frequency by time        3.35 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark get_device(($x, $x, ($x, $x)))
BenchmarkTools.Trial: 10000 samples with 303 evaluations.
 Range (min … max):  274.756 ns …  1.738 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     300.802 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   310.144 ns ± 41.508 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▆▇▇▆▇██▇▇▆▅▄▃▃▂▂▂▂▁▁▁  ▁                                    ▃
  █████████████████████████████▇██▇█▇▇███▇▇▆▆▆▆▆▆▆▅▆▆▄▆▆▇▆▇▇▇▇ █
  275 ns        Histogram: log(frequency) by time       486 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
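
In the output above, `@code_typed get_device_type((x, x, x))` reduces to a constant `return LuxCUDADevice`, while `get_device` still goes through an `invoke` of `_get_device` at run time; the medians are about 2.4 ns versus about 300 ns. The sketch below illustrates one way a device *type* query can depend only on argument types, so the compiler has a chance to fold it away. It is not the LuxDeviceUtils implementation, which this PR page does not show: every name in it (`ToyCPUDevice`, `ToyGPUDevice`, `ToyGPUArray`, `toy_device_type`, `combine_device_types`) is hypothetical, and it needs no GPU or packages.

```julia
# Hypothetical stand-ins for illustration only; NOT the LuxDeviceUtils types.
struct ToyCPUDevice end
struct ToyGPUDevice end

# A toy wrapper that plays the role of a GPU array, so this runs without CUDA.
struct ToyGPUArray{T, N}
    data::Array{T, N}
end

# Leaf rules: the answer depends only on the argument's type, never on its contents.
toy_device_type(::Array) = ToyCPUDevice
toy_device_type(::ToyGPUArray) = ToyGPUDevice

# Containers: recurse into each element, then merge the per-element answers.
toy_device_type(x::Tuple) = combine_device_types(map(toy_device_type, x)...)

# Merge rule (an assumption of this sketch): all leaves must agree on one device type.
combine_device_types(T::Type) = T
combine_device_types(T::Type, S::Type, rest::Type...) =
    combine_device_types(_combine2(T, S), rest...)

_combine2(::Type{T}, ::Type{T}) where {T} = T
_combine2(::Type{T}, ::Type{S}) where {T, S} =
    throw(ArgumentError("mixed device types: $T and $S"))

# Usage, mirroring the nested tuple benchmarked above.
x = ToyGPUArray(rand(Float32, 10, 10))
toy_device_type((x, x, (x, x)))  # ToyGPUDevice
```

Because no method inspects array contents or asks a runtime for the active device, the result is a function of the input types alone. Whether a given Julia version actually folds such a call to a constant is best checked with `@code_typed`, as done above for the real functions.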

avik-pal force-pushed the ap/simple_getdevice branch 2 times, most recently from 5db42f2 to b9bbe69 on July 13, 2024 02:15
avik-pal merged commit d3a84fb into main on Jul 13, 2024
18 of 20 checks passed
avik-pal deleted the ap/simple_getdevice branch on July 13, 2024 03:10