Adapt to GPUArrays.jl changes. #519

Merged
merged 2 commits into from
Jan 16, 2025

maleadt (Member) commented Jan 16, 2025

No description provided.

Review comment on src/array.jl (outdated, resolved)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
maleadt merged commit 6c2df1f into main on Jan 16, 2025
6 of 7 checks passed
maleadt deleted the tb/gpuarrays branch on January 16, 2025 15:14
github-actions bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: dd608e5 | Previous: 6c2df1f | Ratio |
|---|---|---|---|
| private array/construct | 24253.5 ns | 24825 ns | 0.98 |
| private array/broadcast | 457792 ns | 462208 ns | 0.99 |
| private array/random/randn/Float32 | 798583 ns | 832125 ns | 0.96 |
| private array/random/randn!/Float32 | 630375 ns | 655625 ns | 0.96 |
| private array/random/rand!/Int64 | 570417 ns | 579666.5 ns | 0.98 |
| private array/random/rand!/Float32 | 595500 ns | 609417 ns | 0.98 |
| private array/random/rand/Int64 | 779145.5 ns | 762375 ns | 1.02 |
| private array/random/rand/Float32 | 595375 ns | 605000 ns | 0.98 |
| private array/copyto!/gpu_to_gpu | 657166 ns | 629542 ns | 1.04 |
| private array/copyto!/cpu_to_gpu | 664375 ns | 827250 ns | 0.80 |
| private array/copyto!/gpu_to_cpu | 815208 ns | 671937.5 ns | 1.21 |
| private array/accumulate/1d | 1336042 ns | 1342208 ns | 1.00 |
| private array/accumulate/2d | 1395208 ns | 1396542 ns | 1.00 |
| private array/iteration/findall/int | 2089333 ns | 2092604 ns | 1.00 |
| private array/iteration/findall/bool | 1790459 ns | 1827500 ns | 0.98 |
| private array/iteration/findfirst/int | 1689916 ns | 1686875 ns | 1.00 |
| private array/iteration/findfirst/bool | 1657458 ns | 1668666.5 ns | 0.99 |
| private array/iteration/scalar | 3889042 ns | 3905125 ns | 1.00 |
| private array/iteration/logical | 3197062.5 ns | 3216917 ns | 0.99 |
| private array/iteration/findmin/1d | 1769833 ns | 1759791 ns | 1.01 |
| private array/iteration/findmin/2d | 1350666 ns | 1346416.5 ns | 1.00 |
| private array/reductions/reduce/1d | 1044354 ns | 1044167 ns | 1.00 |
| private array/reductions/reduce/2d | 655791 ns | 666229 ns | 0.98 |
| private array/reductions/mapreduce/1d | 1047625 ns | 1052875 ns | 1.00 |
| private array/reductions/mapreduce/2d | 657583 ns | 663500 ns | 0.99 |
| private array/permutedims/4d | 2549999.5 ns | 2572208 ns | 0.99 |
| private array/permutedims/2d | 1028270.5 ns | 1033625 ns | 0.99 |
| private array/permutedims/3d | 1595667 ns | 1594249.5 ns | 1.00 |
| private array/copy | 580625 ns | 570375 ns | 1.02 |
| latency/precompile | 5771349667 ns | 5771935000 ns | 1.00 |
| latency/ttfp | 3037382958 ns | 3043459271 ns | 1.00 |
| latency/import | 1142069208 ns | 1141145166 ns | 1.00 |
| integration/metaldevrt | 690021 ns | 708416.5 ns | 0.97 |
| integration/byval/slices=1 | 1624041 ns | 1636104 ns | 0.99 |
| integration/byval/slices=3 | 9142104.5 ns | 11296375 ns | 0.81 |
| integration/byval/reference | 1601750 ns | 1554291.5 ns | 1.03 |
| integration/byval/slices=2 | 2719166 ns | 2595417 ns | 1.05 |
| kernel/indexing | 467667 ns | 459999.5 ns | 1.02 |
| kernel/indexing_checked | 469208 ns | 456125 ns | 1.03 |
| kernel/launch | 8208 ns | 9368 ns | 0.88 |
| metal/synchronization/stream | 14167 ns | 14791 ns | 0.96 |
| metal/synchronization/context | 15000 ns | 15042 ns | 1.00 |
| shared array/construct | 25058.4 ns | 23950 ns | 1.05 |
| shared array/broadcast | 461437.5 ns | 466167 ns | 0.99 |
| shared array/random/randn/Float32 | 806000 ns | 804750.5 ns | 1.00 |
| shared array/random/randn!/Float32 | 470000 ns | 642250 ns | 0.73 |
| shared array/random/rand!/Int64 | 567125 ns | 564229.5 ns | 1.01 |
| shared array/random/rand!/Float32 | 593125 ns | 598708 ns | 0.99 |
| shared array/random/rand/Int64 | 809709 ns | 798792 ns | 1.01 |
| shared array/random/rand/Float32 | 605833 ns | 587750 ns | 1.03 |
| shared array/copyto!/gpu_to_gpu | 83791 ns | 79458.5 ns | 1.05 |
| shared array/copyto!/cpu_to_gpu | 81833 ns | 78834 ns | 1.04 |
| shared array/copyto!/gpu_to_cpu | 82583 ns | 78041 ns | 1.06 |
| shared array/accumulate/1d | 1341583 ns | 1375584 ns | 0.98 |
| shared array/accumulate/2d | 1387833 ns | 1403041 ns | 0.99 |
| shared array/iteration/findall/int | 1840917 ns | 1863583 ns | 0.99 |
| shared array/iteration/findall/bool | 1662667 ns | 1621021 ns | 1.03 |
| shared array/iteration/findfirst/int | 1398917 ns | 1415917 ns | 0.99 |
| shared array/iteration/findfirst/bool | 1368083 ns | 1381895.5 ns | 0.99 |
| shared array/iteration/scalar | 152937.5 ns | 161458 ns | 0.95 |
| shared array/iteration/logical | 2971354 ns | 3022416 ns | 0.98 |
| shared array/iteration/findmin/1d | 1466708 ns | 1476208 ns | 0.99 |
| shared array/iteration/findmin/2d | 1360583 ns | 1359667 ns | 1.00 |
| shared array/reductions/reduce/1d | 727020.5 ns | 725771 ns | 1.00 |
| shared array/reductions/reduce/2d | 665312.5 ns | 675708 ns | 0.98 |
| shared array/reductions/mapreduce/1d | 748708 ns | 743041 ns | 1.01 |
| shared array/reductions/mapreduce/2d | 665250 ns | 680375 ns | 0.98 |
| shared array/permutedims/4d | 2531083.5 ns | 2571208 ns | 0.98 |
| shared array/permutedims/2d | 1028750 ns | 1041312.5 ns | 0.99 |
| shared array/permutedims/3d | 1599292 ns | 1596208 ns | 1.00 |
| shared array/copy | 249083 ns | 243562 ns | 1.02 |
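
For readers checking the numbers: the Ratio column appears to be the current time divided by the previous time, so values above 1.00 mean the current commit was slower on that benchmark. The sketch below recomputes a few rows from the table. The helper names and the 1.2 threshold are illustrative assumptions, not part of the benchmark workflow's configuration.

```python
# Minimal sketch: recompute the Ratio column for a few rows of the table.
# `ratio`, `slower_than`, and the 1.2 threshold are hypothetical names and
# values chosen for illustration only.

def ratio(current_ns: float, previous_ns: float) -> float:
    """Current time over previous time; above 1.0 means slower now."""
    return current_ns / previous_ns


def slower_than(rows, threshold):
    """Return the names of benchmarks whose ratio meets or exceeds threshold."""
    return [name for name, cur, prev in rows if ratio(cur, prev) >= threshold]


# (name, current ns, previous ns), copied from the table.
rows = [
    ("private array/copyto!/cpu_to_gpu", 664375, 827250),
    ("private array/copyto!/gpu_to_cpu", 815208, 671937.5),
    ("integration/byval/slices=3", 9142104.5, 11296375),
    ("shared array/random/randn!/Float32", 470000, 642250),
]

for name, cur, prev in rows:
    print(f"{name}: {ratio(cur, prev):.2f}")

print(slower_than(rows, threshold=1.2))
```

Single-run timings like these are noisy, so a ratio near a cutoff is better read as a prompt to rerun the benchmark than as a confirmed regression.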

This comment was automatically generated by workflow using github-action-benchmark.
