[Feature] NonTensor batched arg #2816

Open
wants to merge 3 commits into
base: gh/vmoens/94/base

[ghstack-poisoned]
pytorch-bot bot commented Feb 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2816

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit 63d5d93 with merge base 8c9dc05 (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Feb 28, 2025
@vmoens added the `enhancement` label (new feature or request) on Feb 28, 2025
github-actions bot commented Feb 28, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}7$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.6186s | 0.5268s | 1.8982 Ops/s | 1.9473 Ops/s | $\color{#d91a1a}-2.52\%$ |
| test_transformed | 1.1242s | 1.0333s | 0.9678 Ops/s | 0.9521 Ops/s | $\color{#35bf28}+1.65\%$ |
| test_serial | 1.6340s | 1.5364s | 0.6509 Ops/s | 0.6477 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_parallel | 1.3683s | 1.2843s | 0.7786 Ops/s | 0.7783 Ops/s | $\color{#35bf28}+0.05\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2371ms | 29.7004μs | 33.6696 KOps/s | 32.9516 KOps/s | $\color{#35bf28}+2.18\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 47.2580μs | 17.5072μs | 57.1193 KOps/s | 55.8437 KOps/s | $\color{#35bf28}+2.28\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 45.7950μs | 16.8247μs | 59.4363 KOps/s | 59.2652 KOps/s | $\color{#35bf28}+0.29\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 30.9880μs | 9.8638μs | 101.3805 KOps/s | 100.2301 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 71.9030μs | 32.0105μs | 31.2397 KOps/s | 30.9994 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 42.6200μs | 19.4852μs | 51.3209 KOps/s | 50.1528 KOps/s | $\color{#35bf28}+2.33\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 52.5680μs | 18.9116μs | 52.8776 KOps/s | 53.0712 KOps/s | $\color{#d91a1a}-0.36\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 33.2310μs | 11.8705μs | 84.2428 KOps/s | 84.9323 KOps/s | $\color{#d91a1a}-0.81\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 79.3180μs | 33.7389μs | 29.6393 KOps/s | 29.4261 KOps/s | $\color{#35bf28}+0.72\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 53.4090μs | 21.4838μs | 46.5467 KOps/s | 46.3542 KOps/s | $\color{#35bf28}+0.42\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 48.7910μs | 18.7775μs | 53.2551 KOps/s | 53.0081 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 59.2000μs | 11.8937μs | 84.0784 KOps/s | 84.0754 KOps/s | $+0.00\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 66.7650μs | 35.5917μs | 28.0964 KOps/s | 28.1957 KOps/s | $\color{#d91a1a}-0.35\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 0.5976ms | 23.1119μs | 43.2677 KOps/s | 43.0164 KOps/s | $\color{#35bf28}+0.58\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 87.2050μs | 20.4555μs | 48.8867 KOps/s | 48.5889 KOps/s | $\color{#35bf28}+0.61\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 37.5600μs | 13.6378μs | 73.3256 KOps/s | 73.7563 KOps/s | $\color{#d91a1a}-0.58\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 76.5120μs | 33.8681μs | 29.5263 KOps/s | 29.3720 KOps/s | $\color{#35bf28}+0.53\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 56.7160μs | 21.4442μs | 46.6326 KOps/s | 46.3356 KOps/s | $\color{#35bf28}+0.64\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 68.1380μs | 21.3009μs | 46.9464 KOps/s | 44.9575 KOps/s | $\color{#35bf28}+4.42\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 35.2350μs | 13.1501μs | 76.0452 KOps/s | 76.2898 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 68.3170μs | 35.7323μs | 27.9859 KOps/s | 27.9394 KOps/s | $\color{#35bf28}+0.17\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 61.2340μs | 23.2299μs | 43.0479 KOps/s | 43.2896 KOps/s | $\color{#d91a1a}-0.56\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.0224ms | 23.4462μs | 42.6508 KOps/s | 42.5812 KOps/s | $\color{#35bf28}+0.16\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 43.2000μs | 14.8975μs | 67.1254 KOps/s | 67.3777 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 96.8210μs | 37.4690μs | 26.6888 KOps/s | 26.6159 KOps/s | $\color{#35bf28}+0.27\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 58.3090μs | 24.9809μs | 40.0305 KOps/s | 39.9646 KOps/s | $\color{#35bf28}+0.17\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 57.3870μs | 23.1738μs | 43.1521 KOps/s | 42.8083 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 52.6480μs | 15.0589μs | 66.4057 KOps/s | 66.6191 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 86.9520μs | 39.2217μs | 25.4961 KOps/s | 25.6162 KOps/s | $\color{#d91a1a}-0.47\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 60.1220μs | 26.5931μs | 37.6037 KOps/s | 37.4271 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 0.6077ms | 24.7040μs | 40.4793 KOps/s | 40.0812 KOps/s | $\color{#35bf28}+0.99\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 50.6040μs | 16.6762μs | 59.9658 KOps/s | 60.9401 KOps/s | $\color{#d91a1a}-1.60\%$ |
| test_values[generalized_advantage_estimate-True-True] | 12.3804ms | 9.5965ms | 104.2048 Ops/s | 101.0846 Ops/s | $\color{#35bf28}+3.09\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 25.4731ms | 24.2578ms | 41.2239 Ops/s | 37.2949 Ops/s | $\textbf{\color{#35bf28}+10.54\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2336ms | 0.1805ms | 5.5417 KOps/s | 5.5315 KOps/s | $\color{#35bf28}+0.18\%$ |
| test_values[td1_return_estimate-False-False] | 24.6130ms | 24.0470ms | 41.5853 Ops/s | 39.2943 Ops/s | $\textbf{\color{#35bf28}+5.83\%}$ |
| test_values[vec_td1_return_estimate-False-False] | 26.4050ms | 24.4037ms | 40.9775 Ops/s | 37.1772 Ops/s | $\textbf{\color{#35bf28}+10.22\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 37.9497ms | 34.5907ms | 28.9095 Ops/s | 28.4968 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 26.5456ms | 24.3774ms | 41.0217 Ops/s | 37.3148 Ops/s | $\textbf{\color{#35bf28}+9.93\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 11.7474ms | 8.3133ms | 120.2888 Ops/s | 116.3130 Ops/s | $\color{#35bf28}+3.42\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2789ms | 1.9594ms | 510.3656 Ops/s | 524.8669 Ops/s | $\color{#d91a1a}-2.76\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4654ms | 0.3701ms | 2.7019 KOps/s | 2.6254 KOps/s | $\color{#35bf28}+2.92\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 44.9751ms | 41.0478ms | 24.3618 Ops/s | 21.9147 Ops/s | $\textbf{\color{#35bf28}+11.17\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.7093ms | 3.4521ms | 289.6812 Ops/s | 288.9536 Ops/s | $\color{#35bf28}+0.25\%$ |
| test_dqn_speed[False-None] | 1.9162ms | 1.4277ms | 700.4312 Ops/s | 682.3030 Ops/s | $\color{#35bf28}+2.66\%$ |
| test_dqn_speed[False-backward] | 2.0912ms | 1.9379ms | 516.0247 Ops/s | 503.7388 Ops/s | $\color{#35bf28}+2.44\%$ |
| test_dqn_speed[True-None] | 0.7500ms | 0.4835ms | 2.0681 KOps/s | 2.0156 KOps/s | $\color{#35bf28}+2.61\%$ |
| test_dqn_speed[True-backward] | 1.0057ms | 0.9246ms | 1.0815 KOps/s | 929.1667 Ops/s | $\textbf{\color{#35bf28}+16.39\%}$ |
| test_dqn_speed[reduce-overhead-None] | 0.6197ms | 0.4817ms | 2.0761 KOps/s | 2.0368 KOps/s | $\color{#35bf28}+1.93\%$ |
| test_dqn_speed[reduce-overhead-backward] | 1.0523ms | 0.9172ms | 1.0903 KOps/s | 1.0636 KOps/s | $\color{#35bf28}+2.51\%$ |
| test_ddpg_speed[False-None] | 3.6448ms | 2.9166ms | 342.8660 Ops/s | 332.6795 Ops/s | $\color{#35bf28}+3.06\%$ |
| test_ddpg_speed[False-backward] | 4.2221ms | 4.0878ms | 244.6274 Ops/s | 240.0193 Ops/s | $\color{#35bf28}+1.92\%$ |
| test_ddpg_speed[True-None] | 1.6729ms | 1.2412ms | 805.6599 Ops/s | 802.0525 Ops/s | $\color{#35bf28}+0.45\%$ |
| test_ddpg_speed[True-backward] | 2.2360ms | 2.1429ms | 466.6578 Ops/s | 464.4062 Ops/s | $\color{#35bf28}+0.48\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.5068ms | 1.2414ms | 805.5606 Ops/s | 802.4037 Ops/s | $\color{#35bf28}+0.39\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.8445ms | 2.1340ms | 468.6142 Ops/s | 464.4393 Ops/s | $\color{#35bf28}+0.90\%$ |
| test_sac_speed[False-None] | 8.8187ms | 8.2356ms | 121.4247 Ops/s | 121.0204 Ops/s | $\color{#35bf28}+0.33\%$ |
| test_sac_speed[False-backward] | 11.9508ms | 11.0403ms | 90.5769 Ops/s | 90.5853 Ops/s | $-0.01\%$ |
| test_sac_speed[True-None] | 2.6900ms | 2.1413ms | 467.0129 Ops/s | 474.6134 Ops/s | $\color{#d91a1a}-1.60\%$ |
| test_sac_speed[True-backward] | 4.7850ms | 3.8567ms | 259.2879 Ops/s | 253.1358 Ops/s | $\color{#35bf28}+2.43\%$ |
| test_sac_speed[reduce-overhead-None] | 2.7222ms | 2.1109ms | 473.7259 Ops/s | 468.9618 Ops/s | $\color{#35bf28}+1.02\%$ |
| test_sac_speed[reduce-overhead-backward] | 3.9727ms | 3.8213ms | 261.6906 Ops/s | 260.7916 Ops/s | $\color{#35bf28}+0.34\%$ |
| test_redq_speed[False-None] | 14.6509ms | 13.4213ms | 74.5085 Ops/s | 74.8823 Ops/s | $\color{#d91a1a}-0.50\%$ |
| test_redq_speed[False-backward] | 28.3027ms | 22.6880ms | 44.0761 Ops/s | 43.0104 Ops/s | $\color{#35bf28}+2.48\%$ |
| test_redq_speed[True-None] | 6.3521ms | 5.2437ms | 190.7055 Ops/s | 186.1137 Ops/s | $\color{#35bf28}+2.47\%$ |
| test_redq_speed[True-backward] | 13.5536ms | 13.1070ms | 76.2949 Ops/s | 76.4595 Ops/s | $\color{#d91a1a}-0.22\%$ |
| test_redq_speed[reduce-overhead-None] | 6.0980ms | 5.5957ms | 178.7078 Ops/s | 171.9576 Ops/s | $\color{#35bf28}+3.93\%$ |
| test_redq_speed[reduce-overhead-backward] | 13.3563ms | 12.8937ms | 77.5573 Ops/s | 70.9739 Ops/s | $\textbf{\color{#35bf28}+9.28\%}$ |
| test_redq_deprec_speed[False-None] | 15.3611ms | 13.3690ms | 74.7997 Ops/s | 72.2310 Ops/s | $\color{#35bf28}+3.56\%$ |
| test_redq_deprec_speed[False-backward] | 21.2725ms | 19.3182ms | 51.7646 Ops/s | 51.2036 Ops/s | $\color{#35bf28}+1.10\%$ |
| test_redq_deprec_speed[True-None] | 4.7818ms | 3.9530ms | 252.9722 Ops/s | 241.7892 Ops/s | $\color{#35bf28}+4.63\%$ |
| test_redq_deprec_speed[True-backward] | 9.4646ms | 8.5323ms | 117.2023 Ops/s | 113.2892 Ops/s | $\color{#35bf28}+3.45\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 4.9801ms | 3.9537ms | 252.9266 Ops/s | 249.3564 Ops/s | $\color{#35bf28}+1.43\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.9693ms | 8.9711ms | 111.4693 Ops/s | 111.1683 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_td3_speed[False-None] | 8.6132ms | 8.2464ms | 121.2651 Ops/s | 118.4092 Ops/s | $\color{#35bf28}+2.41\%$ |
| test_td3_speed[False-backward] | 11.4331ms | 10.7851ms | 92.7208 Ops/s | 90.8561 Ops/s | $\color{#35bf28}+2.05\%$ |
| test_td3_speed[True-None] | 1.9341ms | 1.7748ms | 563.4539 Ops/s | 540.0170 Ops/s | $\color{#35bf28}+4.34\%$ |
| test_td3_speed[True-backward] | 3.8924ms | 3.4227ms | 292.1705 Ops/s | 277.4049 Ops/s | $\textbf{\color{#35bf28}+5.32\%}$ |
| test_td3_speed[reduce-overhead-None] | 2.0060ms | 1.7757ms | 563.1469 Ops/s | 540.1182 Ops/s | $\color{#35bf28}+4.26\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.7241ms | 3.4225ms | 292.1836 Ops/s | 273.7125 Ops/s | $\textbf{\color{#35bf28}+6.75\%}$ |
| test_cql_speed[False-None] | 39.8232ms | 37.4130ms | 26.7287 Ops/s | 26.7397 Ops/s | $\color{#d91a1a}-0.04\%$ |
| test_cql_speed[False-backward] | 70.4726ms | 49.8844ms | 20.0463 Ops/s | 21.0875 Ops/s | $\color{#d91a1a}-4.94\%$ |
| test_cql_speed[True-None] | 17.5359ms | 16.2553ms | 61.5183 Ops/s | 60.3151 Ops/s | $\color{#35bf28}+1.99\%$ |
| test_cql_speed[True-backward] | 24.3807ms | 23.4697ms | 42.6081 Ops/s | 42.1221 Ops/s | $\color{#35bf28}+1.15\%$ |
| test_cql_speed[reduce-overhead-None] | 17.5610ms | 16.4091ms | 60.9419 Ops/s | 60.3142 Ops/s | $\color{#35bf28}+1.04\%$ |
| test_cql_speed[reduce-overhead-backward] | 28.0830ms | 23.8837ms | 41.8696 Ops/s | 42.1542 Ops/s | $\color{#d91a1a}-0.68\%$ |
| test_a2c_speed[False-None] | 8.4533ms | 7.2940ms | 137.0990 Ops/s | 131.8476 Ops/s | $\color{#35bf28}+3.98\%$ |
| test_a2c_speed[False-backward] | 15.8320ms | 14.8973ms | 67.1261 Ops/s | 66.3935 Ops/s | $\color{#35bf28}+1.10\%$ |
| test_a2c_speed[True-None] | 4.6328ms | 3.7729ms | 265.0452 Ops/s | 263.2114 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_a2c_speed[True-backward] | 11.0203ms | 10.4447ms | 95.7425 Ops/s | 91.4669 Ops/s | $\color{#35bf28}+4.67\%$ |
| test_a2c_speed[reduce-overhead-None] | 4.1880ms | 3.7788ms | 264.6308 Ops/s | 261.6610 Ops/s | $\color{#35bf28}+1.14\%$ |
| test_a2c_speed[reduce-overhead-backward] | 11.4277ms | 10.4257ms | 95.9170 Ops/s | 95.8621 Ops/s | $\color{#35bf28}+0.06\%$ |
| test_ppo_speed[False-None] | 8.9144ms | 7.5194ms | 132.9897 Ops/s | 131.2601 Ops/s | $\color{#35bf28}+1.32\%$ |
| test_ppo_speed[False-backward] | 16.6080ms | 15.2116ms | 65.7391 Ops/s | 66.4534 Ops/s | $\color{#d91a1a}-1.07\%$ |
| test_ppo_speed[True-None] | 4.9529ms | 4.1195ms | 242.7487 Ops/s | 239.6518 Ops/s | $\color{#35bf28}+1.29\%$ |
| test_ppo_speed[True-backward] | 11.4607ms | 10.4894ms | 95.3346 Ops/s | 95.0902 Ops/s | $\color{#35bf28}+0.26\%$ |
| test_ppo_speed[reduce-overhead-None] | 5.4650ms | 4.4206ms | 226.2130 Ops/s | 240.5312 Ops/s | $\textbf{\color{#d91a1a}-5.95\%}$ |
| test_ppo_speed[reduce-overhead-backward] | 11.3656ms | 10.4934ms | 95.2983 Ops/s | 95.9117 Ops/s | $\color{#d91a1a}-0.64\%$ |
| test_reinforce_speed[False-None] | 7.7643ms | 6.7483ms | 148.1845 Ops/s | 147.8471 Ops/s | $\color{#35bf28}+0.23\%$ |
| test_reinforce_speed[False-backward] | 11.5175ms | 10.1849ms | 98.1847 Ops/s | 99.6143 Ops/s | $\color{#d91a1a}-1.44\%$ |
| test_reinforce_speed[True-None] | 3.7627ms | 3.1296ms | 319.5275 Ops/s | 314.9078 Ops/s | $\color{#35bf28}+1.47\%$ |
| test_reinforce_speed[True-backward] | 10.1774ms | 9.2521ms | 108.0835 Ops/s | 107.6932 Ops/s | $\color{#35bf28}+0.36\%$ |
| test_reinforce_speed[reduce-overhead-None] | 3.6742ms | 3.0717ms | 325.5514 Ops/s | 318.9451 Ops/s | $\color{#35bf28}+2.07\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 10.1413ms | 9.3427ms | 107.0356 Ops/s | 105.5142 Ops/s | $\color{#35bf28}+1.44\%$ |
| test_iql_speed[False-None] | 35.6386ms | 33.0217ms | 30.2831 Ops/s | 30.0444 Ops/s | $\color{#35bf28}+0.79\%$ |
| test_iql_speed[False-backward] | 55.2062ms | 46.5362ms | 21.4887 Ops/s | 21.2764 Ops/s | $\color{#35bf28}+1.00\%$ |
| test_iql_speed[True-None] | 12.9600ms | 11.7109ms | 85.3908 Ops/s | 85.5761 Ops/s | $\color{#d91a1a}-0.22\%$ |
| test_iql_speed[True-backward] | 24.4767ms | 23.0819ms | 43.3240 Ops/s | 42.6642 Ops/s | $\color{#35bf28}+1.55\%$ |
| test_iql_speed[reduce-overhead-None] | 12.6871ms | 11.7401ms | 85.1779 Ops/s | 83.7368 Ops/s | $\color{#35bf28}+1.72\%$ |
| test_iql_speed[reduce-overhead-backward] | 23.8716ms | 22.9712ms | 43.5327 Ops/s | 43.1530 Ops/s | $\color{#35bf28}+0.88\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.9265ms | 5.0463ms | 198.1652 Ops/s | 203.1492 Ops/s | $\color{#d91a1a}-2.45\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.3371ms | 0.5242ms | 1.9078 KOps/s | 1.9327 KOps/s | $\color{#d91a1a}-1.29\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7814ms | 0.5063ms | 1.9749 KOps/s | 1.9951 KOps/s | $\color{#d91a1a}-1.01\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.3716ms | 4.8842ms | 204.7400 Ops/s | 209.0092 Ops/s | $\color{#d91a1a}-2.04\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.8290ms | 0.5166ms | 1.9359 KOps/s | 1.9399 KOps/s | $\color{#d91a1a}-0.21\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7547ms | 0.4930ms | 2.0286 KOps/s | 2.0430 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.6124ms | 1.6654ms | 600.4639 Ops/s | 598.0698 Ops/s | $\color{#35bf28}+0.40\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.0652ms | 1.5746ms | 635.0935 Ops/s | 635.7219 Ops/s | $\color{#d91a1a}-0.10\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.1606ms | 4.8536ms | 206.0333 Ops/s | 206.0117 Ops/s | $\color{#35bf28}+0.01\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.9683ms | 0.6697ms | 1.4933 KOps/s | 1.5239 KOps/s | $\color{#d91a1a}-2.01\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8476ms | 0.6299ms | 1.5875 KOps/s | 1.5700 KOps/s | $\color{#35bf28}+1.11\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.5635ms | 4.8608ms | 205.7293 Ops/s | 208.5110 Ops/s | $\color{#d91a1a}-1.33\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.0953ms | 0.5308ms | 1.8838 KOps/s | 1.8913 KOps/s | $\color{#d91a1a}-0.39\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7145ms | 0.5035ms | 1.9862 KOps/s | 1.9593 KOps/s | $\color{#35bf28}+1.37\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.5400ms | 4.8179ms | 207.5613 Ops/s | 210.9792 Ops/s | $\color{#d91a1a}-1.62\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1408ms | 0.5184ms | 1.9291 KOps/s | 1.9157 KOps/s | $\color{#35bf28}+0.70\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7498ms | 0.4953ms | 2.0191 KOps/s | 2.0134 KOps/s | $\color{#35bf28}+0.28\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.5876ms | 4.9966ms | 200.1359 Ops/s | 205.7112 Ops/s | $\color{#d91a1a}-2.71\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.5281s | 1.4459ms | 691.6012 Ops/s | 1.5178 KOps/s | $\textbf{\color{#d91a1a}-54.43\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8229ms | 0.6362ms | 1.5717 KOps/s | 1.5739 KOps/s | $\color{#d91a1a}-0.14\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 7.1506ms | 4.3679ms | 228.9436 Ops/s | 245.2088 Ops/s | $\textbf{\color{#d91a1a}-6.63\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 5.3043ms | 2.2934ms | 436.0410 Ops/s | 433.5663 Ops/s | $\color{#35bf28}+0.57\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 6.8616ms | 1.4351ms | 696.8286 Ops/s | 718.2718 Ops/s | $\color{#d91a1a}-2.99\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.4551ms | 4.2970ms | 232.7221 Ops/s | 238.3421 Ops/s | $\color{#d91a1a}-2.36\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 6.4209ms | 2.3206ms | 430.9190 Ops/s | 456.3277 Ops/s | $\textbf{\color{#d91a1a}-5.57\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.1347ms | 1.6783ms | 595.8367 Ops/s | 753.6451 Ops/s | $\textbf{\color{#d91a1a}-20.94\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.4408s | 13.2787ms | 75.3086 Ops/s | 31.5204 Ops/s | $\textbf{\color{#35bf28}+138.92\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 8.9768ms | 2.5751ms | 388.3386 Ops/s | 386.4882 Ops/s | $\color{#35bf28}+0.48\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 5.1332ms | 1.5946ms | 627.1082 Ops/s | 622.1540 Ops/s | $\color{#35bf28}+0.80\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 13.1793ms | 11.6246ms | 86.0247 Ops/s | 80.8298 Ops/s | $\textbf{\color{#35bf28}+6.43\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 17.0167ms | 15.5251ms | 64.4118 Ops/s | 68.2939 Ops/s | $\textbf{\color{#d91a1a}-5.68\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 21.9180ms | 20.6982ms | 48.3133 Ops/s | 46.9742 Ops/s | $\color{#35bf28}+2.85\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 17.8077ms | 15.8761ms | 62.9877 Ops/s | 67.2339 Ops/s | $\textbf{\color{#d91a1a}-6.32\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.4136ms | 20.6739ms | 48.3703 Ops/s | 46.9643 Ops/s | $\color{#35bf28}+2.99\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 18.0435ms | 16.7878ms | 59.5672 Ops/s | 61.1192 Ops/s | $\color{#d91a1a}-2.54\%$ |
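
The Change column above appears to be the relative difference between the throughput measured on this PR (Ops) and the throughput on repo HEAD, after converting both to the same unit. The benchmark workflow's own code is not shown in this thread, so the following is only a minimal sketch under that assumption; `UNIT_SCALE` and `relative_change` are hypothetical names introduced for illustration.

```python
# Hypothetical helper, not taken from the benchmark workflow:
# it only reproduces the arithmetic implied by the table values.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def relative_change(ops: float, ops_unit: str, head: float, head_unit: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    pr_ops = ops * UNIT_SCALE[ops_unit]      # normalize PR value to Ops/s
    head_ops = head * UNIT_SCALE[head_unit]  # normalize HEAD value to Ops/s
    return (pr_ops - head_ops) / head_ops * 100.0


# test_simple: the table reports -2.52%
print(f"{relative_change(1.8982, 'Ops/s', 1.9473, 'Ops/s'):+.2f}%")
# test_dqn_speed[True-backward] mixes units: the table reports +16.39%
print(f"{relative_change(1.0815, 'KOps/s', 929.1667, 'Ops/s'):+.2f}%")
```

Because the table values are rounded to four decimals, recomputed percentages may differ from the reported ones in the last digit.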

github-actions bot commented Feb 28, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}14$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.8977s | 0.8102s | 1.2342 Ops/s | 1.2276 Ops/s | $\color{#35bf28}+0.54\%$ |
| test_transformed | 1.5141s | 1.4322s | 0.6982 Ops/s | 0.6955 Ops/s | $\color{#35bf28}+0.40\%$ |
| test_serial | 2.3843s | 2.2914s | 0.4364 Ops/s | 0.4344 Ops/s | $\color{#35bf28}+0.46\%$ |
| test_parallel | 1.9341s | 1.8554s | 0.5390 Ops/s | 0.5435 Ops/s | $\color{#d91a1a}-0.83\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2195ms | 38.4691μs | 25.9949 KOps/s | 26.0362 KOps/s | $\color{#d91a1a}-0.16\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 0.1218ms | 22.4867μs | 44.4707 KOps/s | 44.4145 KOps/s | $\color{#35bf28}+0.13\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 61.6410μs | 21.0637μs | 47.4751 KOps/s | 45.6948 KOps/s | $\color{#35bf28}+3.90\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 95.8420μs | 12.4551μs | 80.2884 KOps/s | 80.4293 KOps/s | $\color{#d91a1a}-0.18\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 0.1511ms | 40.3277μs | 24.7968 KOps/s | 24.1451 KOps/s | $\color{#35bf28}+2.70\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 0.2248ms | 24.4169μs | 40.9552 KOps/s | 40.3632 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 0.2107ms | 23.3298μs | 42.8637 KOps/s | 41.2605 KOps/s | $\color{#35bf28}+3.89\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 47.1510μs | 14.6231μs | 68.3850 KOps/s | 67.3522 KOps/s | $\color{#35bf28}+1.53\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 78.7720μs | 42.4539μs | 23.5550 KOps/s | 22.9870 KOps/s | $\color{#35bf28}+2.47\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 57.9910μs | 27.3633μs | 36.5453 KOps/s | 36.9114 KOps/s | $\color{#d91a1a}-0.99\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 0.1198ms | 23.9874μs | 41.6886 KOps/s | 41.4395 KOps/s | $\color{#35bf28}+0.60\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 49.1710μs | 14.8313μs | 67.4248 KOps/s | 68.5002 KOps/s | $\color{#d91a1a}-1.57\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 0.1430ms | 44.9898μs | 22.2273 KOps/s | 21.9237 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 64.3510μs | 28.6692μs | 34.8806 KOps/s | 34.6093 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 61.9310μs | 25.4460μs | 39.2988 KOps/s | 38.7606 KOps/s | $\color{#35bf28}+1.39\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 51.7310μs | 16.9315μs | 59.0614 KOps/s | 58.5149 KOps/s | $\color{#35bf28}+0.93\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 86.4320μs | 42.7337μs | 23.4007 KOps/s | 22.9876 KOps/s | $\color{#35bf28}+1.80\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 61.1310μs | 26.7032μs | 37.4487 KOps/s | 36.9460 KOps/s | $\color{#35bf28}+1.36\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 63.6710μs | 28.1413μs | 35.5350 KOps/s | 35.5515 KOps/s | $\color{#d91a1a}-0.05\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 53.8710μs | 16.5161μs | 60.5470 KOps/s | 60.0220 KOps/s | $\color{#35bf28}+0.87\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 0.1749ms | 45.3144μs | 22.0681 KOps/s | 21.9033 KOps/s | $\color{#35bf28}+0.75\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 66.7910μs | 29.3032μs | 34.1259 KOps/s | 34.4386 KOps/s | $\color{#d91a1a}-0.91\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.2409ms | 29.9021μs | 33.4424 KOps/s | 33.0858 KOps/s | $\color{#35bf28}+1.08\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 54.4710μs | 18.6570μs | 53.5993 KOps/s | 52.7631 KOps/s | $\color{#35bf28}+1.58\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 0.1126ms | 47.1604μs | 21.2042 KOps/s | 20.7023 KOps/s | $\color{#35bf28}+2.42\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 66.9210μs | 31.1944μs | 32.0570 KOps/s | 31.6273 KOps/s | $\color{#35bf28}+1.36\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 0.1548ms | 29.9648μs | 33.3725 KOps/s | 34.1558 KOps/s | $\color{#d91a1a}-2.29\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 51.0610μs | 18.5619μs | 53.8739 KOps/s | 53.8057 KOps/s | $\color{#35bf28}+0.13\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 88.6910μs | 48.1228μs | 20.7802 KOps/s | 20.6245 KOps/s | $\color{#35bf28}+0.75\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 67.8510μs | 33.0291μs | 30.2763 KOps/s | 29.7776 KOps/s | $\color{#35bf28}+1.67\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 67.6210μs | 31.0115μs | 32.2461 KOps/s | 31.4375 KOps/s | $\color{#35bf28}+2.57\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 76.9110μs | 20.3612μs | 49.1131 KOps/s | 47.4645 KOps/s | $\color{#35bf28}+3.47\%$ |
| test_values[generalized_advantage_estimate-True-True] | 24.1255ms | 23.1152ms | 43.2616 Ops/s | 43.6315 Ops/s | $\color{#d91a1a}-0.85\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 96.8377ms | 2.8132ms | 355.4618 Ops/s | 317.5461 Ops/s | $\textbf{\color{#35bf28}+11.94\%}$ |
| test_values[td0_return_estimate-False-False] | 0.1014ms | 75.7467μs | 13.2019 KOps/s | 13.2485 KOps/s | $\color{#d91a1a}-0.35\%$ |
| test_values[td1_return_estimate-False-False] | 52.9664ms | 51.4850ms | 19.4231 Ops/s | 19.1585 Ops/s | $\color{#35bf28}+1.38\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.3151ms | 1.0572ms | 945.9265 Ops/s | 950.8560 Ops/s | $\color{#d91a1a}-0.52\%$ |
| test_values[td_lambda_return_estimate-True-False] | 86.2479ms | 82.4220ms | 12.1327 Ops/s | 12.3017 Ops/s | $\color{#d91a1a}-1.37\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3058ms | 1.0484ms | 953.8606 Ops/s | 952.6039 Ops/s | $\color{#35bf28}+0.13\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.2636ms | 22.9385ms | 43.5949 Ops/s | 43.8392 Ops/s | $\color{#d91a1a}-0.56\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0004ms | 0.7224ms | 1.3842 KOps/s | 1.3954 KOps/s | $\color{#d91a1a}-0.80\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8170ms | 0.6403ms | 1.5618 KOps/s | 1.5762 KOps/s | $\color{#d91a1a}-0.92\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6073ms | 1.4554ms | 687.0946 Ops/s | 686.8111 Ops/s | $\color{#35bf28}+0.04\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8335ms | 0.6482ms | 1.5427 KOps/s | 1.5399 KOps/s | $\color{#35bf28}+0.18\%$ |
| test_dqn_speed[False-None] | 1.6195ms | 1.4497ms | 689.7839 Ops/s | 689.2850 Ops/s | $\color{#35bf28}+0.07\%$ |
| test_dqn_speed[False-backward] | 2.1675ms | 2.0364ms | 491.0704 Ops/s | 490.9889 Ops/s | $\color{#35bf28}+0.02\%$ |
| test_dqn_speed[True-None] | 0.6889ms | 0.5399ms | 1.8522 KOps/s | 1.8545 KOps/s | $\color{#d91a1a}-0.13\%$ |
| test_dqn_speed[True-backward] | 1.2579ms | 1.1935ms | 837.8734 Ops/s | 826.6923 Ops/s | $\color{#35bf28}+1.35\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.7059ms | 0.5502ms | 1.8176 KOps/s | 1.8002 KOps/s | $\color{#35bf28}+0.97\%$ |
| test_dqn_speed[reduce-overhead-backward] | 1.1762ms | 1.0465ms | 955.6048 Ops/s | 1.0543 KOps/s | $\textbf{\color{#d91a1a}-9.36\%}$ |
| test_ddpg_speed[False-None] | 3.1859ms | 2.7122ms | 368.7017 Ops/s | 360.9600 Ops/s | $\color{#35bf28}+2.14\%$ |
| test_ddpg_speed[False-backward] | 4.5025ms | 4.0320ms | 248.0159 Ops/s | 250.6203 Ops/s | $\color{#d91a1a}-1.04\%$ |
| test_ddpg_speed[True-None] | 1.5428ms | 1.2861ms | 777.5253 Ops/s | 764.3794 Ops/s | $\color{#35bf28}+1.72\%$ |
| test_ddpg_speed[True-backward] | 2.8165ms | 2.4775ms | 403.6260 Ops/s | 419.1089 Ops/s | $\color{#d91a1a}-3.69\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.5618ms | 1.2985ms | 770.1250 Ops/s | 759.9179 Ops/s | $\color{#35bf28}+1.34\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.2451ms | 1.9757ms | 506.1575 Ops/s | 538.5578 Ops/s | $\textbf{\color{#d91a1a}-6.02\%}$ |
| test_sac_speed[False-None] | 8.2001ms | 7.7332ms | 129.3118 Ops/s | 127.3353 Ops/s | $\color{#35bf28}+1.55\%$ |
| test_sac_speed[False-backward] | 11.3436ms | 10.8181ms | 92.4378 Ops/s | 93.8621 Ops/s | $\color{#d91a1a}-1.52\%$ |
| test_sac_speed[True-None] | 2.0569ms | 1.7784ms | 562.2890 Ops/s | 538.9671 Ops/s | $\color{#35bf28}+4.33\%$ |
| test_sac_speed[True-backward] | 3.7567ms | 3.6201ms | 276.2374 Ops/s | 273.4590 Ops/s | $\color{#35bf28}+1.02\%$ |
| test_sac_speed[reduce-overhead-None] | 21.4458ms | 12.0327ms | 83.1065 Ops/s | 82.3418 Ops/s | $\color{#35bf28}+0.93\%$ |
| test_sac_speed[reduce-overhead-backward] | 1.8618ms | 1.7257ms | 579.4917 Ops/s | 616.8634 Ops/s | $\textbf{\color{#d91a1a}-6.06\%}$ |
| test_redq_speed[False-None] | 7.8294ms | 7.3642ms | 135.7915 Ops/s | 129.7098 Ops/s | $\color{#35bf28}+4.69\%$ |
| test_redq_speed[False-backward] | 12.1437ms | 11.3693ms | 87.9560 Ops/s | 89.0547 Ops/s | $\color{#d91a1a}-1.23\%$ |
| test_redq_speed[True-None] | 2.7416ms | 2.2857ms | 437.5094 Ops/s | 436.8532 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_redq_speed[True-backward] | 4.3744ms | 4.1233ms | 242.5247 Ops/s | 243.1138 Ops/s | $\color{#d91a1a}-0.24\%$ |
| test_redq_speed[reduce-overhead-None] | 2.6149ms | 2.2864ms | 437.3700 Ops/s | 435.2358 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_redq_speed[reduce-overhead-backward] | 4.3544ms | 4.0947ms | 244.2175 Ops/s | 250.7433 Ops/s | $\color{#d91a1a}-2.60\%$ |
| test_redq_deprec_speed[False-None] | 9.1894ms | 8.7946ms | 113.7066 Ops/s | 113.1495 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_redq_deprec_speed[False-backward] | 12.5487ms | 11.8418ms | 84.4469 Ops/s | 85.8736 Ops/s | $\color{#d91a1a}-1.66\%$ |
| test_redq_deprec_speed[True-None] | 2.8818ms | 2.5540ms | 391.5446 Ops/s | 387.4445 Ops/s | $\color{#35bf28}+1.06\%$ |
| test_redq_deprec_speed[True-backward] | 4.6634ms | 4.3577ms | 229.4804 Ops/s | 225.0350 Ops/s | $\color{#35bf28}+1.98\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 2.8738ms | 2.5724ms | 388.7380 Ops/s | 384.0824 Ops/s | $\color{#35bf28}+1.21\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.6303ms | 4.3530ms | 229.7245 Ops/s | 225.5815 Ops/s | $\color{#35bf28}+1.84\%$ |
| test_td3_speed[False-None] | 7.8047ms | 7.6951ms | 129.9536 Ops/s | 129.4591 Ops/s | $\color{#35bf28}+0.38\%$ |
| test_td3_speed[False-backward] | 10.8828ms | 10.1502ms | 98.5201 Ops/s | 98.0657 Ops/s | $\color{#35bf28}+0.46\%$ |
| test_td3_speed[True-None] | 1.6085ms | 1.5777ms | 633.8342 Ops/s | 625.1592 Ops/s | $\color{#35bf28}+1.39\%$ |
| test_td3_speed[True-backward] | 3.3819ms | 3.2280ms | 309.7904 Ops/s | 302.5371 Ops/s | $\color{#35bf28}+2.40\%$ |
| test_td3_speed[reduce-overhead-None] | 50.1407ms | 25.8578ms | 38.6731 Ops/s | 39.8741 Ops/s | $\color{#d91a1a}-3.01\%$ |
| test_td3_speed[reduce-overhead-backward] | 1.5960ms | 1.4343ms | 697.1941 Ops/s | 690.5156 Ops/s | $\color{#35bf28}+0.97\%$ |
| test_cql_speed[False-None] | 16.5489ms | 16.1695ms | 61.8447 Ops/s | 61.1025 Ops/s | $\color{#35bf28}+1.21\%$ |
| test_cql_speed[False-backward] | 21.9940ms | 21.4207ms | 46.6837 Ops/s | 46.2946 Ops/s | $\color{#35bf28}+0.84\%$ |
| test_cql_speed[True-None] | 3.4406ms | 3.1666ms | 315.7951 Ops/s | 315.4657 Ops/s | $\color{#35bf28}+0.10\%$ |
| test_cql_speed[True-backward] | 5.9164ms | 5.5109ms | 181.4576 Ops/s | 183.1623 Ops/s | $\color{#d91a1a}-0.93\%$ |
| test_cql_speed[reduce-overhead-None] | 22.4527ms | 13.2680ms | 75.3691 Ops/s | 75.2594 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_cql_speed[reduce-overhead-backward] | 2.0988ms | 1.9386ms | 515.8427 Ops/s | 508.5902 Ops/s | $\color{#35bf28}+1.43\%$ |
| test_a2c_speed[False-None] | 3.2915ms | 3.0472ms | 328.1651 Ops/s | 323.2590 Ops/s | $\color{#35bf28}+1.52\%$ |
| test_a2c_speed[False-backward] | 6.5771ms | 5.9999ms | 166.6708 Ops/s | 163.3346 Ops/s | $\color{#35bf28}+2.04\%$ |
| test_a2c_speed[True-None] | 1.5660ms | 1.3146ms | 760.6995 Ops/s | 759.3392 Ops/s | $\color{#35bf28}+0.18\%$ |
| test_a2c_speed[True-backward] | 3.1452ms | 2.9832ms | 335.2057 Ops/s | 330.6079 Ops/s | $\color{#35bf28}+1.39\%$ |
| test_a2c_speed[reduce-overhead-None] | 16.1191ms | 9.1146ms | 109.7138 Ops/s | 112.3263 Ops/s | $\color{#d91a1a}-2.33\%$ |
| test_a2c_speed[reduce-overhead-backward] | 1.7736ms | 1.5832ms | 631.6498 Ops/s | 628.3102 Ops/s | $\color{#35bf28}+0.53\%$ |
| test_ppo_speed[False-None] | 3.9125ms | 3.5489ms | 281.7788 Ops/s | 277.9469 Ops/s | $\color{#35bf28}+1.38\%$ |
| test_ppo_speed[False-backward] | 7.1008ms | 6.7200ms | 148.8089 Ops/s | 145.8049 Ops/s | $\color{#35bf28}+2.06\%$ |
| test_ppo_speed[True-None] | 1.6312ms | 1.3932ms | 717.7676 Ops/s | 716.1831 Ops/s | $\color{#35bf28}+0.22\%$ |
| test_ppo_speed[True-backward] | 3.3355ms | 3.1605ms | 316.4064 Ops/s | 312.5692 Ops/s | $\color{#35bf28}+1.23\%$ |
| test_ppo_speed[reduce-overhead-None] | 1.1108ms | 0.9540ms | 1.0482 KOps/s | 1.0636 KOps/s | $\color{#d91a1a}-1.45\%$ |
| test_ppo_speed[reduce-overhead-backward] | 1.6163ms | 1.4716ms | 679.5543 Ops/s | 636.6750 Ops/s | $\textbf{\color{#35bf28}+6.73\%}$ |
| test_reinforce_speed[False-None] | 2.3507ms | 2.1860ms | 457.4587 Ops/s | 451.4793 Ops/s | $\color{#35bf28}+1.32\%$ |
| test_reinforce_speed[False-backward] | 3.4904ms | 3.3213ms | 301.0899 Ops/s | 304.6123 Ops/s | $\color{#d91a1a}-1.16\%$ |
| test_reinforce_speed[True-None] | 1.4414ms | 1.2539ms | 797.4812 Ops/s | 779.8925 Ops/s | $\color{#35bf28}+2.26\%$ |
| test_reinforce_speed[True-backward] | 3.0020ms | 2.8661ms | 348.9013 Ops/s | 334.2550 Ops/s | $\color{#35bf28}+4.38\%$ |
| test_reinforce_speed[reduce-overhead-None] | 19.4072ms | 10.1208ms | 98.8067 Ops/s | 99.6409 Ops/s | $\color{#d91a1a}-0.84\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 1.6181ms | 1.4680ms | 681.1773 Ops/s | 613.8223 Ops/s | $\textbf{\color{#35bf28}+10.97\%}$ |
| test_iql_speed[False-None] | 9.3314ms | 8.9051ms | 112.2951 Ops/s | 110.2602 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_iql_speed[False-backward] | 12.7878ms | 12.3663ms | 80.8647 Ops/s | 78.1314 Ops/s | $\color{#35bf28}+3.50\%$ |
| test_iql_speed[True-None] | 2.4371ms | 2.1617ms | 462.5976 Ops/s | 452.0299 Ops/s | $\color{#35bf28}+2.34\%$ |
| test_iql_speed[True-backward] | 5.1525ms | 4.7960ms | 208.5049 Ops/s | 201.8928 Ops/s | $\color{#35bf28}+3.28\%$ |
| test_iql_speed[reduce-overhead-None] | 0.4836s | 12.7566ms | 78.3910 Ops/s | 89.9246 Ops/s | $\textbf{\color{#d91a1a}-12.83\%}$ |
| test_iql_speed[reduce-overhead-backward] | 2.2142ms | 1.9513ms | 512.4896 Ops/s | 524.1340 Ops/s | $\color{#d91a1a}-2.22\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.4418ms | 6.0439ms | 165.4548 Ops/s | 164.0422 Ops/s | $\color{#35bf28}+0.86\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6137ms | 0.2850ms | 3.5093 KOps/s | 3.8113 KOps/s | $\textbf{\color{#d91a1a}-7.93\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5487ms | 0.2879ms | 3.4734 KOps/s | 4.0957 KOps/s | $\textbf{\color{#d91a1a}-15.19\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1300ms | 5.7001ms | 175.4343 Ops/s | 171.8748 Ops/s | $\color{#35bf28}+2.07\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.1925ms | 0.3343ms | 2.9917 KOps/s | 3.5635 KOps/s | $\textbf{\color{#d91a1a}-16.04\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6631ms | 0.3422ms | 2.9223 KOps/s | 3.5306 KOps/s | $\textbf{\color{#d91a1a}-17.23\%}$ |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4304ms 1.2150ms 823.0664 Ops/s 817.8338 Ops/s $\color{#35bf28}+0.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3908ms 1.1339ms 881.9380 Ops/s 875.1711 Ops/s $\color{#35bf28}+0.77\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2821ms 5.8713ms 170.3186 Ops/s 167.3946 Ops/s $\color{#35bf28}+1.75\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0318ms 0.4030ms 2.4816 KOps/s 2.3561 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5955ms 0.3766ms 2.6555 KOps/s 2.3585 KOps/s $\textbf{\color{#35bf28}+12.59\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0619ms 5.7766ms 173.1126 Ops/s 170.0139 Ops/s $\color{#35bf28}+1.82\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5796ms 0.3302ms 3.0281 KOps/s 3.6930 KOps/s $\textbf{\color{#d91a1a}-18.00\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6019ms 0.3446ms 2.9023 KOps/s 4.1028 KOps/s $\textbf{\color{#d91a1a}-29.26\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2263ms 5.7121ms 175.0684 Ops/s 171.7718 Ops/s $\color{#35bf28}+1.92\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9979ms 0.3112ms 3.2135 KOps/s 3.0616 KOps/s $\color{#35bf28}+4.96\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4658ms 0.2404ms 4.1592 KOps/s 3.3786 KOps/s $\textbf{\color{#35bf28}+23.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2069ms 5.9144ms 169.0791 Ops/s 165.8325 Ops/s $\color{#35bf28}+1.96\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0923ms 0.4161ms 2.4030 KOps/s 2.2610 KOps/s $\textbf{\color{#35bf28}+6.28\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6761ms 0.4622ms 2.1635 KOps/s 2.5943 KOps/s $\textbf{\color{#d91a1a}-16.61\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9160ms 5.2949ms 188.8613 Ops/s 184.4249 Ops/s $\color{#35bf28}+2.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.1423ms 2.0537ms 486.9243 Ops/s 430.5079 Ops/s $\textbf{\color{#35bf28}+13.10\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.5642ms 1.1828ms 845.4212 Ops/s 835.3129 Ops/s $\color{#35bf28}+1.21\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4470s 14.1945ms 70.4500 Ops/s 184.9536 Ops/s $\textbf{\color{#d91a1a}-61.91\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.6950ms 2.0009ms 499.7832 Ops/s 412.2111 Ops/s $\textbf{\color{#35bf28}+21.24\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 10.2929ms 1.2402ms 806.3409 Ops/s 833.5677 Ops/s $\color{#d91a1a}-3.27\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.8693ms 5.5247ms 181.0067 Ops/s 31.6601 Ops/s $\textbf{\color{#35bf28}+471.72\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.5704ms 2.1843ms 457.8203 Ops/s 504.1175 Ops/s $\textbf{\color{#d91a1a}-9.18\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.8093ms 1.3879ms 720.5034 Ops/s 838.3193 Ops/s $\textbf{\color{#d91a1a}-14.05\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4112ms 13.1167ms 76.2387 Ops/s 73.0354 Ops/s $\color{#35bf28}+4.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2869ms 16.3974ms 60.9851 Ops/s 60.1364 Ops/s $\color{#35bf28}+1.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0832ms 17.7877ms 56.2187 Ops/s 54.1828 Ops/s $\color{#35bf28}+3.76\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.3899ms 16.8211ms 59.4492 Ops/s 58.2315 Ops/s $\color{#35bf28}+2.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.2770ms 17.7877ms 56.2187 Ops/s 55.0122 Ops/s $\color{#35bf28}+2.19\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.1462ms 17.8599ms 55.9912 Ops/s 54.8371 Ops/s $\color{#35bf28}+2.10\%$
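The Change column in the rows above appears consistent with the relative difference between the "Ops" and "Ops on Repo HEAD" columns. The sketch below recomputes it from the two throughput cells. The formula is an assumption checked against the numbers in this table, not the benchmark tool's documented method, and the helper names `to_ops_per_second` and `relative_change` are hypothetical.

```python
def to_ops_per_second(cell: str) -> float:
    """Convert a throughput cell such as '1.0482 KOps/s' to plain Ops/s."""
    value, unit = cell.split()
    # Only the two units that occur in the table above are handled.
    scale = {"Ops/s": 1.0, "KOps/s": 1_000.0}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    new = to_ops_per_second(ops)
    base = to_ops_per_second(ops_head)
    return (new / base - 1.0) * 100.0


# Recompute the Change value for test_cql_speed[False-None] from its row.
print(f"{relative_change('61.8447 Ops/s', '61.1025 Ops/s'):+.2f}%")
```

Because the result is a ratio of two throughputs, a row with a very slow baseline (for example 31.6601 Ops/s on HEAD) can show a large percentage even when the absolute difference is modest, so outliers are worth reading alongside the Max and Mean timings.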

[ghstack-poisoned]
[ghstack-poisoned]
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
2 participants