[Feature,Deprecation] Split KLRewardTransform in more modules #2813

Open
wants to merge 1 commit into
base: gh/vmoens/92/base

@vmoens (Contributor) commented Feb 27, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 27, 2025
ghstack-source-id: 884e9307a77f01ccee8bce110ebb6fbb2211287c
Pull Request resolved: #2813

pytorch-bot bot commented Feb 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2813

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 5 New Failures, 1 Unrelated Failure

As of commit 9b8056d with merge base b538c66:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Feb 27, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}35$. Worsened: $\large\color{#d91a1a}4$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.6113s 0.5262s 1.9005 Ops/s 1.9362 Ops/s $\color{#d91a1a}-1.84\%$
test_transformed 1.1105s 1.0223s 0.9782 Ops/s 0.9546 Ops/s $\color{#35bf28}+2.48\%$
test_serial 1.6304s 1.5409s 0.6490 Ops/s 0.6390 Ops/s $\color{#35bf28}+1.56\%$
test_parallel 1.3835s 1.2997s 0.7694 Ops/s 0.7800 Ops/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[True-True-True-True-True] 0.1690ms 29.8717μs 33.4765 KOps/s 31.5703 KOps/s $\textbf{\color{#35bf28}+6.04\%}$
test_step_mdp_speed[True-True-True-True-False] 53.4900μs 17.7451μs 56.3536 KOps/s 53.1219 KOps/s $\textbf{\color{#35bf28}+6.08\%}$
test_step_mdp_speed[True-True-True-False-True] 53.3200μs 17.2171μs 58.0818 KOps/s 55.6832 KOps/s $\color{#35bf28}+4.31\%$
test_step_mdp_speed[True-True-True-False-False] 46.8080μs 10.0426μs 99.5758 KOps/s 93.5963 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_step_mdp_speed[True-True-False-True-True] 73.2280μs 32.1305μs 31.1231 KOps/s 28.6825 KOps/s $\textbf{\color{#35bf28}+8.51\%}$
test_step_mdp_speed[True-True-False-True-False] 46.6480μs 19.5251μs 51.2160 KOps/s 47.7231 KOps/s $\textbf{\color{#35bf28}+7.32\%}$
test_step_mdp_speed[True-True-False-False-True] 56.9170μs 18.8534μs 53.0409 KOps/s 48.6390 KOps/s $\textbf{\color{#35bf28}+9.05\%}$
test_step_mdp_speed[True-True-False-False-False] 43.8020μs 11.7705μs 84.9578 KOps/s 78.3993 KOps/s $\textbf{\color{#35bf28}+8.37\%}$
test_step_mdp_speed[True-False-True-True-True] 80.0700μs 33.9729μs 29.4352 KOps/s 27.3171 KOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_step_mdp_speed[True-False-True-True-False] 64.5010μs 21.3270μs 46.8890 KOps/s 43.4692 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_step_mdp_speed[True-False-True-False-True] 49.4530μs 18.9877μs 52.6656 KOps/s 49.9273 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_step_mdp_speed[True-False-True-False-False] 53.9310μs 11.7760μs 84.9187 KOps/s 78.6308 KOps/s $\textbf{\color{#35bf28}+8.00\%}$
test_step_mdp_speed[True-False-False-True-True] 69.1300μs 35.5703μs 28.1133 KOps/s 26.1621 KOps/s $\textbf{\color{#35bf28}+7.46\%}$
test_step_mdp_speed[True-False-False-True-False] 58.2590μs 23.0088μs 43.4616 KOps/s 40.0988 KOps/s $\textbf{\color{#35bf28}+8.39\%}$
test_step_mdp_speed[True-False-False-False-True] 54.6330μs 20.5433μs 48.6777 KOps/s 45.4638 KOps/s $\textbf{\color{#35bf28}+7.07\%}$
test_step_mdp_speed[True-False-False-False-False] 53.5410μs 13.4823μs 74.1712 KOps/s 68.9734 KOps/s $\textbf{\color{#35bf28}+7.54\%}$
test_step_mdp_speed[False-True-True-True-True] 68.3480μs 33.8414μs 29.5496 KOps/s 27.6147 KOps/s $\textbf{\color{#35bf28}+7.01\%}$
test_step_mdp_speed[False-True-True-True-False] 63.6400μs 21.5029μs 46.5053 KOps/s 43.1106 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_step_mdp_speed[False-True-True-False-True] 52.3880μs 21.4252μs 46.6739 KOps/s 42.3090 KOps/s $\textbf{\color{#35bf28}+10.32\%}$
test_step_mdp_speed[False-True-True-False-False] 50.4840μs 13.1912μs 75.8079 KOps/s 69.9806 KOps/s $\textbf{\color{#35bf28}+8.33\%}$
test_step_mdp_speed[False-True-False-True-True] 84.4080μs 35.6393μs 28.0589 KOps/s 26.1245 KOps/s $\textbf{\color{#35bf28}+7.40\%}$
test_step_mdp_speed[False-True-False-True-False] 62.5380μs 23.2197μs 43.0668 KOps/s 39.9911 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_step_mdp_speed[False-True-False-False-True] 2.6680ms 23.3953μs 42.7435 KOps/s 39.4915 KOps/s $\textbf{\color{#35bf28}+8.23\%}$
test_step_mdp_speed[False-True-False-False-False] 40.1650μs 14.8893μs 67.1622 KOps/s 61.0621 KOps/s $\textbf{\color{#35bf28}+9.99\%}$
test_step_mdp_speed[False-False-True-True-True] 81.8630μs 37.4002μs 26.7378 KOps/s 24.8243 KOps/s $\textbf{\color{#35bf28}+7.71\%}$
test_step_mdp_speed[False-False-True-True-False] 55.6850μs 24.8479μs 40.2449 KOps/s 36.9764 KOps/s $\textbf{\color{#35bf28}+8.84\%}$
test_step_mdp_speed[False-False-True-False-True] 57.6090μs 23.1667μs 43.1654 KOps/s 39.5321 KOps/s $\textbf{\color{#35bf28}+9.19\%}$
test_step_mdp_speed[False-False-True-False-False] 44.3930μs 14.9434μs 66.9191 KOps/s 61.4719 KOps/s $\textbf{\color{#35bf28}+8.86\%}$
test_step_mdp_speed[False-False-False-True-True] 80.8210μs 38.8140μs 25.7639 KOps/s 23.4833 KOps/s $\textbf{\color{#35bf28}+9.71\%}$
test_step_mdp_speed[False-False-False-True-False] 73.6280μs 26.6321μs 37.5487 KOps/s 34.4542 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_step_mdp_speed[False-False-False-False-True] 55.3140μs 24.5768μs 40.6888 KOps/s 37.1233 KOps/s $\textbf{\color{#35bf28}+9.60\%}$
test_step_mdp_speed[False-False-False-False-False] 55.4040μs 16.5001μs 60.6058 KOps/s 54.4911 KOps/s $\textbf{\color{#35bf28}+11.22\%}$
test_values[generalized_advantage_estimate-True-True] 10.9030ms 9.6563ms 103.5592 Ops/s 104.1829 Ops/s $\color{#d91a1a}-0.60\%$
test_values[vec_generalized_advantage_estimate-True-True] 29.3294ms 26.6862ms 37.4726 Ops/s 37.4573 Ops/s $\color{#35bf28}+0.04\%$
test_values[td0_return_estimate-False-False] 0.2367ms 0.1791ms 5.5830 KOps/s 5.6284 KOps/s $\color{#d91a1a}-0.81\%$
test_values[td1_return_estimate-False-False] 26.0576ms 23.8631ms 41.9057 Ops/s 42.4873 Ops/s $\color{#d91a1a}-1.37\%$
test_values[vec_td1_return_estimate-False-False] 29.7138ms 26.5565ms 37.6555 Ops/s 37.4498 Ops/s $\color{#35bf28}+0.55\%$
test_values[td_lambda_return_estimate-True-False] 38.2698ms 34.3355ms 29.1243 Ops/s 29.0930 Ops/s $\color{#35bf28}+0.11\%$
test_values[vec_td_lambda_return_estimate-True-False] 28.9578ms 26.4460ms 37.8129 Ops/s 37.3668 Ops/s $\color{#35bf28}+1.19\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.1336ms 8.4147ms 118.8401 Ops/s 119.3971 Ops/s $\color{#d91a1a}-0.47\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2154ms 1.9303ms 518.0627 Ops/s 506.3976 Ops/s $\color{#35bf28}+2.30\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4406ms 0.3690ms 2.7097 KOps/s 2.7040 KOps/s $\color{#35bf28}+0.21\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.6636ms 44.2629ms 22.5923 Ops/s 22.5075 Ops/s $\color{#35bf28}+0.38\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.3547ms 3.4500ms 289.8566 Ops/s 291.4153 Ops/s $\color{#d91a1a}-0.53\%$
test_dqn_speed[False-None] 6.0872ms 1.4537ms 687.8792 Ops/s 684.1387 Ops/s $\color{#35bf28}+0.55\%$
test_dqn_speed[False-backward] 2.0084ms 1.9566ms 511.0987 Ops/s 514.3705 Ops/s $\color{#d91a1a}-0.64\%$
test_dqn_speed[True-None] 0.7529ms 0.4888ms 2.0458 KOps/s 2.0425 KOps/s $\color{#35bf28}+0.16\%$
test_dqn_speed[True-backward] 1.0289ms 0.9609ms 1.0407 KOps/s 1.0595 KOps/s $\color{#d91a1a}-1.78\%$
test_dqn_speed[reduce-overhead-None] 0.6806ms 0.5003ms 1.9987 KOps/s 2.0526 KOps/s $\color{#d91a1a}-2.63\%$
test_dqn_speed[reduce-overhead-backward] 1.3965ms 1.0112ms 988.9493 Ops/s 1.0499 KOps/s $\textbf{\color{#d91a1a}-5.80\%}$
test_ddpg_speed[False-None] 3.5339ms 2.9547ms 338.4422 Ops/s 332.7192 Ops/s $\color{#35bf28}+1.72\%$
test_ddpg_speed[False-backward] 4.2120ms 4.1112ms 243.2409 Ops/s 242.9876 Ops/s $\color{#35bf28}+0.10\%$
test_ddpg_speed[True-None] 1.5849ms 1.2471ms 801.8388 Ops/s 801.3950 Ops/s $\color{#35bf28}+0.06\%$
test_ddpg_speed[True-backward] 2.2281ms 2.1540ms 464.2494 Ops/s 466.6436 Ops/s $\color{#d91a1a}-0.51\%$
test_ddpg_speed[reduce-overhead-None] 1.7198ms 1.2556ms 796.4322 Ops/s 802.6251 Ops/s $\color{#d91a1a}-0.77\%$
test_ddpg_speed[reduce-overhead-backward] 2.8626ms 2.2253ms 449.3872 Ops/s 469.2466 Ops/s $\color{#d91a1a}-4.23\%$
test_sac_speed[False-None] 9.9365ms 8.2818ms 120.7467 Ops/s 121.0893 Ops/s $\color{#d91a1a}-0.28\%$
test_sac_speed[False-backward] 11.2012ms 10.9164ms 91.6051 Ops/s 91.7766 Ops/s $\color{#d91a1a}-0.19\%$
test_sac_speed[True-None] 2.3395ms 2.1323ms 468.9733 Ops/s 468.0706 Ops/s $\color{#35bf28}+0.19\%$
test_sac_speed[True-backward] 4.1729ms 3.8396ms 260.4471 Ops/s 257.2136 Ops/s $\color{#35bf28}+1.26\%$
test_sac_speed[reduce-overhead-None] 3.2518ms 2.1436ms 466.5031 Ops/s 466.7040 Ops/s $\color{#d91a1a}-0.04\%$
test_sac_speed[reduce-overhead-backward] 4.6860ms 3.8759ms 258.0076 Ops/s 259.0814 Ops/s $\color{#d91a1a}-0.41\%$
test_redq_speed[False-None] 20.4081ms 14.0983ms 70.9306 Ops/s 70.5146 Ops/s $\color{#35bf28}+0.59\%$
test_redq_speed[False-backward] 29.0341ms 23.6782ms 42.2330 Ops/s 44.0688 Ops/s $\color{#d91a1a}-4.17\%$
test_redq_speed[True-None] 5.6420ms 4.9661ms 201.3635 Ops/s 194.1242 Ops/s $\color{#35bf28}+3.73\%$
test_redq_speed[True-backward] 13.0100ms 12.4980ms 80.0129 Ops/s 77.7652 Ops/s $\color{#35bf28}+2.89\%$
test_redq_speed[reduce-overhead-None] 6.7751ms 5.0849ms 196.6605 Ops/s 192.8247 Ops/s $\color{#35bf28}+1.99\%$
test_redq_speed[reduce-overhead-backward] 14.0591ms 12.5929ms 79.4098 Ops/s 77.9128 Ops/s $\color{#35bf28}+1.92\%$
test_redq_deprec_speed[False-None] 16.5421ms 13.0724ms 76.4972 Ops/s 76.6402 Ops/s $\color{#d91a1a}-0.19\%$
test_redq_deprec_speed[False-backward] 20.4440ms 18.5190ms 53.9985 Ops/s 52.6251 Ops/s $\color{#35bf28}+2.61\%$
test_redq_deprec_speed[True-None] 4.9553ms 3.8966ms 256.6313 Ops/s 253.5768 Ops/s $\color{#35bf28}+1.20\%$
test_redq_deprec_speed[True-backward] 10.8325ms 8.6373ms 115.7769 Ops/s 117.3322 Ops/s $\color{#d91a1a}-1.33\%$
test_redq_deprec_speed[reduce-overhead-None] 4.8656ms 3.9149ms 255.4336 Ops/s 252.3361 Ops/s $\color{#35bf28}+1.23\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.4654ms 8.5819ms 116.5244 Ops/s 119.3987 Ops/s $\color{#d91a1a}-2.41\%$
test_td3_speed[False-None] 8.6715ms 8.2227ms 121.6150 Ops/s 119.8824 Ops/s $\color{#35bf28}+1.45\%$
test_td3_speed[False-backward] 12.8311ms 10.6908ms 93.5387 Ops/s 93.4958 Ops/s $\color{#35bf28}+0.05\%$
test_td3_speed[True-None] 1.9913ms 1.8144ms 551.1507 Ops/s 547.7426 Ops/s $\color{#35bf28}+0.62\%$
test_td3_speed[True-backward] 3.5160ms 3.4291ms 291.6241 Ops/s 291.0580 Ops/s $\color{#35bf28}+0.19\%$
test_td3_speed[reduce-overhead-None] 2.1191ms 1.8230ms 548.5419 Ops/s 544.2853 Ops/s $\color{#35bf28}+0.78\%$
test_td3_speed[reduce-overhead-backward] 3.4812ms 3.4265ms 291.8471 Ops/s 291.9141 Ops/s $\color{#d91a1a}-0.02\%$
test_cql_speed[False-None] 40.5127ms 37.0631ms 26.9810 Ops/s 26.7518 Ops/s $\color{#35bf28}+0.86\%$
test_cql_speed[False-backward] 51.0234ms 47.1005ms 21.2312 Ops/s 21.2260 Ops/s $\color{#35bf28}+0.02\%$
test_cql_speed[True-None] 17.6306ms 16.3637ms 61.1110 Ops/s 61.9633 Ops/s $\color{#d91a1a}-1.38\%$
test_cql_speed[True-backward] 24.9021ms 23.2108ms 43.0834 Ops/s 43.1971 Ops/s $\color{#d91a1a}-0.26\%$
test_cql_speed[reduce-overhead-None] 18.3872ms 16.1371ms 61.9689 Ops/s 61.9547 Ops/s $\color{#35bf28}+0.02\%$
test_cql_speed[reduce-overhead-backward] 25.2159ms 23.1919ms 43.1185 Ops/s 42.4213 Ops/s $\color{#35bf28}+1.64\%$
test_a2c_speed[False-None] 8.7085ms 7.2316ms 138.2828 Ops/s 137.8642 Ops/s $\color{#35bf28}+0.30\%$
test_a2c_speed[False-backward] 19.1222ms 15.0491ms 66.4490 Ops/s 68.1159 Ops/s $\color{#d91a1a}-2.45\%$
test_a2c_speed[True-None] 4.1311ms 3.7615ms 265.8488 Ops/s 266.8974 Ops/s $\color{#d91a1a}-0.39\%$
test_a2c_speed[True-backward] 12.8627ms 10.5274ms 94.9903 Ops/s 97.1322 Ops/s $\color{#d91a1a}-2.21\%$
test_a2c_speed[reduce-overhead-None] 4.3057ms 3.7482ms 266.7929 Ops/s 266.6225 Ops/s $\color{#35bf28}+0.06\%$
test_a2c_speed[reduce-overhead-backward] 11.1492ms 10.2028ms 98.0123 Ops/s 99.2367 Ops/s $\color{#d91a1a}-1.23\%$
test_ppo_speed[False-None] 9.4635ms 7.5901ms 131.7510 Ops/s 134.4628 Ops/s $\color{#d91a1a}-2.02\%$
test_ppo_speed[False-backward] 17.2275ms 15.0018ms 66.6589 Ops/s 68.5437 Ops/s $\color{#d91a1a}-2.75\%$
test_ppo_speed[True-None] 4.4980ms 4.1296ms 242.1564 Ops/s 242.6017 Ops/s $\color{#d91a1a}-0.18\%$
test_ppo_speed[True-backward] 10.2195ms 9.8540ms 101.4813 Ops/s 94.7932 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_ppo_speed[reduce-overhead-None] 5.2453ms 4.1605ms 240.3567 Ops/s 244.2792 Ops/s $\color{#d91a1a}-1.61\%$
test_ppo_speed[reduce-overhead-backward] 12.3073ms 10.5651ms 94.6515 Ops/s 100.1700 Ops/s $\textbf{\color{#d91a1a}-5.51\%}$
test_reinforce_speed[False-None] 7.9763ms 6.6498ms 150.3799 Ops/s 152.1817 Ops/s $\color{#d91a1a}-1.18\%$
test_reinforce_speed[False-backward] 10.1283ms 9.8252ms 101.7790 Ops/s 101.4093 Ops/s $\color{#35bf28}+0.36\%$
test_reinforce_speed[True-None] 3.4038ms 3.0893ms 323.6999 Ops/s 327.0308 Ops/s $\color{#d91a1a}-1.02\%$
test_reinforce_speed[True-backward] 9.6646ms 9.0386ms 110.6369 Ops/s 111.8585 Ops/s $\color{#d91a1a}-1.09\%$
test_reinforce_speed[reduce-overhead-None] 3.8124ms 3.0944ms 323.1656 Ops/s 325.3915 Ops/s $\color{#d91a1a}-0.68\%$
test_reinforce_speed[reduce-overhead-backward] 9.4087ms 9.0634ms 110.3341 Ops/s 110.4693 Ops/s $\color{#d91a1a}-0.12\%$
test_iql_speed[False-None] 33.6932ms 32.6108ms 30.6647 Ops/s 30.6843 Ops/s $\color{#d91a1a}-0.06\%$
test_iql_speed[False-backward] 49.4497ms 45.5330ms 21.9621 Ops/s 21.9275 Ops/s $\color{#35bf28}+0.16\%$
test_iql_speed[True-None] 12.9130ms 11.3114ms 88.4065 Ops/s 88.6516 Ops/s $\color{#d91a1a}-0.28\%$
test_iql_speed[True-backward] 23.6932ms 22.1904ms 45.0645 Ops/s 44.0362 Ops/s $\color{#35bf28}+2.34\%$
test_iql_speed[reduce-overhead-None] 13.1957ms 11.4071ms 87.6650 Ops/s 87.3229 Ops/s $\color{#35bf28}+0.39\%$
test_iql_speed[reduce-overhead-backward] 24.5944ms 22.4849ms 44.4743 Ops/s 44.0459 Ops/s $\color{#35bf28}+0.97\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0294ms 4.7271ms 211.5480 Ops/s 202.1048 Ops/s $\color{#35bf28}+4.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7454ms 0.5346ms 1.8704 KOps/s 1.8347 KOps/s $\color{#35bf28}+1.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8408ms 0.5134ms 1.9479 KOps/s 1.9429 KOps/s $\color{#35bf28}+0.26\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.0931ms 4.5018ms 222.1339 Ops/s 211.1264 Ops/s $\textbf{\color{#35bf28}+5.21\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2851ms 0.5284ms 1.8925 KOps/s 1.8702 KOps/s $\color{#35bf28}+1.19\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9259ms 0.5051ms 1.9798 KOps/s 1.9546 KOps/s $\color{#35bf28}+1.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4432ms 1.7265ms 579.2040 Ops/s 562.8241 Ops/s $\color{#35bf28}+2.91\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3772ms 1.6320ms 612.7625 Ops/s 591.4919 Ops/s $\color{#35bf28}+3.60\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2158ms 4.6814ms 213.6129 Ops/s 206.9674 Ops/s $\color{#35bf28}+3.21\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4177ms 0.6886ms 1.4522 KOps/s 1.4370 KOps/s $\color{#35bf28}+1.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9998ms 0.6660ms 1.5015 KOps/s 1.4957 KOps/s $\color{#35bf28}+0.39\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9111ms 4.5856ms 218.0731 Ops/s 209.7472 Ops/s $\color{#35bf28}+3.97\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1705ms 0.5428ms 1.8424 KOps/s 1.8311 KOps/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7399ms 0.5112ms 1.9562 KOps/s 1.9039 KOps/s $\color{#35bf28}+2.75\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1796ms 4.5162ms 221.4251 Ops/s 212.8238 Ops/s $\color{#35bf28}+4.04\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.2366ms 0.5421ms 1.8447 KOps/s 1.8409 KOps/s $\color{#35bf28}+0.21\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7118ms 0.5031ms 1.9877 KOps/s 1.9711 KOps/s $\color{#35bf28}+0.84\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.1407ms 4.6772ms 213.8020 Ops/s 208.1054 Ops/s $\color{#35bf28}+2.74\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0491ms 0.6826ms 1.4649 KOps/s 1.4270 KOps/s $\color{#35bf28}+2.66\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0307ms 0.6517ms 1.5345 KOps/s 1.4984 KOps/s $\color{#35bf28}+2.41\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 8.0386ms 4.5068ms 221.8846 Ops/s 242.7736 Ops/s $\textbf{\color{#d91a1a}-8.60\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.9529ms 2.3434ms 426.7316 Ops/s 409.2505 Ops/s $\color{#35bf28}+4.27\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.5108ms 1.3134ms 761.3567 Ops/s 738.5646 Ops/s $\color{#35bf28}+3.09\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.4949ms 4.3097ms 232.0372 Ops/s 239.5653 Ops/s $\color{#d91a1a}-3.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4572s 11.7103ms 85.3952 Ops/s 422.7298 Ops/s $\textbf{\color{#d91a1a}-79.80\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0234ms 1.3633ms 733.5245 Ops/s 703.4495 Ops/s $\color{#35bf28}+4.28\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.3763ms 4.5148ms 221.4960 Ops/s 32.0889 Ops/s $\textbf{\color{#35bf28}+590.26\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.1958ms 2.5401ms 393.6839 Ops/s 361.3094 Ops/s $\textbf{\color{#35bf28}+8.96\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.9482ms 1.6230ms 616.1391 Ops/s 645.9227 Ops/s $\color{#d91a1a}-4.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.5591ms 12.2540ms 81.6061 Ops/s 84.0268 Ops/s $\color{#d91a1a}-2.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.3276ms 14.3402ms 69.7340 Ops/s 69.7118 Ops/s $\color{#35bf28}+0.03\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 23.0613ms 21.1635ms 47.2512 Ops/s 48.4133 Ops/s $\color{#d91a1a}-2.40\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.1334ms 14.5170ms 68.8845 Ops/s 69.0002 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.7529ms 20.9074ms 47.8299 Ops/s 48.7005 Ops/s $\color{#d91a1a}-1.79\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.0582ms 15.7890ms 63.3351 Ops/s 62.9293 Ops/s $\color{#35bf28}+0.64\%$
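
For readers checking the table by hand, the Change column is consistent with the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the CPU table. The helper name `percent_change` and the 5% cutoff are assumptions, not part of the benchmark tooling: the cutoff is inferred from which rows are bold, and it happens to reproduce the "Improved: 35 / Worsened: 4" counts in the CPU summary.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of PR throughput versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (name, Ops on this PR, Ops on repo HEAD), copied from the CPU table.
rows = [
    ("test_simple", 1.9005, 1.9362),
    (
        "test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-"
        "SamplerWithoutReplacement-400]",
        85.3952,
        422.7298,
    ),
    (
        "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]",
        221.4960,
        32.0889,
    ),
]

# Assumption: rows are highlighted when the change exceeds 5% in magnitude.
THRESHOLD = 5.0

for name, ops_pr, ops_head in rows:
    change = percent_change(ops_pr, ops_head)
    flag = "highlighted" if abs(change) > THRESHOLD else "below threshold"
    print(f"{name}: {change:+.2f}% ({flag})")
```

The two `test_rb_populate` rows swing by -79.80% and +590.26% on single runs with very different Max and Mean timings, so they are probably better treated as noisy measurements than as real effects of this PR.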

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}28$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.9382s 0.8412s 1.1888 Ops/s 1.1707 Ops/s $\color{#35bf28}+1.54\%$
test_transformed 1.5366s 1.4516s 0.6889 Ops/s 0.6676 Ops/s $\color{#35bf28}+3.19\%$
test_serial 2.4335s 2.3525s 0.4251 Ops/s 0.4145 Ops/s $\color{#35bf28}+2.56\%$
test_parallel 1.9627s 1.8788s 0.5322 Ops/s 0.4949 Ops/s $\textbf{\color{#35bf28}+7.55\%}$
test_step_mdp_speed[True-True-True-True-True] 0.1765ms 42.2509μs 23.6681 KOps/s 23.7366 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-True-True-True-False] 50.9210μs 24.7385μs 40.4227 KOps/s 40.8290 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[True-True-True-False-True] 48.1100μs 22.9898μs 43.4976 KOps/s 44.3552 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[True-True-True-False-False] 42.3600μs 13.5458μs 73.8238 KOps/s 73.5429 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-True-False-True-True] 77.7610μs 44.0967μs 22.6774 KOps/s 22.3228 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[True-True-False-True-False] 84.5600μs 26.0939μs 38.3231 KOps/s 37.0206 KOps/s $\color{#35bf28}+3.52\%$
test_step_mdp_speed[True-True-False-False-True] 59.4300μs 25.6982μs 38.9132 KOps/s 38.0383 KOps/s $\color{#35bf28}+2.30\%$
test_step_mdp_speed[True-True-False-False-False] 66.7710μs 16.0706μs 62.2253 KOps/s 62.3260 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-False-True-True-True] 73.8010μs 47.2179μs 21.1784 KOps/s 21.1845 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[True-False-True-True-False] 53.5100μs 29.6356μs 33.7431 KOps/s 33.4716 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-False-True-False-True] 57.6400μs 25.7721μs 38.8016 KOps/s 38.5201 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[True-False-True-False-False] 42.2510μs 16.0840μs 62.1734 KOps/s 61.7366 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[True-False-False-True-True] 0.1050ms 49.1859μs 20.3310 KOps/s 20.0772 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[True-False-False-True-False] 72.6610μs 31.8798μs 31.3678 KOps/s 31.0292 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-False-False-False-True] 93.2910μs 27.7241μs 36.0698 KOps/s 36.1041 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-False-False-False-False] 49.9900μs 18.5431μs 53.9284 KOps/s 54.4833 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[False-True-True-True-True] 74.2500μs 47.0150μs 21.2698 KOps/s 21.3789 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-True-True-False] 63.2010μs 29.5037μs 33.8940 KOps/s 34.4604 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[False-True-True-False-True] 73.6510μs 30.0229μs 33.3079 KOps/s 33.5956 KOps/s $\color{#d91a1a}-0.86\%$
test_step_mdp_speed[False-True-True-False-False] 49.4100μs 18.0435μs 55.4216 KOps/s 56.1442 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[False-True-False-True-True] 97.1110μs 49.6079μs 20.1581 KOps/s 20.4034 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[False-True-False-True-False] 62.7210μs 31.9416μs 31.3071 KOps/s 31.1583 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[False-True-False-False-True] 2.9560ms 32.9615μs 30.3384 KOps/s 30.9773 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[False-True-False-False-False] 50.6810μs 20.4220μs 48.9668 KOps/s 48.8781 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[False-False-True-True-True] 80.2310μs 52.2594μs 19.1353 KOps/s 19.1190 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-False-True-True-False] 60.0510μs 34.7531μs 28.7744 KOps/s 29.1618 KOps/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[False-False-True-False-True] 58.3400μs 32.1312μs 31.1224 KOps/s 31.1758 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-False-True-False-False] 50.2500μs 20.3676μs 49.0975 KOps/s 49.3595 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[False-False-False-True-True] 87.6100μs 53.6196μs 18.6499 KOps/s 18.8195 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-False-False-True-False] 64.2610μs 36.9605μs 27.0559 KOps/s 27.2238 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[False-False-False-False-True] 68.1310μs 34.4172μs 29.0552 KOps/s 29.5335 KOps/s $\color{#d91a1a}-1.62\%$
test_step_mdp_speed[False-False-False-False-False] 53.1510μs 22.5049μs 44.4348 KOps/s 43.6019 KOps/s $\color{#35bf28}+1.91\%$
test_values[generalized_advantage_estimate-True-True] 25.1925ms 24.7891ms 40.3404 Ops/s 39.5351 Ops/s $\color{#35bf28}+2.04\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1021s 2.9478ms 339.2335 Ops/s 319.6940 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_values[td0_return_estimate-False-False] 0.1068ms 80.9944μs 12.3465 KOps/s 12.0411 KOps/s $\color{#35bf28}+2.54\%$
test_values[td1_return_estimate-False-False] 55.7271ms 55.2801ms 18.0897 Ops/s 17.4810 Ops/s $\color{#35bf28}+3.48\%$
test_values[vec_td1_return_estimate-False-False] 1.3587ms 1.0906ms 916.9637 Ops/s 909.7712 Ops/s $\color{#35bf28}+0.79\%$
test_values[td_lambda_return_estimate-True-False] 94.8496ms 88.8029ms 11.2609 Ops/s 11.0019 Ops/s $\color{#35bf28}+2.35\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3990ms 1.0982ms 910.6065 Ops/s 912.7412 Ops/s $\color{#d91a1a}-0.23\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.8235ms 26.5665ms 37.6414 Ops/s 39.3138 Ops/s $\color{#d91a1a}-4.25\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0670ms 0.7604ms 1.3151 KOps/s 1.2827 KOps/s $\color{#35bf28}+2.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7734ms 0.6774ms 1.4762 KOps/s 1.4559 KOps/s $\color{#35bf28}+1.40\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5510ms 1.4935ms 669.5500 Ops/s 663.9799 Ops/s $\color{#35bf28}+0.84\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7387ms 0.6928ms 1.4433 KOps/s 1.4256 KOps/s $\color{#35bf28}+1.24\%$
test_dqn_speed[False-None] 1.6153ms 1.5281ms 654.4062 Ops/s 642.3951 Ops/s $\color{#35bf28}+1.87\%$
test_dqn_speed[False-backward] 2.1962ms 2.1442ms 466.3743 Ops/s 459.4542 Ops/s $\color{#35bf28}+1.51\%$
test_dqn_speed[True-None] 0.6312ms 0.5501ms 1.8179 KOps/s 1.6736 KOps/s $\textbf{\color{#35bf28}+8.62\%}$
test_dqn_speed[True-backward] 1.3202ms 1.2494ms 800.3846 Ops/s 797.3067 Ops/s $\color{#35bf28}+0.39\%$
test_dqn_speed[reduce-overhead-None] 0.6147ms 0.5676ms 1.7619 KOps/s 1.6727 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_dqn_speed[reduce-overhead-backward] 1.1354ms 1.0707ms 933.9587 Ops/s 897.4881 Ops/s $\color{#35bf28}+4.06\%$
test_ddpg_speed[False-None] 3.1539ms 2.8520ms 350.6332 Ops/s 330.3259 Ops/s $\textbf{\color{#35bf28}+6.15\%}$
test_ddpg_speed[False-backward] 4.6600ms 4.2345ms 236.1541 Ops/s 226.9019 Ops/s $\color{#35bf28}+4.08\%$
test_ddpg_speed[True-None] 1.4509ms 1.3418ms 745.2650 Ops/s 739.5180 Ops/s $\color{#35bf28}+0.78\%$
test_ddpg_speed[True-backward] 2.4597ms 2.4223ms 412.8390 Ops/s 380.4450 Ops/s $\textbf{\color{#35bf28}+8.51\%}$
test_ddpg_speed[reduce-overhead-None] 1.4283ms 1.3477ms 742.0214 Ops/s 728.8352 Ops/s $\color{#35bf28}+1.81\%$
test_ddpg_speed[reduce-overhead-backward] 1.9539ms 1.8987ms 526.6729 Ops/s 482.9659 Ops/s $\textbf{\color{#35bf28}+9.05\%}$
test_sac_speed[False-None] 8.5316ms 8.1045ms 123.3878 Ops/s 120.8076 Ops/s $\color{#35bf28}+2.14\%$
test_sac_speed[False-backward] 11.5406ms 11.0372ms 90.6025 Ops/s 86.6710 Ops/s $\color{#35bf28}+4.54\%$
test_sac_speed[True-None] 1.9770ms 1.8379ms 544.0858 Ops/s 533.9717 Ops/s $\color{#35bf28}+1.89\%$
test_sac_speed[True-backward] 3.7320ms 3.6072ms 277.2247 Ops/s 260.6077 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_sac_speed[reduce-overhead-None] 22.2954ms 12.5689ms 79.5613 Ops/s 78.9786 Ops/s $\color{#35bf28}+0.74\%$
test_sac_speed[reduce-overhead-backward] 1.6660ms 1.6156ms 618.9485 Ops/s 562.0084 Ops/s $\textbf{\color{#35bf28}+10.13\%}$
test_redq_speed[False-None] 8.1637ms 7.7314ms 129.3426 Ops/s 124.4343 Ops/s $\color{#35bf28}+3.94\%$
test_redq_speed[False-backward] 12.1237ms 11.5871ms 86.3030 Ops/s 83.2280 Ops/s $\color{#35bf28}+3.69\%$
test_redq_speed[True-None] 2.4238ms 2.3443ms 426.5583 Ops/s 415.6553 Ops/s $\color{#35bf28}+2.62\%$
test_redq_speed[True-backward] 4.4569ms 4.0603ms 246.2886 Ops/s 227.6315 Ops/s $\textbf{\color{#35bf28}+8.20\%}$
test_redq_speed[reduce-overhead-None] 2.4498ms 2.3655ms 422.7404 Ops/s 409.4921 Ops/s $\color{#35bf28}+3.24\%$
test_redq_speed[reduce-overhead-backward] 4.5693ms 4.0795ms 245.1270 Ops/s 236.1388 Ops/s $\color{#35bf28}+3.81\%$
test_redq_deprec_speed[False-None] 9.4647ms 9.1214ms 109.6321 Ops/s 106.7034 Ops/s $\color{#35bf28}+2.74\%$
test_redq_deprec_speed[False-backward] 12.4748ms 12.0941ms 82.6853 Ops/s 80.8047 Ops/s $\color{#35bf28}+2.33\%$
test_redq_deprec_speed[True-None] 2.7128ms 2.6388ms 378.9635 Ops/s 362.1557 Ops/s $\color{#35bf28}+4.64\%$
test_redq_deprec_speed[True-backward] 4.3663ms 4.3152ms 231.7368 Ops/s 210.6512 Ops/s $\textbf{\color{#35bf28}+10.01\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.7308ms 2.6493ms 377.4607 Ops/s 361.5454 Ops/s $\color{#35bf28}+4.40\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.8055ms 4.3388ms 230.4775 Ops/s 214.1761 Ops/s $\textbf{\color{#35bf28}+7.61\%}$
test_td3_speed[False-None] 8.1298ms 8.0265ms 124.5877 Ops/s 122.5514 Ops/s $\color{#35bf28}+1.66\%$
test_td3_speed[False-backward] 10.9166ms 10.3937ms 96.2125 Ops/s 93.4858 Ops/s $\color{#35bf28}+2.92\%$
test_td3_speed[True-None] 1.7105ms 1.6751ms 596.9842 Ops/s 583.2378 Ops/s $\color{#35bf28}+2.36\%$
test_td3_speed[True-backward] 3.2158ms 3.1671ms 315.7485 Ops/s 288.1193 Ops/s $\textbf{\color{#35bf28}+9.59\%}$
test_td3_speed[reduce-overhead-None] 54.6560ms 28.0629ms 35.6342 Ops/s 36.0803 Ops/s $\color{#d91a1a}-1.24\%$
test_td3_speed[reduce-overhead-backward] 1.4616ms 1.3428ms 744.7397 Ops/s 663.7494 Ops/s $\textbf{\color{#35bf28}+12.20\%}$
test_cql_speed[False-None] 17.6108ms 16.9767ms 58.9041 Ops/s 57.7476 Ops/s $\color{#35bf28}+2.00\%$
test_cql_speed[False-backward] 22.6271ms 22.1490ms 45.1489 Ops/s 43.7896 Ops/s $\color{#35bf28}+3.10\%$
test_cql_speed[True-None] 3.4531ms 3.2698ms 305.8285 Ops/s 299.7231 Ops/s $\color{#35bf28}+2.04\%$
test_cql_speed[True-backward] 6.1367ms 5.7212ms 174.7890 Ops/s 174.7284 Ops/s $\color{#35bf28}+0.03\%$
test_cql_speed[reduce-overhead-None] 21.9327ms 13.5781ms 73.6480 Ops/s 72.5266 Ops/s $\color{#35bf28}+1.55\%$
test_cql_speed[reduce-overhead-backward] 1.9524ms 1.8297ms 546.5329 Ops/s 533.2712 Ops/s $\color{#35bf28}+2.49\%$
test_a2c_speed[False-None] 3.3742ms 3.1875ms 313.7263 Ops/s 302.8036 Ops/s $\color{#35bf28}+3.61\%$
test_a2c_speed[False-backward] 6.9841ms 6.0957ms 164.0507 Ops/s 158.4333 Ops/s $\color{#35bf28}+3.55\%$
test_a2c_speed[True-None] 1.4563ms 1.3598ms 735.3867 Ops/s 728.7735 Ops/s $\color{#35bf28}+0.91\%$
test_a2c_speed[True-backward] 3.0127ms 2.9274ms 341.6003 Ops/s 333.1554 Ops/s $\color{#35bf28}+2.53\%$
test_a2c_speed[reduce-overhead-None] 16.7717ms 9.3117ms 107.3922 Ops/s 106.3995 Ops/s $\color{#35bf28}+0.93\%$
test_a2c_speed[reduce-overhead-backward] 1.5552ms 1.4835ms 674.0975 Ops/s 667.1187 Ops/s $\color{#35bf28}+1.05\%$
test_ppo_speed[False-None] 3.7915ms 3.6963ms 270.5416 Ops/s 262.0252 Ops/s $\color{#35bf28}+3.25\%$
test_ppo_speed[False-backward] 7.3153ms 6.8778ms 145.3947 Ops/s 141.0100 Ops/s $\color{#35bf28}+3.11\%$
test_ppo_speed[True-None] 1.5286ms 1.4237ms 702.3769 Ops/s 669.9089 Ops/s $\color{#35bf28}+4.85\%$
test_ppo_speed[True-backward] 3.1304ms 3.0884ms 323.7896 Ops/s 312.8590 Ops/s $\color{#35bf28}+3.49\%$
test_ppo_speed[reduce-overhead-None] 1.2060ms 0.9901ms 1.0100 KOps/s 1.0054 KOps/s $\color{#35bf28}+0.46\%$
test_ppo_speed[reduce-overhead-backward] 1.5398ms 1.4273ms 700.6240 Ops/s 670.9589 Ops/s $\color{#35bf28}+4.42\%$
test_reinforce_speed[False-None] 2.3607ms 2.2734ms 439.8634 Ops/s 426.5001 Ops/s $\color{#35bf28}+3.13\%$
test_reinforce_speed[False-backward] 3.7268ms 3.2718ms 305.6442 Ops/s 296.4337 Ops/s $\color{#35bf28}+3.11\%$
test_reinforce_speed[True-None] 1.3821ms 1.2969ms 771.0660 Ops/s 736.8766 Ops/s $\color{#35bf28}+4.64\%$
test_reinforce_speed[True-backward] 3.0195ms 2.9427ms 339.8185 Ops/s 317.8979 Ops/s $\textbf{\color{#35bf28}+6.90\%}$
test_reinforce_speed[reduce-overhead-None] 19.4049ms 10.6570ms 93.8354 Ops/s 94.7103 Ops/s $\color{#d91a1a}-0.92\%$
test_reinforce_speed[reduce-overhead-backward] 1.5818ms 1.5271ms 654.8240 Ops/s 585.7920 Ops/s $\textbf{\color{#35bf28}+11.78\%}$
test_iql_speed[False-None] 9.7808ms 9.3022ms 107.5018 Ops/s 104.7582 Ops/s $\color{#35bf28}+2.62\%$
test_iql_speed[False-backward] 13.4653ms 12.9359ms 77.3040 Ops/s 74.0107 Ops/s $\color{#35bf28}+4.45\%$
test_iql_speed[True-None] 2.5289ms 2.2479ms 444.8639 Ops/s 433.8804 Ops/s $\color{#35bf28}+2.53\%$
test_iql_speed[True-backward] 5.1255ms 4.7598ms 210.0944 Ops/s 194.8156 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_iql_speed[reduce-overhead-None] 0.4853s 13.4504ms 74.3470 Ops/s 85.5429 Ops/s $\textbf{\color{#d91a1a}-13.09\%}$
test_iql_speed[reduce-overhead-backward] 2.1287ms 1.9449ms 514.1704 Ops/s 464.6230 Ops/s $\textbf{\color{#35bf28}+10.66\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9170ms 6.3383ms 157.7713 Ops/s 154.5595 Ops/s $\color{#35bf28}+2.08\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4907ms 0.2664ms 3.7538 KOps/s 3.7237 KOps/s $\color{#35bf28}+0.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4756ms 0.2453ms 4.0764 KOps/s 3.1272 KOps/s $\textbf{\color{#35bf28}+30.35\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3882ms 6.0987ms 163.9699 Ops/s 163.2602 Ops/s $\color{#35bf28}+0.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1601ms 0.2878ms 3.4745 KOps/s 3.3555 KOps/s $\color{#35bf28}+3.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5246ms 0.3120ms 3.2053 KOps/s 3.7688 KOps/s $\textbf{\color{#d91a1a}-14.95\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5683ms 1.3406ms 745.9358 Ops/s 730.5538 Ops/s $\color{#35bf28}+2.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4377ms 1.2171ms 821.6292 Ops/s 810.9006 Ops/s $\color{#35bf28}+1.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3984ms 6.2369ms 160.3349 Ops/s 157.2746 Ops/s $\color{#35bf28}+1.95\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0802ms 0.4043ms 2.4734 KOps/s 2.2301 KOps/s $\textbf{\color{#35bf28}+10.91\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7112ms 0.4645ms 2.1530 KOps/s 2.4656 KOps/s $\textbf{\color{#d91a1a}-12.68\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2588ms 6.0880ms 164.2564 Ops/s 162.4827 Ops/s $\color{#35bf28}+1.09\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0415ms 0.3219ms 3.1065 KOps/s 3.7215 KOps/s $\textbf{\color{#d91a1a}-16.53\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5008ms 0.3047ms 3.2820 KOps/s 4.0107 KOps/s $\textbf{\color{#d91a1a}-18.17\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3054ms 6.0323ms 165.7740 Ops/s 163.7680 Ops/s $\color{#35bf28}+1.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0018ms 0.2619ms 3.8188 KOps/s 3.4453 KOps/s $\textbf{\color{#35bf28}+10.84\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4842ms 0.2480ms 4.0327 KOps/s 3.7132 KOps/s $\textbf{\color{#35bf28}+8.61\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6538ms 6.2634ms 159.6568 Ops/s 158.4101 Ops/s $\color{#35bf28}+0.79\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8359ms 0.4138ms 2.4166 KOps/s 1.9921 KOps/s $\textbf{\color{#35bf28}+21.31\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5866ms 0.3834ms 2.6079 KOps/s 2.2760 KOps/s $\textbf{\color{#35bf28}+14.58\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.2146ms 5.5834ms 179.1011 Ops/s 174.2885 Ops/s $\color{#35bf28}+2.76\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.3436ms 2.0806ms 480.6290 Ops/s 415.5386 Ops/s $\textbf{\color{#35bf28}+15.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.6097ms 1.2050ms 829.8518 Ops/s 834.8470 Ops/s $\color{#d91a1a}-0.60\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4689s 14.9274ms 66.9909 Ops/s 176.5337 Ops/s $\textbf{\color{#d91a1a}-62.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.6596ms 1.9853ms 503.7095 Ops/s 415.5805 Ops/s $\textbf{\color{#35bf28}+21.21\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 10.0345ms 1.2605ms 793.3070 Ops/s 810.0128 Ops/s $\color{#d91a1a}-2.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.1494ms 5.8540ms 170.8232 Ops/s 30.9813 Ops/s $\textbf{\color{#35bf28}+451.37\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.4888ms 2.1853ms 457.5974 Ops/s 479.7515 Ops/s $\color{#d91a1a}-4.62\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.8820ms 1.4065ms 711.0008 Ops/s 816.9399 Ops/s $\textbf{\color{#d91a1a}-12.97\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.3749ms 13.6383ms 73.3231 Ops/s 70.3483 Ops/s $\color{#35bf28}+4.23\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.2080ms 17.0262ms 58.7329 Ops/s 58.5112 Ops/s $\color{#35bf28}+0.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.4452ms 18.0722ms 55.3335 Ops/s 53.4164 Ops/s $\color{#35bf28}+3.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.6069ms 17.1311ms 58.3735 Ops/s 54.8580 Ops/s $\textbf{\color{#35bf28}+6.41\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.7717ms 18.1029ms 55.2398 Ops/s 54.1387 Ops/s $\color{#35bf28}+2.03\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.6872ms 18.8525ms 53.0434 Ops/s 51.2170 Ops/s $\color{#35bf28}+3.57\%$
