Fix backward_dense_test #3702
base: main
Conversation
@q10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
```diff
@@ -296,17 +296,14 @@ std::string tensor_on_same_gpu_if_not_optional_check(

 inline at::Tensor aligned_grad_output_tensor_for_cuda_backwards(
     const at::Tensor& grad_output) {
-  auto aligned_grad_output = grad_output;
+  auto aligned_grad_output = at::empty_like(grad_output).copy_(grad_output);
```
We should not do this every time. It will be costly. Is there a reason why you would like to do this?
@sryap Agreed that we shouldn't do this on every call. However, after bisecting the code, this is the change that makes the unit test pass. I need some help from your side:
- What is the intention of the aligned_grad_output_tensor_for_cuda_backwards() function? My assumption is that grad_output is returned without a copy when its data is aligned to 16 bytes, and otherwise an "aligned" tensor is built from the input, with a potential memory copy. Is a tensor constructed with .contiguous() or at::empty_like() guaranteed to be aligned?
- Could you please clarify what is tested here:
https://github.com/pytorch/FBGEMM/pull/3702/files#diff-dc94c00639d812c6bddd3a893aa08255d1ca5819cc8c3cfa524706d5a21a65baR331-R340
Do we want to make sure that sequential calls of the backward produce the same gradient w.r.t. feature_requires_grad? Are there any synchronization issues that might occur in this test scenario?
- The parameter set that reproduces the failure is:
```python
(
    T=1,
    D=2,
    B=2,
    log_E=1,
    L=1,
    weights_precision=SparseType.FP16,
    weighted=False,
    mixed=False,
    mixed_B=True,
    long_segments=False,
    pooling_mode=PoolingMode.SUM,
    use_cpu=False,
    output_dtype=SparseType.FP32,
)
```
Also, the random seed needs to be fixed at the start of test_backward_dense:
```python
np.random.seed(2007)
torch.manual_seed(2007)
```
Are those parameters valid?
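To make the question concrete, here is a minimal sketch of what I assume the intended behaviour to be: copy only when the gradient tensor cannot be used as-is. The 16-byte threshold, the contiguity condition, and the helper name are my assumptions, not the confirmed semantics of aligned_grad_output_tensor_for_cuda_backwards():

```cpp
#include <ATen/ATen.h>
#include <cstdint>

// Sketch only (my assumptions, not the actual FBGEMM implementation):
// return grad_output untouched when it is already contiguous and its data
// pointer is 16-byte aligned; otherwise produce a usable copy.
inline at::Tensor aligned_grad_output_sketch(const at::Tensor& grad_output) {
  auto aligned_grad_output = grad_output;
  // Non-contiguous layouts break vectorized loads, so make the tensor
  // contiguous first (a no-op for tensors that are already contiguous).
  if (!aligned_grad_output.is_contiguous()) {
    aligned_grad_output = aligned_grad_output.contiguous();
  }
  // Fall back to a fresh allocation only when the data pointer itself is
  // misaligned, e.g. for a slice()/narrow() view into a larger buffer.
  if (reinterpret_cast<std::uintptr_t>(aligned_grad_output.data_ptr()) % 16 != 0) {
    aligned_grad_output =
        at::empty_like(aligned_grad_output).copy_(aligned_grad_output);
  }
  return aligned_grad_output;
}
```

If this matches the intended semantics, it would avoid the per-call copy that the current change in this PR introduces.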
Attempt to fix the following five existing issues in the dense unit test:
Issues 1, 2, and 5 are also observed in pytorch/pytorch#141904.
The initial intention of the aligned_grad_output_tensor_for_cuda_backwards() function is unclear to me, so this particular fix might be "sub-optimal"; hence the request for review.
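If it helps with the review, one way to validate the assumption that .contiguous() / at::empty_like() outputs are sufficiently aligned would be a debug-time assertion like the one below. The helper name and the 16-byte constant are my guesses at the requirement of the vectorized backward kernels, not something taken from the existing code:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Exception.h>
#include <cstdint>

// Hypothetical debug helper: fail loudly if a tensor's storage does not meet
// the alignment that the vectorized backward kernels are presumed to need.
inline void check_16b_aligned(const at::Tensor& t, const char* name) {
  const auto addr = reinterpret_cast<std::uintptr_t>(t.data_ptr());
  TORCH_CHECK(
      addr % 16 == 0,
      name,
      " is not 16-byte aligned (data_ptr = ",
      addr,
      ")");
}
```

Calling this on grad_output at the top of the CUDA backward, and on the tensor returned by aligned_grad_output_tensor_for_cuda_backwards(), should show whether the failing parameter set above actually produces a misaligned view.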