for i in indices:
    t = th.tensor([i] * b, device=device)
    if self.conf.cfg:
        # classifier-free guidance: duplicate the batch for the cond/uncond passes
        img = th.cat([img] * 2, dim=0)
        t = th.cat([t] * 2)
    # pad by half a patch on each side, then split the image into patches
    img_new = F.pad(img, (halfp, halfp, halfp, halfp), 'constant')
    img_new = rearrange(img_new, 'b c (p1 h) (p2 w) -> (b p1 p2) c h w',
                        h=patch_size, w=patch_size)
    if isinstance(model_kwargs, list):
        # index-dependent model kwargs, ordered (T-1, ..., 0)
        _kwargs = model_kwargs[i]
    else:
        _kwargs = model_kwargs
    with th.no_grad():
        out = self.ddim_sample(
            model,
            img_new,
            t,
            shapes=shapes,
            clip_denoised=clip_denoised,
            denoised_fn=denoised_fn,
            cond_fn=cond_fn,
            patch_size=patch_size,
            model_kwargs=_kwargs,
            eta=eta,
        )
    # reassemble the denoised patches into a full image and crop off the padding
    img_new = rearrange(out['sample'], '(b p1 p2) c h w -> b c (p1 h) (p2 w)',
                        p1=patch_num_x + 1, p2=patch_num_y + 1)
    img = img_new[:, :, halfp:-halfp, halfp:-halfp]
    out['sample'] = img
    yield out
    img = out["sample"]
Hello, thank you for your contribution to the community. The attached code is used for sampling at test time (around line 1150 in base.py).
Let me use an example with res = 256x256 and patch_size = 64. In each DDIM iteration of this code, the noisy image is padded along its edges and divided into 5x5 patches. The 25 patches are then denoised, and the process repeats iteratively (I hope I've understood this correctly).
I would like to ask: in each sampling step, the same patch layout is used for inference, without the feature-stitching operations used during network training. Is this because the model has already learned to handle the patch edges correctly during training? Or is there stitching somewhere in the inference code that I have misunderstood?
Thank you for your clarification!