Thanks for your great work.
I am currently looking at the augmentation path used inside the Kinetics dataset.
for i in range(num_decode):
    for _ in range(num_aug):
        idx += 1
        f_out[idx] = frames_decoded[i].clone()
        time_idx_out[idx] = time_idx_decoded[i, :]
        f_out[idx] = f_out[idx].float()
        f_out[idx] = f_out[idx] / 255.0
        if self.mode in ["train"] and self.cfg.DATA.SSL_COLOR_JITTER:
            f_out[idx] = transform.color_jitter_video_ssl(
                f_out[idx],
                bri_con_sat=self.cfg.DATA.SSL_COLOR_BRI_CON_SAT,
                hue=self.cfg.DATA.SSL_COLOR_HUE,
                p_convert_gray=self.p_convert_gray,
                moco_v2_aug=self.cfg.DATA.SSL_MOCOV2_AUG,
                gaussan_sigma_min=self.cfg.DATA.SSL_BLUR_SIGMA_MIN,
                gaussan_sigma_max=self.cfg.DATA.SSL_BLUR_SIGMA_MAX,
            )
        if self.aug and self.cfg.AUG.AA_TYPE:
            aug_transform = create_random_augment(
                input_size=(f_out[idx].size(1), f_out[idx].size(2)),
                auto_augment=self.cfg.AUG.AA_TYPE,
                interpolation=self.cfg.AUG.INTERPOLATION,
            )
            # T H W C -> T C H W.
            f_out[idx] = f_out[idx].permute(0, 3, 1, 2)
            list_img = self._frame_to_list_img(f_out[idx])
            list_img = aug_transform(list_img)
            f_out[idx] = self._list_img_to_frames(list_img)
            f_out[idx] = f_out[idx].permute(0, 2, 3, 1)
Above is the flow inside the Kinetics dataset's __getitem__ function.
The decoding backend returns the decoded frames as unsigned integer pixel values, and the code above then casts each frame to float and normalizes it by dividing by 255.0.
However, the RandAugment implementation (rand_augment.py) uses several PIL image operations for augmentation, and some of them, for example the autocontrast function, assume that the frame being augmented holds unsigned integer values in the range 0 to 255.
Is this the intended procedure? I am not sure whether it is valid to apply PIL's ImageOps to frames that have already been cast to float and scaled to [0, 1].
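To make the concern concrete, here is a minimal sketch in plain Python. It is a simplified stand-in for an 8-bit autocontrast, not PIL's actual implementation, and the helper names `autocontrast_8bit` and `to_uint8` are hypothetical. It only illustrates why the value range matters: if normalized floats were treated as 8-bit integers without being rescaled, all the contrast information would be lost, whereas rescaling to 0..255 first gives the expected stretch.

```python
def autocontrast_8bit(pixels):
    """Simplified stand-in for an 8-bit autocontrast (NOT PIL's code):
    stretch integer values so the darkest maps to 0 and the brightest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi <= lo:
        return list(pixels)  # flat image: nothing to stretch
    scale = 255.0 / (hi - lo)
    return [int(round((p - lo) * scale)) for p in pixels]


def to_uint8(normalized):
    """Hypothetical helper: map floats in [0, 1] back to integers in [0, 255]."""
    return [min(255, max(0, int(round(v * 255.0)))) for v in normalized]


# Four pixel values as a decoder would return them (unsigned 8-bit range).
raw = [50, 100, 150, 200]

# The same pixels after the float cast and division by 255.0.
normalized = [p / 255.0 for p in raw]

# Treating the normalized floats as integers directly truncates them all to 0,
# so an 8-bit operation sees a flat image and has nothing to stretch.
naive = [int(v) for v in normalized]
flat_result = autocontrast_8bit(naive)

# Rescaling to 0..255 first recovers the original values, and the stretch works.
restored = to_uint8(normalized)
stretched = autocontrast_8bit(restored)

print(naive, flat_result)
print(restored, stretched)
```

Whether the pipeline behaves like the first or the second case depends on how the frames are converted into PIL images before aug_transform is applied, which is the part I would like to have confirmed.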
Thanks, as always.