This repository has been archived by the owner on Jul 5, 2021. It is now read-only.
Bug report
Information
Please specify the following information when submitting an issue:
What are your command line arguments?:
Command line args:
CUDA_VISIBLE_DEVICES=0 python -m pdb train.py --num_epochs 301 --continue_training false --dataset dataset --crop_height 352 --crop_width 480 --batch_size 4 --num_val_images 100 --model DeepLabV3_plus --frontend ResNet50
Have you written any custom code?:
I removed data augmentation by adding "return input_image, output_image" at the very beginning of the function and deleting an empty line, so that later line numbers used for breakpoints are unchanged. I also tried both is_training=False and is_training=True.
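The no-op patch described above amounts to something like this sketch (the original augmentation body stays in place below the early return, so breakpoint line numbers do not shift):

```python
def data_augmentation(input_image, output_image):
    # Early return disables all augmentation; the original body below
    # this line is kept but never reached.
    return input_image, output_image
    # ... original augmentation code ...
```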
What have you done to try and solve this issue?:
Googled why this might happen. Tried other models.
TensorFlow version?:
'1.13.1'
Describe the problem
When calling sess.run, the same image produces a different output depending on the size of the batch it was included in.
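One possible cause (a guess, not confirmed): if batch normalization runs in training mode (is_training=True), each forward pass normalizes with the statistics of the current batch, so one image's output depends on its batch-mates. A minimal NumPy sketch of that effect:

```python
import numpy as np

def batchnorm_train(x, eps=1e-5):
    # Training-mode batch norm: normalize with the statistics of THIS batch.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.RandomState(0)
img = rng.rand(4)        # stand-in for one image's activations
others = rng.rand(3, 4)  # batch-mates

alone = batchnorm_train(img[None, :])[0]                       # batch of 1
in_batch = batchnorm_train(np.vstack([others, img[None, :]]))[-1]  # batch of 4

# Same input, different batch statistics -> different output.
diff = np.abs(alone - in_batch).max()
```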
Source code / logs
Running in pdb, this can be reproduced with a fresh checkout. I originally found the problem while implementing batch inference in predict.py, but doing this in train.py is the quickest way for you to reproduce it.
(Pdb) break train.py:197
...
(Pdb) output_image_last = sess.run(network,feed_dict={net_input:np.expand_dims(input_image, axis=0)})
(Pdb) output_images = sess.run(network,feed_dict={net_input:input_image_batch})
(Pdb) (input_image - input_image_batch[3]).max()
0.0
(Pdb) (output_image_last - output_images[3]).max()
1.0644385
The following is another set of commands, run from the same breakpoint at line 197, that you can copy-paste quickly; data augmentation must be removed first. These commands build batches of size 2 and 4 inside pdb and check that the same input images produce different outputs depending on batch size.
output_image_last_alone = sess.run(network,feed_dict={net_input:np.expand_dims(input_image, axis=0)})
output_images_orig4 = sess.run(network,feed_dict={net_input:input_image_batch})
# Manually build a batch of size 2 (the last two images of the current batch).
input_image_batch_manual2 = []
index = i * args.batch_size + j-1
id = id_list[index]
input_image2 = utils.load_image(train_input_names[id])
output_image2 = utils.load_image(train_output_names[id])
index = i * args.batch_size + j
id = id_list[index]
input_image3 = utils.load_image(train_input_names[id])
output_image3 = utils.load_image(train_output_names[id])
input_image2, output_image2 = data_augmentation(input_image2, output_image2)
input_image3, output_image3 = data_augmentation(input_image3, output_image3)
input_image2 = np.float32(input_image2) / 255.0
input_image3 = np.float32(input_image3) / 255.0
input_image_batch_manual2.append(np.expand_dims(input_image2, axis=0))
input_image_batch_manual2.append(np.expand_dims(input_image3, axis=0))
input_image_batch_manual2 = np.squeeze(np.stack(input_image_batch_manual2, axis=1))
output_images_batch2 = sess.run(network,feed_dict={net_input:input_image_batch_manual2})
# Manually build a batch of size 4 (the same four images as the original batch).
input_image_batch_manual4 = []
index = i * args.batch_size + j-3
id = id_list[index]
input_image0 = utils.load_image(train_input_names[id])
output_image0 = utils.load_image(train_output_names[id])
index = i * args.batch_size + j-2
id = id_list[index]
input_image1 = utils.load_image(train_input_names[id])
output_image1 = utils.load_image(train_output_names[id])
input_image0, output_image0 = data_augmentation(input_image0, output_image0)
input_image1, output_image1 = data_augmentation(input_image1, output_image1)
input_image0 = np.float32(input_image0) / 255.0
input_image1 = np.float32(input_image1) / 255.0
input_image_batch_manual4.append(np.expand_dims(input_image0, axis=0))
input_image_batch_manual4.append(np.expand_dims(input_image1, axis=0))
index = i * args.batch_size + j-1
id = id_list[index]
input_image2 = utils.load_image(train_input_names[id])
output_image2 = utils.load_image(train_output_names[id])
index = i * args.batch_size + j
id = id_list[index]
input_image3 = utils.load_image(train_input_names[id])
output_image3 = utils.load_image(train_output_names[id])
input_image2, output_image2 = data_augmentation(input_image2, output_image2)
input_image3, output_image3 = data_augmentation(input_image3, output_image3)
input_image2 = np.float32(input_image2) / 255.0
input_image3 = np.float32(input_image3) / 255.0
input_image_batch_manual4.append(np.expand_dims(input_image2, axis=0))
input_image_batch_manual4.append(np.expand_dims(input_image3, axis=0))
input_image_batch_manual4 = np.squeeze(np.stack(input_image_batch_manual4, axis=1))
output_images_batch4 = sess.run(network,feed_dict={net_input:input_image_batch_manual4})
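The repetitive batch assembly above could be condensed into a loop. Here is a self-contained sketch, with a hypothetical deterministic stub standing in for utils.load_image so the pattern can be run anywhere:

```python
import numpy as np

def load_image_stub(name):
    # Hypothetical stand-in for utils.load_image: a deterministic fake
    # image seeded by the file name.
    rng = np.random.RandomState(sum(name.encode()))
    return rng.randint(0, 256, size=(352, 480, 3)).astype(np.uint8)

def build_batch(names):
    # Load, scale to float32 in [0, 1], and stack into an (N, H, W, C) batch.
    return np.stack([np.float32(load_image_stub(n)) / 255.0 for n in names],
                    axis=0)

batch2 = build_batch(["img_2.png", "img_3.png"])
batch4 = build_batch(["img_0.png", "img_1.png", "img_2.png", "img_3.png"])
```

The same images land at batch2[1] and batch4[3], mirroring the comparisons below.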
(input_image - input_image_batch[3]).max() #input image is the 4th image in the batch
(input_image - input_image_batch_manual2[1]).max() #input image is the 2nd image in this manually loaded batch loaded in pdb
(input_image - input_image_batch_manual4[3]).max() #input image is the 4th image in this manually loaded batch loaded in pdb
(output_image_last_alone - output_images_orig4[3]).max() #the single batch run produces a different output
(output_image_last_alone - output_images_batch2[1]).max() #the single batch run produces a different output
(output_image_last_alone - output_images_batch4[3]).max() #the single batch run produces a different output
(output_images_batch2[1] - output_images_batch4[3]).max() #batch size 2 produces different output than batch size 4
(output_images_orig4 - output_images_batch4).max() #the manually loaded batch produces the same output as the original batch of the same size
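All of the checks above follow the same pattern; a small helper (sketch, using the absolute difference so negative deviations are not missed by .max()) makes the intent explicit:

```python
import numpy as np

def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two arrays."""
    return float(np.abs(np.asarray(a) - np.asarray(b)).max())

# e.g. max_abs_diff(output_image_last_alone, output_images_orig4[3])
# would be ~0 if batch size did not matter, but is ~1.06 here.
```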