This repository has been archived by the owner on Aug 10, 2023. It is now read-only.

Astronet Walkthrough Prediction - Restoring From Checkpoint Failed #7

Open
ianfowler opened this issue Dec 1, 2019 · 2 comments

Comments

@ianfowler

ianfowler commented Dec 1, 2019

When following the instructions to use a trained Astronet model to generate predictions:

# Generate a prediction for a new TCE.
bazel-bin/astronet/predict \
  --model=AstroCNNModel \
  --config_name=local_global \
  --model_dir=${MODEL_DIR} \
  --kepler_data_dir=${KEPLER_DATA_DIR} \
  --kepler_id=11442793 \
  --period=14.44912 \
  --t0=2.2 \
  --duration=0.11267 \
  --output_image_file="${HOME}/astronet/kepler-90i.png"

The following error occurs:

I1201 10:39:14.347496 140735978513280 saver.py:1280] Restoring parameters from /Users/ian/exoplanet-ml/astronet/model/model.ckpt-625
2019-12-01 10:39:14.439242: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Invalid argument: tensor_name = global_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = logits/bias; expected dtype double does not equal original dtype float
tensor_name = logits/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/kernel; expected dtype double does not equal original dtype float
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: tensor_name = global_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = logits/bias; expected dtype double does not equal original dtype float
tensor_name = logits/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/kernel; expected dtype double does not equal original dtype float
	 [[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: tensor_name = global_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = logits/bias; expected dtype double does not equal original dtype float
tensor_name = logits/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/kernel; expected dtype double does not equal original dtype float
	 [[node save/RestoreV2 (defined at Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py:175) ]]

Original stack trace for 'save/RestoreV2':
  File "Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 183, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "usr/local/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "usr/local/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 175, in main
    for predictions in estimator.predict(input_fn):
  File "usr/local/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 635, in predict
    hooks=all_hooks) as mon_sess:
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1007, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1200, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1205, in _create_session
    return self._sess_creator.create_session()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 871, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 638, in create_session
    self._scaffold.finalize()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 229, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 599, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 825, in __init__
    self.build()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 837, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 875, in _build
    build_restore=build_restore)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 183, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 175, in main
    for predictions in estimator.predict(input_fn):
  File "/usr/local/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 635, in predict
    hooks=all_hooks) as mon_sess:
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1007, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1200, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1205, in _create_session
    return self._sess_creator.create_session()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 871, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 647, in create_session
    init_fn=self._scaffold.init_fn)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/session_manager.py", line 290, in prepare_session
    config=config)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/session_manager.py", line 204, in _restore_checkpoint
    saver.restore(sess, checkpoint_filename_with_path)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 1322, in restore
    err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

tensor_name = global_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_3/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_4/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = global_view_hidden/block_5/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_1/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_1/kernel; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/bias; expected dtype double does not equal original dtype float
tensor_name = local_view_hidden/block_2/conv_2/kernel; expected dtype double does not equal original dtype float
tensor_name = logits/bias; expected dtype double does not equal original dtype float
tensor_name = logits/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_1/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_2/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_3/kernel; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/bias; expected dtype double does not equal original dtype float
tensor_name = pre_logits_hidden/fully_connected_4/kernel; expected dtype double does not equal original dtype float
	 [[node save/RestoreV2 (defined at Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py:175) ]]

Original stack trace for 'save/RestoreV2':
  File "Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 183, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "usr/local/lib/python3.7/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "usr/local/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/predict.py", line 175, in main
    for predictions in estimator.predict(input_fn):
  File "usr/local/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 635, in predict
    hooks=all_hooks) as mon_sess:
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1007, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 725, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1200, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 1205, in _create_session
    return self._sess_creator.create_session()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 871, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 638, in create_session
    self._scaffold.finalize()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/monitored_session.py", line 229, in finalize
    self._saver = training_saver._get_saver_or_default()  # pylint: disable=protected-access
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 599, in _get_saver_or_default
    saver = Saver(sharded=True, allow_empty=True)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 825, in __init__
    self.build()
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 837, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 875, in _build
    build_restore=build_restore)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 502, in _build_internal
    restore_sequentially, reshape)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 381, in _AddShardedRestoreOps
    name="restore_shard"))
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 328, in _AddRestoreOps
    restore_sequentially)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/training/saver.py", line 575, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1696, in restore_v2
    name=name)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "usr/local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

The error message (buried in the output above) suggests that the directory passed as the model directory doesn't match where the checkpoint is stored. I worked around it by changing the path:

# Use the base Astronet directory as the model directory instead.
BASE_DIR="${HOME}/astronet/"

# Generate a prediction for a new TCE.
bazel-bin/astronet/predict \
  --model=AstroCNNModel \
  --config_name=local_global \
  --model_dir=${BASE_DIR} \
  --kepler_data_dir=${KEPLER_DATA_DIR} \
  --kepler_id=11442793 \
  --period=14.44912 \
  --t0=2.2 \
  --duration=0.11267 \
  --output_image_file="${HOME}/astronet/kepler-90i.png"

With this change the command runs, but the prediction is slightly different on every run and hovers around 0.5 instead of matching the walkthrough's expected 0.9480018. Three trials gave 0.5015401002407824, 0.49691700167065095, and 0.49994740353818445. I would expect neither the run-to-run variation nor a result about 0.45 below the demo's. Is this caused by my workaround for the first problem (changing the model path), or is it an entirely different issue?

@ianfowler
Author

I figured out why the prediction is wrong:

I1201 00:19:08.316953 140735978513280 estimator.py:612] Could not find trained model in model_dir: /Users/ian/exoplanet-ml/astronet, running initialization to predict.
I1201 00:19:08.334474 140735978513280 estimator.py:1145] Calling model_fn.
W1201 00:19:08.335051 140735978513280 deprecation_wrapper.py:119] From /Users/ian/exoplanet-ml/bazel-bin/astronet/predict.runfiles/__main__/astronet/astro_model/astro_model.py:303: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

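Pointing the model directory one level up meant the estimator could not find the trained model, so it ran initialization and predicted without the trained weights.

I have now confirmed that the correct files are in the directory given by MODEL_DIR: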
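
The log line "running initialization to predict" would explain the values near 0.5. Here is a minimal sketch of why, assuming the model's final output is a sigmoid over a single logit and that freshly initialized weights yield logits close to zero. The helper names, the noise scale of 0.01, and the seed are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def untrained_predictions(num_trials=3, scale=0.01, seed=0):
    """Simulate predictions from near-zero logits (assumed stand-in for an untrained model)."""
    rng = random.Random(seed)
    return [sigmoid(rng.gauss(0.0, scale)) for _ in range(num_trials)]


predictions = untrained_predictions()
for p in predictions:
    # Each value lands close to 0.5 because sigmoid(0) == 0.5.
    print(p)
```

Under these assumptions, every run with a different random initialization gives a slightly different value near 0.5, which matches the three trials reported above.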

exoplanet-ml ian$ ls $MODEL_DIR
checkpoint				eval_test				events.out.tfevents.1575219078.bear-mbp	model.ckpt-625.data-00000-of-00001	model.ckpt-625.meta
config.json				eval_val				graph.pbtxt				model.ckpt-625.index

Given that, why would restoring from the checkpoint fail?
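
For anyone who wants to script the same check as the ls above, here is a rough sketch using only the standard library. It assumes checkpoint files follow the `model.ckpt-<step>.index` naming seen in the listing; `find_checkpoint_steps` is a hypothetical helper, not part of Astronet or TensorFlow.

```python
import os
import re
import tempfile


def find_checkpoint_steps(model_dir):
    """Return the sorted global steps that have a model.ckpt-<step>.index file."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in os.listdir(model_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)


# Demo on a temporary directory that mimics the listing above.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("checkpoint",
                 "model.ckpt-625.index",
                 "model.ckpt-625.meta",
                 "model.ckpt-625.data-00000-of-00001"):
        open(os.path.join(demo_dir, name), "w").close()

    # A directory without checkpoint files, like the wrong model_dir earlier.
    empty_dir = os.path.join(demo_dir, "empty")
    os.mkdir(empty_dir)

    steps_found = find_checkpoint_steps(demo_dir)
    empty_steps = find_checkpoint_steps(empty_dir)

print(steps_found)
print(empty_steps)
```

An empty result for the directory passed as model_dir would correspond to the "Could not find trained model" message in the log.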

@caitlynlee

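The original error isn't caused by the model being missing from the given model directory. As you said in your second post, the model is in the right place and the model_dir argument is correct. The issue is a data type mismatch between tf.float32 and tf.float64: the variables in the checkpoint were saved as float ("original dtype float"), but the graph built for prediction expects double ("expected dtype double").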

In astronet/predict.py you just need to cast the features to np.float32 instead of leaving them as doubles (float64, NumPy's default floating-point type).

# astronet/predict.py, line 115
global_view = preprocess.global_view(time, flux, FLAGS.period).astype(np.float32)
...
# astronet/predict.py, line 120
local_view = preprocess.local_view(time, flux, FLAGS.period, FLAGS.duration).astype(np.float32)
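
As a small illustration of the mismatch (a sketch, not the code from predict.py): NumPy creates float64 arrays by default, and `.astype(np.float32)` produces a copy in the single-precision type. The `to_float32` helper and the stand-in array below are hypothetical.

```python
import numpy as np


def to_float32(view):
    """Cast an array-like feature to float32."""
    return np.asarray(view).astype(np.float32)


# Stand-in for a preprocessed light-curve view; linspace defaults to float64.
global_view = np.linspace(-1.0, 0.0, num=5)
print(global_view.dtype)  # float64

casted = to_float32(global_view)
print(casted.dtype)  # float32
```

Printing the dtype of each feature before it is handed to the estimator is a quick way to confirm whether the cast is needed.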

ritwik12 added a commit to ritwik12/exoplanet-ml that referenced this issue Feb 24, 2020
As per @caitlynlee in google-research#7, generating predictions fails with errors for various tensors such as `expected dtype double does not equal original dtype float`. Forcing the tensors to be floats instead of doubles in `astronet/predict.py` solves the issue.

```
# astronet/predict.py, line 115
global_view = preprocess.global_view(time, flux, FLAGS.period).astype(np.float32)
...
# astronet/predict.py, line 120
local_view = preprocess.local_view(time, flux, FLAGS.period, FLAGS.duration).astype(np.float32)
```