Skip to content

Latest commit

 

History

History
136 lines (84 loc) · 3.64 KB

整体框架.md

File metadata and controls

136 lines (84 loc) · 3.64 KB

参数设置

tf.app.flags.DEFINE_float()

tf.app.flags.DEFINE_bool()

训练数据导入

tensorflow数据读取

网络框架

  • 卷积等等,求损失。。。
  • 模型存储
  • 可视化
  • 创建全局步数 global_step = slim.create_global_step()

反向传播,参数更新

==重点== : 如果用到了btach_normal ,需要

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

然后进行dependencies:

from tensorflow.python.ops import control_flow_ops
train_op = control_flow_ops.with_dependencies([update_ops],total_loss,nmae = "train_op")

tf.losses 类: 会将计算的损失自动的加入到tf.GraphKeys.LOSSES集合中,可以通过tf.get_collection(tf.GraphKeys.LOSSES)获得


  1. 搜集损失

    all_loss = []
    
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    regularization_losses = tf.add_n(regularization_losses.name="regularization_losses")
    all_loss.append(regularization_losses)
    
    loss = tf.get_collection(tf.GraphKeys.LOSSES)
    loss = tf.add_n(clone_losses, name='clone_loss')
    all_loss.append(loss)
    
    sum_loss = tf.add_n(all_losses)
    
    

2.计算梯度

​ 计算学习率

tensorflow学习率变化的方法

选择一个优化器  例如:`optimizer= tf.train.GradientDescentOptimizer()`

​ 搜集所有可以训练的变量:

​ 如果要训练所有变量 :variables_to_train = tf.trainable_variables()

​ 如果要训练部分scope下的变量:

scopes = [scope.strip() for scope in flags.trainable_scopes.split(',')]

variables_to_train = []

for scope in scopes:

variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope)

variables_to_train.extend(variables)

​ 计算梯度:grad = optimizer.compute_gradients(sum_loss,variables_to_train)

  1. 更新参数:

    grad_updates = optimizer.apply_gradients(clones_gradients,
                                             global_step=global_step)
    
  2. 加到update_ops 中:update_ops.append(grad_updates)

  3. update_op = tf.group(*update_ops)
    train_tensor = control_flow_ops.with_dependencies([update_op], total_loss,
                                                      name='train_op')
    
  4. 设置GPU信息等:

    # gpu_memory_fraction gpu占用比例
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=FLAGS.gpu_memory_fraction)
    
    # log_device_placement:是否打印GPU日志信息
    config = tf.ConfigProto(log_device_placement=False,gpu_options=gpu_options)
    
    saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=1.0,
                                          write_version=2,
                                          pad_step_number=False)
                                          
    
  5. 开始训练:

    slim.learning.train(
                train_tensor, 
                logdir=FLAGS.train_dir,# 参数存储地址
                master='',
                is_chief=True,
                init_fn=init_fn,# 参数初始化方式,一般默认为None,如果需要进行fine tune的化,需要先进						    # 行参数初始化,具体见模型存储与导入
                summary_op=summary_op,
                number_of_steps=FLAGS.max_number_of_steps,
                log_every_n_steps=FLAGS.log_every_n_steps,
                save_summaries_secs=FLAGS.save_summaries_secs,
                saver=saver,
                save_interval_secs=FLAGS.save_interval_secs,
                session_config=config,
                sync_optimizer=None)