Weights not found #4

Open
bosslv opened this issue Apr 19, 2024 · 1 comment
Comments

bosslv commented Apr 19, 2024

I don't have Slurm, so I changed the sh file to the following:
#!/bin/env bash

# Set the visible GPU devices (for example, this setting uses only the first GPU)

export CUDA_VISIBLE_DEVICES=0,1

# Environment setup

source ~/.bashrc

# Set the date and time

dt=$(date '+%Y%m%d_%H%M%S')

# Define parameters

dataset="csqa"
model='roberta-large'
elr="1e-5"
dlr="1e-3"
bs=64
mbs=2
n_epochs=30
num_relation=38
k=5 #num of gnn layers
gnndim=200

# Print the hyperparameters

echo "***** hyperparameters *****"
echo "dataset: $dataset"
echo "enc_name: $model"
echo "batch_size: $bs"
echo "learning_rate: elr $elr dlr $dlr"
echo "gnn: dim $gnndim layer $k"
echo "
*************************"

# Set the directories for saved models and logs

save_dir_pref='saved_models'
mkdir -p $save_dir_pref
mkdir -p logs

# Run the script in a loop

for seed in 0; do
  python3 -u jointlk.py --dataset $dataset \
    --encoder $model -k $k --gnn_dim $gnndim -elr $elr -dlr $dlr -bs $bs -mbs $mbs --seed $seed \
    --num_relation $num_relation \
    --n_epochs $n_epochs --max_epochs_before_stop 10 \
    --train_adj data/${dataset}/graph/train.graph.adj.pk \
    --dev_adj data/${dataset}/graph/dev.graph.adj.pk \
    --test_adj data/${dataset}/graph/test.graph.adj.pk \
    --train_statements data/${dataset}/statement/train.statement.jsonl \
    --dev_statements data/${dataset}/statement/dev.statement.jsonl \
    --test_statements data/${dataset}/statement/test.statement.jsonl \
    --save_model \
    --save_dir ${save_dir_pref}/${dataset}/enc-${model}__k${k}__gnndim${gnndim}__bs${bs}seed${seed}${dt} $args \
    > logs/train_${dataset}__enc-${model}__k${k}__gnndim${gnndim}__bs${bs}seed${seed}${dt}.log.txt
done
done
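After running this script, it reported that the roberta-large weight file was missing. I downloaded the pytorch-bin file from Hugging Face and added --load_model_path, and now it shows "too many values to unpack".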
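One possible reading of the "too many values to unpack" error, offered as a sketch and not a confirmed diagnosis: if the loader behind --load_model_path expects a checkpoint written by the training script itself (for example a pair of model weights and saved arguments), then a raw Hugging Face weights file, which is a single flat mapping of parameter names to tensors, would fail at the unpacking step. The two-value unpacking in `unpack_checkpoint` is an assumption that has not been checked against jointlk.py, and the parameter names are illustrative placeholders.

```python
from collections import OrderedDict

# Placeholder for a raw Hugging Face weights file: one flat mapping from
# parameter names to tensors (values are None here so torch is not needed).
hf_state_dict = OrderedDict(
    (name, None)
    for name in (
        "embeddings.word_embeddings.weight",
        "encoder.layer.0.attention.self.query.weight",
        "pooler.dense.weight",
    )
)

# Placeholder for a checkpoint saved by a training script as two objects.
training_checkpoint = [{"decoder.weight": None}, {"seed": 0}]


def unpack_checkpoint(obj):
    # ASSUMPTION: the loader does something like
    #   model_state_dict, old_args = torch.load(path)
    # This has not been verified against jointlk.py.
    model_state_dict, old_args = obj
    return model_state_dict, old_args


def describe(obj):
    """Return 'ok' if obj unpacks into two values, else the error message."""
    try:
        unpack_checkpoint(obj)
        return "ok"
    except ValueError as exc:
        return str(exc)


print(describe(training_checkpoint))  # unpacks cleanly
print(describe(hf_state_dict))        # iterating the dict yields more than two keys
```

If this is what is happening, --load_model_path would be meant for resuming from a checkpoint produced by an earlier training run, and the pretrained encoder weights would need to be supplied some other way.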

@chit-ang

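Have you solved this yet? I also downloaded the roberta-large model from Hugging Face, but the training results are very poor. Do you know what might be causing that?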
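One thing worth ruling out when results are poor after switching to locally downloaded weights is that the weights were never actually applied, for instance because the parameter names in the file carry a prefix the model does not expect and the load was non-strict. The sketch below compares the two sets of names. `compare_keys` is a hypothetical helper and the key names are illustrative, so treat it as a diagnostic idea, not as a description of how this repository loads its encoder.

```python
def compare_keys(model_keys, checkpoint_keys, prefix=""):
    """Compare parameter names, optionally stripping a prefix from checkpoint keys."""
    model_keys = set(model_keys)
    stripped = {
        key[len(prefix):] if prefix and key.startswith(prefix) else key
        for key in checkpoint_keys
    }
    return {
        "matched": len(model_keys & stripped),
        "missing": sorted(model_keys - stripped),      # model params with no weights
        "unexpected": sorted(stripped - model_keys),   # weights the model never uses
    }


# Illustrative names only: what an encoder might expect ...
model_keys = [
    "embeddings.word_embeddings.weight",
    "encoder.layer.0.output.dense.bias",
]
# ... versus names in a downloaded file that carry an extra prefix and head.
ckpt_keys = [
    "roberta.embeddings.word_embeddings.weight",
    "roberta.encoder.layer.0.output.dense.bias",
    "lm_head.bias",
]

print(compare_keys(model_keys, ckpt_keys))                      # nothing matches
print(compare_keys(model_keys, ckpt_keys, prefix="roberta."))   # both encoder params match
```

With PyTorch available, the two lists could come from `model.state_dict().keys()` and from the keys of the object returned by `torch.load(path, map_location="cpu")`. A large "missing" list would suggest the encoder is training from random initialization.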
