Describe the bug
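Training DiffRec or LDiffRec on MovieLens-100K crashes with the exception "RuntimeError: shape mismatch: value tensor of shape [4040, 4040] cannot be broadcast to indexing result of shape [4040]". The log below is from an LDiffRec run; the crash occurs in the first validation pass, after epoch 4 (eval_step = 5).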
CUDA available: True
command line args [--data_set_name MovieLens-100K --model_name LDiffRec] will not be used in RecBole
24 Jan 15:52 INFO
General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 42
state = INFO
reproducibility = True
data_path = ./data_sets/MovieLens-100K
checkpoint_dir = ./data_sets/MovieLens-100K/recbole_checkpoints/
show_progress = True
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False
Training Hyper Parameters:
epochs = 50
train_batch_size = 2048
learner = adam
learning_rate = 0.001
train_neg_sample_args = {'distribution': 'uniform', 'sample_num': 1, 'alpha': 1.0, 'dynamic': False, 'candidate_num': 0}
eval_step = 5
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4
Evaluation Hyper Parameters:
eval_args = {'split': {'LS': 'valid_and_test'}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'uni100', 'test': 'uni100'}}
repeatable = False
metrics = ['Recall', 'MRR', 'NDCG', 'Hit', 'MAP', 'Precision', 'GAUC', 'ItemCoverage', 'AveragePopularity', 'GiniIndex', 'ShannonEntropy', 'TailPercentage']
topk = [1, 3, 5, 10, 20]
valid_metric = NDCG@10
valid_metric_bigger = True
eval_batch_size = 4096
metric_decimal_place = 4
Dataset Hyper Parameters:
field_separator =
seq_separator =
USER_ID_FIELD = user_id
ITEM_ID_FIELD = item_id
RATING_FIELD = rating
TIME_FIELD = timestamp
seq_len = {}
LABEL_FIELD = label
threshold = None
NEG_PREFIX = neg_
load_col = {'inter': ['user_id', 'item_id', 'rating']}
unload_col = {}
unused_col = {}
additional_feat_suffix = []
rm_dup_inter = None
val_interval = {}
filter_inter_by_user_or_item = True
user_inter_num_interval = [0, inf)
item_inter_num_interval = [0, inf)
alias_of_user_id = None
alias_of_item_id = None
alias_of_entity_id = None
alias_of_relation_id = None
preload_weight = {}
normalize_field = []
normalize_all = False
ITEM_LIST_LENGTH_FIELD = item_length
LIST_SUFFIX = _list
MAX_ITEM_LIST_LENGTH = 50
POSITION_FIELD = position_id
HEAD_ENTITY_ID_FIELD = head_id
TAIL_ENTITY_ID_FIELD = tail_id
RELATION_ID_FIELD = relation_id
ENTITY_ID_FIELD = entity_id
benchmark_filename = None
Other Hyper Parameters:
worker = 0
wandb_project = recbole
shuffle = True
require_pow = False
enable_amp = False
enable_scaler = False
transform = None
n_cate = 1
reparam = True
in_dims = [300]
out_dims = []
ae_act_func = tanh
lamda = 0.03
anneal_cap = 0.005
anneal_steps = 1000
vae_anneal_cap = 0.3
vae_anneal_steps = 200
noise_schedule = linear
noise_scale = 0.1
noise_min = 0.001
noise_max = 0.005
sampling_noise = False
sampling_steps = 0
reweight = True
mean_type = x0
steps = 5
history_num_per_term = 10
beta_fixed = True
dims_dnn = [300]
embedding_size = 10
mlp_act_func = tanh
time-aware = False
w_max = 1
w_min = 0.1
numerical_features = []
discretization = None
kg_reverse_r = False
entity_kg_num_interval = [0, inf)
relation_kg_num_interval = [0, inf)
MODEL_TYPE = ModelType.GENERAL
encoding = utf-8
training_neg_sample_args = {'distribution': 'uniform', 'sample_num': 1, 'dynamic': False, 'candidate_num': 0}
MODEL_INPUT_TYPE = InputType.LISTWISE
eval_type = EvaluatorType.RANKING
single_spec = True
local_rank = 0
device = cuda
valid_neg_sample_args = {'distribution': 'uniform', 'sample_num': 100}
test_neg_sample_args = {'distribution': 'uniform', 'sample_num': 100}
24 Jan 15:52 INFO MovieLens-100K
The number of users: 944
Average actions of users: 106.04453870625663
The number of items: 1683
Average actions of items: 59.45303210463734
The number of inters: 100000
The sparsity of the dataset: 93.70575143257098%
Remain Fields: ['user_id', 'item_id', 'rating']
24 Jan 15:52 INFO [Training]: train_batch_size = [2048] train_neg_sample_args: [{'distribution': 'uniform', 'sample_num': 1, 'alpha': 1.0, 'dynamic': False, 'candidate_num': 0}]
24 Jan 15:52 INFO [Evaluation]: eval_batch_size = [4096] eval_args: [{'split': {'LS': 'valid_and_test'}, 'order': 'RO', 'group_by': 'user', 'mode': {'valid': 'uni100', 'test': 'uni100'}}]
24 Jan 15:52 WARNING Max value of users history interaction records has reached 43.672014260249554% of the total.
24 Jan 15:52 INFO LDiffRec(
(mlp): DNN(
(emb_layer): Linear(in_features=10, out_features=10, bias=True)
(mlp_layers): MLPLayers(
(mlp_layers): Sequential(
(0): Dropout(p=0, inplace=False)
(1): Linear(in_features=310, out_features=300, bias=True)
(2): Tanh()
(3): Dropout(p=0, inplace=False)
(4): Linear(in_features=300, out_features=300, bias=True)
)
)
(drop): Dropout(p=0.5, inplace=False)
)
(autoencoder): AutoEncoder(
(dropout): Dropout(p=0.1, inplace=False)
(encoder): MLPLayers(
(mlp_layers): Sequential(
(0): Dropout(p=0.0, inplace=False)
(1): Linear(in_features=1683, out_features=600, bias=True)
(2): Tanh()
)
)
(decoder): MLPLayers(
(mlp_layers): Sequential(
(0): Dropout(p=0.0, inplace=False)
(1): Linear(in_features=300, out_features=1683, bias=True)
)
)
)
)
Trainable parameters: 1700693
24 Jan 15:52 INFO epoch 0 training [time: 2.65s, train loss: 1853.4353]
24 Jan 15:52 INFO epoch 1 training [time: 0.18s, train loss: 1684.0792]
24 Jan 15:52 INFO epoch 2 training [time: 0.14s, train loss: 1610.4366]
24 Jan 15:52 INFO epoch 3 training [time: 0.13s, train loss: 1545.5997]
24 Jan 15:52 INFO epoch 4 training [time: 0.14s, train loss: 1487.6795]
Traceback (most recent call last):
File "/mnt/./run_recbole_test.py", line 158, in <module>
best_valid_score, best_valid_result = trainer.fit(train_data, valid_data)
File "/usr/local/lib/python3.10/site-packages/recbole/trainer/trainer.py", line 464, in fit
valid_score, valid_result = self._valid_epoch(
File "/usr/local/lib/python3.10/site-packages/recbole/trainer/trainer.py", line 283, in _valid_epoch
valid_result = self.evaluate(
File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/recbole/trainer/trainer.py", line 616, in evaluate
interaction, scores, positive_u, positive_i = eval_func(batched_data)
File "/usr/local/lib/python3.10/site-packages/recbole/trainer/trainer.py", line 558, in _neg_sample_batch_eval
scores[row_idx, col_idx] = origin_scores
RuntimeError: shape mismatch: value tensor of shape [4040, 4040] cannot be broadcast to indexing result of shape [4040]
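The failing assignment in the traceback can be reproduced in isolation. The sketch below is not RecBole code: the tensor names other than `scores`, `row_idx`, `col_idx` are made up for illustration, the sizes are small stand-ins, and the `x[:, item]` indexing is only an assumption about how a `[N, N]` tensor could come out of a model's `predict`. It also checks that 4040 is consistent with `eval_batch_size = 4096` and `uni100`, assuming each user contributes 1 positive plus 100 sampled negative rows.

```python
import torch

# Assumption: 'uni100' yields 1 positive + 100 negatives per user.
rows_per_user = 1 + 100
users_per_batch = 4096 // rows_per_user        # eval_batch_size from the log
batch_rows = users_per_batch * rows_per_user   # 40 * 101

# Small stand-in sizes so the sketch runs quickly.
n_users, n_items, n_pairs = 3, 5, 6
all_item_scores = torch.rand(n_pairs, n_items)  # hypothetical: all-item scores per (user, item) row
item = torch.tensor([0, 2, 4, 1, 3, 0])         # candidate item of each row
row_idx = torch.tensor([0, 0, 1, 1, 2, 2])      # user slot of each row
col_idx = item.clone()

# Selecting columns with a slice produces one score per (row, candidate) combination.
bad = all_item_scores[:, item]                       # shape [6, 6]
# Pairing each row with its own candidate produces one score per row.
good = all_item_scores[torch.arange(n_pairs), item]  # shape [6]

scores = torch.full((n_users, n_items), float("-inf"))
scores[row_idx, col_idx] = good  # works: 6 values for 6 indexed positions

error_message = ""
try:
    scores[row_idx, col_idx] = bad  # same pattern as trainer.py line 558
except (RuntimeError, IndexError) as exc:
    error_message = str(exc)

print(batch_rows, tuple(bad.shape), tuple(good.shape))
print(error_message)
```

If this reading is right, the evaluation code expects `predict` to return one score per (user, item) row, while the value it received is square in the batch dimension.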
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The DiffRec and LDiffRec models should train and evaluate on the MovieLens-100K dataset without crashing.
Desktop (please complete the following information):