
MemoryError: Unable to allocate 168. GiB for an array with shape (76821, 542, 542) and data type float64 #17

Open
Al-Dailami opened this issue Feb 22, 2021 · 6 comments


@Al-Dailami

loading training set
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 76821/76821 [02:21<00:00, 543.88it/s]
Traceback (most recent call last):
  File "train.py", line 39, in <module>
    train_adj, train_mask = preprocess_adj(train_adj)
  File "~/TextING/utils.py", line 153, in preprocess_adj
    return np.array(list(adj)), mask  # coo_to_tuple(sparse.COO(np.array(list(adj)))), mask
MemoryError: Unable to allocate 168. GiB for an array with shape (76821, 542, 542) and data type float64
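(As a quick sanity check, the requested allocation is exactly a dense float64 array of that shape: 76821 × 542 × 542 elements at 8 bytes each is about 168 GiB.)

n_graphs, max_nodes = 76821, 542
print(n_graphs * max_nodes * max_nodes * 8 / 2**30)  # float64 = 8 bytes -> ~168.1 GiB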

@Al-Dailami
Author

Hello

Can you help me fix this problem?

@Magicat128
Collaborator

Hi @Al-Dailami

Which dataset are you using? You may try processing the training samples in batches and concatenating the results with NumPy (see the sketch below).
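Here is a minimal, runnable version of that idea, using a simplified stand-in for the repo's preprocess_adj (the pad-to-max-size-and-mask behavior is inferred from the error above, not copied from the code):

import numpy as np

def preprocess_adj_batch(adj_list):
    # Pad a small batch of variable-size adjacency matrices to a common
    # size and build a node mask, instead of densifying all 76k graphs at once.
    max_n = max(a.shape[0] for a in adj_list)
    batch = np.zeros((len(adj_list), max_n, max_n), dtype=np.float32)
    mask = np.zeros((len(adj_list), max_n, 1), dtype=np.float32)
    for i, a in enumerate(adj_list):
        n = a.shape[0]
        batch[i, :n, :n] = a
        mask[i, :n, 0] = 1.0
    return batch, mask

# Toy usage: five graphs with 3..7 nodes each
graphs = [np.eye(n, dtype=np.float32) for n in range(3, 8)]
b, m = preprocess_adj_batch(graphs)
print(b.shape, m.shape)  # (5, 7, 7) (5, 7, 1)

Using float32 instead of float64 also halves the memory per batch.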

@Al-Dailami
Author

Thanks a lot for your reply.

I'm working with a dataset that contains around 500,000 records of short texts. Could you please advise how to modify the code to process the data in batches?

Thanks a lot in advance for your valuable help.

@Al-Dailami
Author

Al-Dailami commented Feb 24, 2021

Hello,
I have modified the trainer to process the data batch by batch.
Is this the right way?

# Construct feed dictionary

b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
b_train_feature = preprocess_features(train_feature[idx])
feed_dict = construct_feed_dict(b_train_feature, b_train_adj, b_train_mask, train_y[idx], placeholders)
feed_dict.update({placeholders['dropout']: FLAGS.dropout})

@Magicat128
Collaborator

> Hello,
> I have modified the trainer to process the data batch by batch.
> Is this the right way?
>
> # Construct feed dictionary
> b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
> b_train_feature = preprocess_features(train_feature[idx])
> feed_dict = construct_feed_dict(b_train_feature, b_train_adj, train_mask, train_y[idx], placeholders)
> feed_dict.update({placeholders['dropout']: FLAGS.dropout})

@Al-Dailami
Yes, that's the right way to do it. And it should be your b_train_mask in the feed_dict rather than train_mask :)
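For completeness, a sketch of those lines folded into a batch loop (batch_size, sess, and model.opt_op are assumptions about the surrounding TF1-style trainer, not code from the repo):

for start in range(0, len(train_y), batch_size):
    idx = slice(start, start + batch_size)
    b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
    b_train_feature = preprocess_features(train_feature[idx])
    feed_dict = construct_feed_dict(b_train_feature, b_train_adj,
                                    b_train_mask,  # the batch mask, not train_mask
                                    train_y[idx], placeholders)
    feed_dict.update({placeholders['dropout']: FLAGS.dropout})
    sess.run(model.opt_op, feed_dict=feed_dict)  # assumed training op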

@bp20200202

> Hello,
> I have modified the trainer to process the data batch by batch.
> Is this the right way?
>
> # Construct feed dictionary
> b_train_adj, b_train_mask = preprocess_adj(train_adj[idx])
> b_train_feature = preprocess_features(train_feature[idx])
> feed_dict = construct_feed_dict(b_train_feature, b_train_adj, b_train_mask, train_y[idx], placeholders)
> feed_dict.update({placeholders['dropout']: FLAGS.dropout})

Hello, I changed the code as you suggested, but I am still getting a MemoryError. Have you ever experienced this situation?
