munmap_chunk(): invalid pointer \n segmentation fault (core dump), when feature dim is big in index.train #4038

Closed
2 tasks
cywuuuu opened this issue Nov 20, 2024 · 7 comments

Comments

@cywuuuu

cywuuuu commented Nov 20, 2024

Summary

I am trying to run the baseline model VQREC (pq.py). I hit munmap_chunk(): invalid pointer followed by a segmentation fault (core dumped) when feat_size is 1024 or bigger, but feat_size = 768 works fine. I minimized the code below to help reproduce the issue. (I am new to faiss, but I would like to know how to fix it, thanks.)

# -*- coding: utf-8 -*-
import numpy as np
import faiss
res = faiss.StandardGpuResources()
feat_size = 1024 # segmentation fault (core dumped)
feat_size = 768 # OK

index_cpu = faiss.index_factory(feat_size, "OPQ32,IVF1,PQ8x8", faiss.METRIC_INNER_PRODUCT)
index_cpu.verbose = True



filtered_feat = np.random.randn(1000, feat_size).astype('float32')

index_cpu.train(filtered_feat)

print('Training completed.')

Platform

OS:
Ubuntu 20.04
Faiss version:
faiss 1.9.0
Installed from:
anaconda
Faiss compilation options:

Running on:

  • [yes] CPU
  • GPU

Interface:

  • C++
  • [yes] Python

Reproduction instructions

Just run the code, with different feat_size.

@cywuuuu
Author

cywuuuu commented Nov 20, 2024

My machine has 500 GB of RAM, and I did not observe any surge in RAM usage when running the code (or maybe everything happened too fast?). If anyone could help me understand the exact cause, I would greatly appreciate it.

@junjieqi
Contributor

Hi @cywuuuu, I just re-ran the code you provided with feat_size = 1024, and it runs fine. You probably want to isolate the issue in your code to see where the segfault happens.
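
One possible way to isolate the crash is to run each feat_size in its own interpreter with faulthandler enabled, so that a crash in one case does not take down the driver and the Python traceback at the point of the fault is printed. This is only a sketch under assumptions: REPRO_TEMPLATE, run_in_subprocess, and describe are hypothetical names, and the embedded script is a parameterised copy of the reproduction above with the res line left out.

```python
import subprocess
import sys

# Parameterised copy of the reproduction from this issue (hypothetical template name).
REPRO_TEMPLATE = """
import faulthandler
faulthandler.enable()  # dump the Python traceback on SIGSEGV / SIGABRT

import numpy as np
import faiss

feat_size = {feat_size}
index = faiss.index_factory(feat_size, "OPQ32,IVF1,PQ8x8", faiss.METRIC_INNER_PRODUCT)
index.verbose = True
index.train(np.random.randn(1000, feat_size).astype("float32"))
print("Training completed.")
"""


def run_in_subprocess(source):
    """Run Python source in a fresh interpreter; return (returncode, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    for feat_size in (768, 1024):
        code, err = run_in_subprocess(REPRO_TEMPLATE.format(feat_size=feat_size))
        print(f"feat_size={feat_size}: {describe(code)}")
        if code != 0:
            print(err)
```

faulthandler only shows the Python-level frame, which here would be the train call; locating the fault inside the native library would need a native debugger, which this sketch does not cover.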

@cywuuuu
Author

cywuuuu commented Nov 21, 2024

Well, it is quite strange, because I do hit the issue when rerunning the code. My details are below. (The crash message is shown in Chinese as 已放弃 (核心已转储), which means "Aborted (core dumped)"; sorry about that, I am using the Chinese version of Ubuntu.)
(Two screenshots were attached here.)

These are the packages I am using:
(A screenshot was attached here.)

I am curious why this happens. Hopefully you can help me out; I would greatly appreciate it. Thanks, @junjieqi

@junjieqi
Contributor

@cywuuuu thanks for providing the additional information. I'm wondering why you are using faiss-gpu: I see that you initialize res = faiss.StandardGpuResources(), but that variable is never used. Could you remove it first and re-run the code?

@cywuuuu
Author

cywuuuu commented Nov 24, 2024

Sorry, my bad. I have now deleted res = faiss.StandardGpuResources() and switched to faiss-cpu, but I still get the same result (768 works fine, but 1024 core dumps). I hope you can help, thank you so much! @junjieqi
(Two screenshots were attached here.)

_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             5.1                       1_gnu    defaults
absl-py                   2.1.0                    pypi_0    pypi
accelerate                0.21.0                   pypi_0    pypi
aiohttp                   3.9.3                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
async-timeout             4.0.3                    pypi_0    pypi
attrs                     23.2.0                   pypi_0    pypi
bitsandbytes              0.40.2                   pypi_0    pypi
blas                      1.0                         mkl    defaults
blobfile                  2.1.1                    pypi_0    pypi
ca-certificates           2024.9.24            h06a4308_0    defaults
certifi                   2024.2.2                 pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
datasets                  2.18.0                   pypi_0    pypi
dill                      0.3.8                    pypi_0    pypi
evaluate                  0.4.1                    pypi_0    pypi
fairscale                 0.4.13                   pypi_0    pypi
faiss-cpu                 1.9.0           py3.9_hadc1362_0_cpu    pytorch
filelock                  3.9.0                    pypi_0    pypi
fire                      0.6.0                    pypi_0    pypi
frozenlist                1.4.1                    pypi_0    pypi
fsspec                    2024.2.0                 pypi_0    pypi
grpcio                    1.62.1                   pypi_0    pypi
huggingface-hub           0.22.2                   pypi_0    pypi
idna                      3.6                      pypi_0    pypi
importlib-metadata        7.1.0                    pypi_0    pypi
intel-openmp              2023.1.0         hdb19cb5_46306    defaults
ipadic                    1.0.0                    pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
joblib                    1.4.0                    pypi_0    pypi
ld_impl_linux-64          2.40                 h12ee557_0    defaults
libfaiss                  1.9.0            hf65b397_0_cpu    pytorch
libffi                    3.4.4                h6a678d5_1    defaults
libgcc-ng                 11.2.0               h1234567_1    defaults
libgomp                   11.2.0               h1234567_1    defaults
libstdcxx-ng              11.2.0               h1234567_1    defaults
loguru                    0.7.2                    pypi_0    pypi
lxml                      4.9.4                    pypi_0    pypi
markdown                  3.6                      pypi_0    pypi
markupsafe                2.1.3                    pypi_0    pypi
mecab-python3             1.0.6                    pypi_0    pypi
mkl                       2023.1.0         h213fc3f_46344    defaults
mkl-service               2.4.0            py39h5eee18b_1    defaults
mkl_fft                   1.3.11           py39h5eee18b_0    defaults
mkl_random                1.2.8            py39h1128e8f_0    defaults
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.0.5                    pypi_0    pypi
multiprocess              0.70.16                  pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    defaults
networkx                  3.2.1                    pypi_0    pypi
nltk                      3.8.1                    pypi_0    pypi
numpy                     1.24.1                   pypi_0    pypi
numpy-base                1.26.4           py39hb5e798b_0    defaults
nvidia-cublas-cu12        12.1.3.1                 pypi_0    pypi
nvidia-cuda-cupti-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-nvrtc-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-runtime-cu12  12.1.105                 pypi_0    pypi
nvidia-cudnn-cu12         8.9.2.26                 pypi_0    pypi
nvidia-cufft-cu12         11.0.2.54                pypi_0    pypi
nvidia-curand-cu12        10.3.2.106               pypi_0    pypi
nvidia-cusolver-cu12      11.4.5.107               pypi_0    pypi
nvidia-cusparse-cu12      12.1.0.106               pypi_0    pypi
nvidia-nccl-cu12          2.19.3                   pypi_0    pypi
nvidia-nvjitlink-cu12     12.1.105                 pypi_0    pypi
nvidia-nvtx-cu12          12.1.105                 pypi_0    pypi
openssl                   3.0.15               h5eee18b_0    defaults
packaging                 24.0                     pypi_0    pypi
pandas                    2.2.1                    pypi_0    pypi
peft                      0.4.0                    pypi_0    pypi
pillow                    10.2.0                   pypi_0    pypi
pip                       24.0                     pypi_0    pypi
portalocker               2.8.2                    pypi_0    pypi
protobuf                  5.26.1                   pypi_0    pypi
psutil                    5.9.8                    pypi_0    pypi
pyarrow                   15.0.2                   pypi_0    pypi
pyarrow-hotfix            0.6                      pypi_0    pypi
pycryptodomex             3.20.0                   pypi_0    pypi
pysocks                   1.7.1                    pypi_0    pypi
python                    3.9.19               h955ad1f_0    defaults
python-dateutil           2.9.0.post0              pypi_0    pypi
pytz                      2024.1                   pypi_0    pypi
pyyaml                    6.0.1                    pypi_0    pypi
readline                  8.2                  h5eee18b_0    defaults
regex                     2023.12.25               pypi_0    pypi
requests                  2.31.0                   pypi_0    pypi
responses                 0.18.0                   pypi_0    pypi
rouge-score               0.1.2                    pypi_0    pypi
sacrebleu                 2.4.2                    pypi_0    pypi
safetensors               0.4.2                    pypi_0    pypi
scikit-learn              1.4.2                    pypi_0    pypi
scipy                     1.13.0                   pypi_0    pypi
sentence-transformers     2.7.0                    pypi_0    pypi
sentencepiece             0.2.0                    pypi_0    pypi
setuptools                75.1.0           py39h06a4308_0    defaults
six                       1.16.0                   pypi_0    pypi
sqlite                    3.45.3               h5eee18b_0    defaults
sympy                     1.12                     pypi_0    pypi
tabulate                  0.9.0                    pypi_0    pypi
tbb                       2021.8.0             hdb19cb5_0    defaults
tensorboard               2.16.2                   pypi_0    pypi
tensorboard-data-server   0.7.2                    pypi_0    pypi
termcolor                 2.4.0                    pypi_0    pypi
threadpoolctl             3.4.0                    pypi_0    pypi
tiktoken                  0.4.0                    pypi_0    pypi
tk                        8.6.14               h39e8969_0    defaults
tokenizers                0.19.1                   pypi_0    pypi
torch                     2.2.1+cu121              pypi_0    pypi
torchaudio                2.2.1+cu121              pypi_0    pypi
torchvision               0.17.1+cu121             pypi_0    pypi
tqdm                      4.66.2                   pypi_0    pypi
transformers              4.40.0                   pypi_0    pypi
triton                    2.2.0                    pypi_0    pypi
trl                       0.4.7                    pypi_0    pypi
typing-extensions         4.8.0                    pypi_0    pypi
tzdata                    2024.1                   pypi_0    pypi
urllib3                   2.2.1                    pypi_0    pypi
werkzeug                  3.0.2                    pypi_0    pypi
wheel                     0.44.0           py39h06a4308_0    defaults
xxhash                    3.4.1                    pypi_0    pypi
xz                        5.4.6                h5eee18b_1    defaults
yarl                      1.9.4                    pypi_0    pypi
zipp                      3.18.1                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_1    defaults
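
The list above shows numpy 1.24.1 from pypi alongside numpy-base 1.26.4 from defaults. Whether that mix is related to the crash is not established in this thread. As a hedged sketch, the versions that the active interpreter actually sees could be checked with the standard library; installed_versions is a hypothetical helper, and the distribution names passed to it are assumptions, since conda packages do not always register under the name shown in the list.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not visible to this interpreter under that distribution name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to the environment.
    for name, version in installed_versions(["numpy", "faiss-cpu", "faiss-gpu", "mkl"]).items():
        print(f"{name:12s} {version}")
```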


github-actions bot commented Dec 2, 2024

This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Dec 2, 2024

github-actions bot commented Dec 9, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this as completed Dec 9, 2024