Does word embedding support Chinese? #3

Open
JenningsL opened this issue Oct 26, 2016 · 4 comments
Comments

@JenningsL

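I built a Chinese word-vector file in which each line holds a token produced by word segmentation (a word, a single character, or a phrase) followed by its vector. Using it caused a segmentation fault (core dumped). Is this usage supported, or is there something wrong with the word-vector file I built?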

@JenningsL
Author

Also, I suggest providing a sample word_embeddings file.

@chilynn
Owner

chilynn commented Nov 1, 2016

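Yes, it is supported. Suppose there are 2 characters in total, each represented by a 3-dimensional vector; the format is as follows: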
2 3
你 1 0 1
好 0 0 1
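
To make the layout above concrete, here is a minimal sketch of a checker. It assumes the format is exactly what the example shows: a header line with the token count and the dimension, then one line per token followed by its values, all separated by whitespace. The name `check_embedding_text` is hypothetical and not part of this project.

```python
def check_embedding_text(text):
    """Parse embedding text and verify that it matches its own header.

    Assumed layout (taken from the example above): the first line is
    "<count> <dim>", and every following line is a token followed by
    <dim> numbers. Tokens are assumed to contain no whitespace.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input")

    header = lines[0].split()
    if len(header) != 2:
        raise ValueError("header must be '<count> <dim>'")
    count, dim = int(header[0]), int(header[1])

    vectors = {}
    for lineno, line in enumerate(lines[1:], start=2):
        parts = line.split()
        token, values = parts[0], parts[1:]
        # A token containing a space would show up here as an extra value.
        if len(values) != dim:
            raise ValueError(
                f"line {lineno}: expected {dim} values, got {len(values)}"
            )
        vectors[token] = [float(v) for v in values]

    if len(vectors) != count:
        raise ValueError(f"header says {count} tokens, found {len(vectors)}")
    return vectors


sample = "2 3\n你 1 0 1\n好 0 0 1\n"
vectors = check_embedding_text(sample)
```

Under this assumed layout, the header `2 3` means two tokens of three values each. A row with too few or too many values, or a token count that disagrees with the header, raises `ValueError` instead of being passed on.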

@guotong1988

I didn't understand the example above. Could you explain it a bit more? Thanks!

@chilynn
Owner

chilynn commented Nov 17, 2016

The embedding format is simply the output format of gensim's word2vec model; the function called is model.save_word2vec_format(output_path, binary=False)
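
If the vectors come from somewhere other than gensim, a file in the same text layout can be written by hand. The following is a minimal sketch, assuming the layout is the header-plus-rows format from the owner's earlier example (`2 3`, then one row per token). The name `write_embedding_text`, the file name, and the UTF-8 encoding are assumptions made for illustration, not something this project specifies.

```python
import os
import tempfile


def write_embedding_text(vectors, path):
    """Write {token: [numbers]} as text: a "<count> <dim>" header, then one row per token."""
    dims = {len(v) for v in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all vectors must have the same dimension")
    dim = dims.pop()

    # Validate tokens first so a bad token never leaves a half-written file.
    for token in vectors:
        if not token or any(ch.isspace() for ch in token):
            raise ValueError(f"token {token!r} must be non-empty with no whitespace")

    # UTF-8 is an assumption here; it keeps Chinese tokens intact.
    with open(path, "w", encoding="utf-8") as f:
        f.write(f"{len(vectors)} {dim}\n")
        for token, values in vectors.items():
            f.write(token + " " + " ".join(str(x) for x in values) + "\n")


path = os.path.join(tempfile.mkdtemp(), "word_embeddings.txt")
write_embedding_text({"你": [1, 0, 1], "好": [0, 0, 1]}, path)

with open(path, encoding="utf-8") as f:
    written = f.read()
```

The sketch rejects tokens that contain whitespace, because in a space-separated row such a token could not be told apart from its values.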
