How is the IDF file generated? #2

babyandy0111 · 2018-06-11T14:37:15Z

Hello, I don't have much experience with this area. How is the IDF file generated?

gaussic · 2018-06-12T00:35:08Z

The IDF file is generated by the gen_idf.py script.
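
For readers new to IDF, the general idea can be sketched as below. This is not the contents of gen_idf.py. It is a minimal illustration that assumes the documents are already segmented into tokens, uses the smoothed formula log(N / (1 + df)), and writes one "term value" pair per line. The names compute_idf and format_idf_lines are hypothetical.

```python
import math
from collections import Counter


def compute_idf(docs):
    """Return {term: idf} for a list of already-segmented documents.

    Assumption: idf = log(N / (1 + df)); a real script may smooth differently.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / (1 + df)) for term, df in doc_freq.items()}


def format_idf_lines(idf):
    """Render one 'term value' pair per line, sorted by term."""
    return "".join(f"{term} {value:.6f}\n" for term, value in sorted(idf.items()))


# Four tiny documents, each a list of tokens.
docs = [["a", "b"], ["a"], ["a", "c"], ["d"]]
idf = compute_idf(docs)
idf_text = format_idf_lines(idf)
```

Terms that appear in many documents get a low value, and rare terms get a high one.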

babyandy0111 · 2018-06-12T10:33:47Z

Hi @gaussic
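I used the gen_idf.py script to generate the IDF file, but the format of the output differs from the IDF file originally provided.
It shows up as encoded bytes like the following: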
0120 312e 300a 0020 312e 300a 0320 312e
300a 0220 312e 300a 0420 312e 300a 0820

I added the following to segment.py:
jieba.set_dictionary('./data/dict.txt.big')  # downloaded from jieba
jieba.load_userdict('./data/keyword.txt')  # my own ad hoc keyword list
jieba.analyse.set_stop_words('./data/stop_words.txt')  # downloaded from jieba

Is this normal?
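
Read as raw bytes, the dump above is plain ASCII. A minimal sketch that decodes it, assuming the groups are shown in file order (as in xxd or an editor's hex view) rather than as byte-swapped 16-bit words:

```python
# Bytes copied from the hex dump above; each group is assumed to be two bytes in file order.
hex_dump = """
0120 312e 300a 0020 312e 300a 0320 312e
300a 0220 312e 300a 0420 312e 300a 0820
"""

raw = bytes.fromhex("".join(hex_dump.split()))
text = raw.decode("ascii")  # every byte in this excerpt is below 0x80

# Split each line into the part before the first space and the part after it.
pairs = []
for line in text.split("\n"):
    if " " in line:
        term, _, value = line.partition(" ")
        pairs.append((term, value))

for term, value in pairs:
    print(repr(term), repr(value))
```

Under that assumption each line is a single control character (such as \x01 or \x00), a space, and the value 1.0. The last pair is cut off because the excerpt ends mid-line. So the file appears to follow a "term value" layout, but the terms in this excerpt are control characters rather than words, which may be why it is displayed as hex.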

gaussic · 2018-06-13T05:08:33Z

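Hello, regarding your problem, please provide details of your runtime environment:

Operating system
Python version
File encoding
Any other descriptive information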
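
The details requested above can be collected with the standard library. A minimal sketch, where describe_environment and looks_like_utf8 are hypothetical helper names and the UTF-8 check is only a rough test of a file's encoding:

```python
import locale
import platform
import sys


def describe_environment():
    """Collect OS, Python version, and the encodings Python will use by default."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "default_encoding": sys.getdefaultencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


def looks_like_utf8(data):
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

To check a file, read it in binary mode with open(path, "rb") and pass the bytes to looks_like_utf8. The preferred encoding matters because open() in text mode may fall back to it when no encoding argument is given.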
