-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathrequirement.txt
37 lines (31 loc) · 1.22 KB
/
requirement.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
Requirements:
Please add the following to your LaD environment:
------------------------------------------------------------------------------------------------------------------------
Packages used:
- Beautiful Soup 4.9.3
- Pandas 1.3.4
- Stanza 1.3.0
- Statistics 3.10.0
- gensim 3.8.3
- wordcloud 1.8.1
- wikipedia2vec 1.0.5
- html5lib 1.1
- Matplotlib 3.32
- Numpy 1.19.2
- Requests 2.24.0
- Scikit-learn 0.23.2
- tabulate 0.8.9
- nltk.corpus
- wiki.zh.vec
- wiki-news-300d-1M.vec
- spacy
- dataframe_image
- IPython.display
please download the 'wiki.zh.vec'(chinese) and 'wiki-news-300d-1M.vec' (english)
from "https://fasttext.cc/docs/en/pretrained-vectors.html"
------------------------------------------------------------------------------------------------------------------------
1) In get_all_documents.py, we improt html5lib, re, requests, and beautiful soup to parse the texts and to extract 100 documents
for 2 languages.
2) In evaluate_annotation.py, we improt pandas, glob, itertools, random, and sklearn.metrics to evaluate the inter-agreement.
3) In run_all_analyses.py, we improt pandas, stanza, counter from collection, stopwords from nltk.corpus, genism, and wikivec
to run all analyses and outputs results and plots.