h #12

Yannik101010 · 2021-10-14T14:39:57Z

h

dhesenkamp and others added 30 commits October 4, 2021 22:05

Update .gitignore to exclude OSX specific files

aa7eb11

Merge remote-tracking branch 'upstream/main' into main

f6e4072

Merge branch 'lbechberger:main' into main

00d49da

Added uniform classifier

beee080

Added F1 score evaluation metric

d806664

Merge pull request #1 from dhesenkamp/classifier

8c1addb

Merge classifier branch

Added tweet tokenization

5d35082

Implemented a tokenizer in tokenizer.py by hand as seen in the lecture. Modified all corresponding files involved in the preprocessing.

Update preprocessing.sh

4b43523

Updated order (tokenize BEFORE export)

Create Documentation.md

5f186cc

Created documentation file with basic structure. More to be added

Merge pull request #2 from dhesenkamp/tokenizer

537acf7

Tokenizer

Modified punctuation_remover.py

ec5eeeb

Added option to supply input column

Merge pull request #4 from dhesenkamp/tokenizer

841745f

Modified punctuation_remover.py

Revert "Merge pull request #1 from dhesenkamp/classifier"

473af68

This reverts commit 8c1addb, reversing changes made to 00d49da.

Merge branch 'main' of https://github.com/dhesenkamp/MLinPractice

f7e9e15

Resolve merge conflict

7dfd46a

Resolving merge conflict manually by deleting affected files, then re-creating them by pulling from upstream

Resolve merge conflict

2f6b559

Merge branch 'lbechberger:main' into main

b92fea9

Update Documentation.md

a377bcc

Merge branch 'main' of https://github.com/dhesenkamp/MLinPractice

189843b

Testing of tokenize_input

69fbf07

Added stopword remover

5d4f975

Implemented stopword remover based on nltk.corpus.stopwords. Needs further refining, currently raises KeyError

Refined stopword remover

d577dc3

Changed method of iterating over the input column. Still doesn't work, no words removed

Update stopword_remover.py

50c422f

Still not working, but raises no error. Problem with properly accessing the column?

Further refining of stopword remover

126812d

Still doesn't work, problem with literal_eval()? Code works as supposed in jupyter though
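
One possible cause of the behaviour described in the last few commits, offered as a guess rather than a diagnosis: when a tokenized column is written to CSV, each token list is stored as its string representation, so it has to be parsed back into a list before any filtering. The sketch below uses only the standard library. The `STOPWORDS` set and the `remove_stopwords` helper are hypothetical stand-ins for `nltk.corpus.stopwords` and the project's `StopwordRemover`, not the actual implementation.

```python
import ast

# Hypothetical stand-in for the nltk stopword list (assumption, not the project's data)
STOPWORDS = {"the", "is", "a", "of"}

def remove_stopwords(cell):
    """Parse a stringified token list if needed, then drop stopwords (case-insensitive)."""
    tokens = ast.literal_eval(cell) if isinstance(cell, str) else cell
    return [t for t in tokens if t.lower() not in STOPWORDS]

# A token list as it would come back from a CSV file: a string, not a list
cell = "['The', 'model', 'is', 'a', 'classifier']"
print(remove_stopwords(cell))  # ['model', 'classifier']
```

Iterating over the unparsed string would walk through single characters, which raises no error but removes no words.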

StopwordRemover(), minor changes

e8d5b86

Comment, imports

Short info on Cohen's kappa

45e049b

Merge pull request #6 from dhesenkamp/stop_word_removal

f8f9ef3

Stop word removal

Added Lemmatizer() class

f83dc31

Implemented a lemmatizer based on nltk.stem.WordNetLemmatizer()

Added command line arguments etc

ca27622

Edited corresponding files, added Lemmatizer() to script, added command line args and suffix

Merge branch 'lbechberger:main' into main

2e25f66

dhesenkamp and others added 30 commits October 30, 2021 13:20

Fine tuning for NER() feature extractor

5df0545

Cleaner code, added downloader for spaCy's language model to script. Tested pipeline.

Revert "Fine tuning for NER() feature extractor"

a29e5d1

This reverts commit 5df0545.

Fine tuning NER() - manually resolving merge conflict

d48bb55

manually resolve merge conflict

3f94e33

Merge pull request #17 from dhesenkamp/ner

6a76e57

NER feature extraction

Documentation + minor cleanup

7456d26

Added MLP classifier

857007d

Added both classifier & command line args. Testing successful. Minor changes to documentation

Merge pull request #18 from dhesenkamp/classifier_mlp

f8970ae

Added MLP classifier

Added Gaussian NB classifier

01c18f3

Changed from Gaussian to Complement NB

b4517de

+ added parameter logging for the classifier

Updated SentimentAnalyzer to only return pos values

69b68b8

The Bayes classifier can't work with negative values, hence the moved scale for the sentiment analysis. Added note to documentation.
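
A minimal sketch of the scale shift described above, assuming the classifier is scikit-learn's `ComplementNB` (which expects non-negative feature values) and that the sentiment scores originally lie in [-1, 1]. Both are assumptions; the helper name `shift_to_non_negative` and the sample scores are hypothetical.

```python
def shift_to_non_negative(scores, lower_bound=-1.0):
    """Shift scores so that the assumed lower bound maps to 0."""
    return [s - lower_bound for s in scores]

# Hypothetical sentiment scores on an assumed [-1, 1] scale
scores = [-1.0, -0.25, 0.0, 0.5]
shifted = shift_to_non_negative(scores)
print(shifted)  # [0.0, 0.75, 1.0, 1.5]
```

Shifting by a constant keeps the ordering and spacing of the scores, so the feature stays comparable across tweets while becoming valid input for the classifier.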

Merge pull request #19 from dhesenkamp/classifier_bayes

5c0f26d

Complement NB classifier

Update: Clean Code

6532381

Added more comments and cleaned up code.

Updated documentation

45ed17b

Update README.md

87025a5

Added all of our new parameters to the README.md

Merge branch 'Readme' into main1

0b69335

Added evaluation section to documentation

b150f53

Updated classifier to work for param optimization

68e1101

Hyperparameter optimization script

ecd08a0

Runs all configurations
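
Running all configurations can be sketched as enumerating every combination of a parameter grid. The parameter names and values below are hypothetical, since the commit does not list the actual search space, and `all_configurations` is an illustrative helper rather than the script's own function.

```python
from itertools import product

# Hypothetical search space (assumption: the real names and values are not given)
grid = {"alpha": [0.1, 1.0], "hidden_layer_size": [10, 50, 100]}

def all_configurations(grid):
    """Return one dict per combination of parameter values."""
    keys = sorted(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

configs = all_configurations(grid)
print(len(configs))  # 6
```

Each resulting dict could then be passed to a training run and logged alongside its evaluation metrics.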

Update Documentation.md

c3c6665

Include all param configurations and document search for optimal parameters

Summary plots for evaluation metrics

605bb61

Added plots for visualization of results to documentation

f572b02

Added more plots with summary stats

ef63032

Merge pull request #20 from dhesenkamp/param_optimization

a325be9

Parameter optimization and documentation

Documentation + visuals

eb22e87

Merge pull request #21 from dhesenkamp/param_optimization

2a5d349

Documentation + visuals

Added .py file for plots

f09d753

Update Documentation.md

1ac4d25

Added tracking results from param optimization

8c94aba

Added missing resources & citations

8cebc47
