Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull Request for the Algorithm Comparison Tool #4

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

mihaivavram
Copy link

Knowledge Linker Algorithm Comparison tool, taking the output files from linkpred, and running comparison metrics and confusion statistics from any two given algorithms, can print to screen or to files and can be run from the command line

…rom linkpred, and running comparison metrics and confusion statistics from any two given algorithms, can print to screen or to files and can be run from the command line
@@ -0,0 +1,317 @@
#Author: Mihai Avram - [email protected]

import sys, getopt
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use argparse instead of sys.argv, remove unused getopt import


def computeAndPrint(writeToFile):
#Importing algorithms to be compared
alg1 = pd.read_csv("../input/presidentcouplesNODES.csv")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Input file should not be hardcoded into source code. Should be passed as command line arguments instead.

def computeAndPrint(writeToFile):
#Importing algorithms to be compared
alg1 = pd.read_csv("../input/presidentcouplesNODES.csv")
alg2 = pd.read_csv("../input/presidentcouplesRSIM.csv")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above



#Algorithm comparison
if noNulls:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of enclosing all computations in a if block, exit if there are nulls (e.g. sys.exit(1) to return error status code), and then continue the rest of the computation without the if block:

if there_is_an_error:
    print("The was an error", file=sys.stderr)
    sys.exit(1) 

<NORMAL OPERATIONS>


#WRITING THE RESULTS TO FILES
else:
fileComp = open("../output/algorithmscomparison.txt",'w')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let user specify output path, do not hardcode file paths.

else:
fileComp = open("../output/algorithmscomparison.txt",'w')

#Printing Results
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below is duplicated. Use a function instead of copy-pasting the same code block. Code reuse decreases the chances of bugs and make the code more readable.

import string


def main(argv):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use argparse instead of writing your own error reporting.

@glciampaglia glciampaglia self-assigned this Jan 28, 2017
@glciampaglia
Copy link
Owner

glciampaglia commented Jan 28, 2017

Can you please:

  • move the contents of the README in a comment at the top of the script and delete the file.
  • move the script file under scripts, remove algcomptool/tool; too many folders are unnecessary
  • delete the input and output folders and all their contents. In general DO NOT commit data to the repository unless explicitly needed. Other users in general do not need to download your data, so it's a burden on their connection and storage.

Mihai Avram and others added 4 commits March 15, 2017 16:34
…rse instead of sys.argv, passed input/output files through command line, implemented sys.exit(1) error calls, created functions to reuse code, fixed file writing bug, and changed the README file
Updated README.md to include markdown text.
…ple of what input files are expected and they are not so large so I believe they are okay for reusability purposes to keep them there
def main():
parser = argparse.ArgumentParser()
#Required parameters -a1 and -a2 which denote the paths to the two input algorithm files to be compared (print output to the console)
parser.add_argument("-a1", "--alg1file", type=str, help="the relative or absolute location and name of the input file of the first algorithm", required=True)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just pass add_argument("alg1file", type=str...)and that will become a positional argument. Positional arguments are required. Also, I recommend using more descriptive names for the user, e.g. input-file-1; if you still want the destination to be called alg1file you may pass dest="alg1file" so that you don't need to change the rest of your code.

fileComp.write("Alg1 < Alg2 for relation: " + " ID: " + str(idSubjectMapping[subject]) + " - Subject: " + str(subject.replace("dbr:","")) + "\n")
elif alg1Ratio < alg2Ratio:
fileComp.write("Alg1 > Alg2 for relation: " + " ID: " + str(idSubjectMapping[subject]) + " - Subject: " + str(subject.replace("dbr:","")) + "\n")
fileComp.close
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing function call?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing all "missing function calls" you mentioned except this one, because this requires a different output format.

else:
fileConfAlg1.write(str(alg1.iloc[sample, idColLocation]) + "," + str(alg1.iloc[sample, subjectColLocation].replace("dbr:","")) + "," + str(alg1.iloc[sample, objectIDColLocation]) + "," + str(alg1.iloc[sample, objectColLocation].replace("dbr:","")) + ",TN" + "\n")

fileConfAlg1.close
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing function call?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the fileConfAlg1.write and fileConfAlg2.write procedures to call a function which can write both. In many cases it is possible to include functions; however, we are only repeating an event once or twice, and in my opinion there is no need to further abstract code if we call the block only a few times (i.e. 1-2) but if we call code 3+ times then yes it's a good idea to write a function. Let me know if you agree with this.

else:
fileConfAlg2.write(str(alg2.iloc[sample, idColLocation]) + "," + str(alg2.iloc[sample, subjectColLocation].replace("dbr:","")) + "," + str(alg2.iloc[sample, objectIDColLocation]) + "," + str(alg2.iloc[sample, objectColLocation].replace("dbr:","")) + ",TN" + "\n")

fileConfAlg2.close
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing function call?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all your "Missing function call?" comments regarding code-structure abstraction i.e. turn each stand-by-itself block under a module function and call that function from main? Or did you mean something else?

… function abstraction for confusion output files
AlgorithmComparisonTool: Updating README.md and fixing some formatting problems within this file.
AlgorithmComparisonTool: More README.md formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants