Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ia fix 377 core #383

Closed
wants to merge 11 commits into from
Closed

Conversation

idhamari
Copy link
Contributor

@idhamari idhamari commented May 3, 2022

  • Added method to evaluate with a testing subset
  • Added a new example to show the use case
  • This modification will be used in the next commit for modifying the GUI so user can select to evaluate with K-fold or not

@@ -74,7 +74,8 @@ public static ClassificationConfigurationRandomForest createRandomForest() {
private int vectorLength = DEFAULT_VECTOR_LENGTH;
/** Modified */
private boolean modified = false;

/** EvaluateWithKfold */
private boolean EvaluateWithKfold = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java properties start with lower case letters.

* This class implements an example on how to compare data mining performance
* It shows how to use subset for training and different subset for testing
* @author Fabian Prasser
* @author Florian Kohlmayer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add your name here.

* @author Fabian Prasser
* @author Florian Kohlmayer
*/
public class Example39_subset extends Example {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention for examples is Example[NUMBER]

data.getDefinition().setResponseVariable("marital-status", true);

// Create a training subset data with a specific percentage of the original data e.g 80%

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File needs to be cleaned up. Remove blank lines, unnecessary comments etc.

@@ -74,7 +74,8 @@ public static ClassificationConfigurationRandomForest createRandomForest() {
private int vectorLength = DEFAULT_VECTOR_LENGTH;
/** Modified */
private boolean modified = false;

/** EvaluateWithKfold */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose nice description, e.g. "Use k-fold cross validation"

// For each fold as a validation set
for (int evaluationFold = 0; evaluationFold < folds.size(); evaluationFold++) {

if (config.getEvaluateWithKfold()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two versions should be put into two different methods, e.g.

evaluateWithCrossValidation(...)
evaluateWithTrainingSet(...)

Currently it is (a) hard to review, (b) there are some redundancies. For example, you are calculating the folds in line 351, even if you dont need them because you are using a training and test set.

@@ -450,7 +547,7 @@ private static ClassificationMethod getClassifier(WrappedBoolean interrupt,
} else {
throw new UnexpectedErrorException(e);
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove artifacts

@prasser
Copy link
Collaborator

prasser commented May 4, 2022

Also please refer to "#377" in your description so that this gets linked with the issue.

@idhamari
Copy link
Contributor Author

@prasser

  • Could you please review the recent commit!

  • In this line, subsetTrain is null when the inputHandle comes via a GUI call:

                           DataSubset subsetTrain = inputHandle.getSubset(); 
    

@prasser
Copy link
Collaborator

prasser commented Jun 16, 2022

Merged other PR into feature branch. Closing.

@prasser prasser closed this Jun 16, 2022
@idhamari idhamari deleted the ia_fix_377_core branch June 20, 2022 07:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants