Skip to content
This repository has been archived by the owner on Dec 21, 2023. It is now read-only.

[WIP] kwargs flow through for Image Similarity -> Nearest Neighbors #3233

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

nishantnath
Copy link

This will allow users to use methods for nearest neighbor model apart from 'auto' along with other params like distance & nearest neighbor's kwargs

Nearest neighbor already checks for validity of kwargs downstream.

Allowing kwargs from image_similarity.create() to pass to downstream method invocations like nearest_neighbors.create()
kwargs flow through for image_similarity.create()
@TobyRoseman TobyRoseman self-requested a review June 22, 2020 23:52
@TobyRoseman
Copy link
Collaborator

@nishantnath - thanks for the pull request. I like this idea and the change looks good. Would it be possible to add a unit test for this new functionality?

@nishantnath
Copy link
Author

@TobyRoseman I did ponder over a unit test but considering kwargs would be passing through to downstream functions and each downstream function would (or should) have its own checks for params & unit tests - seemed redundant.

Downstream methods:
feature_extractor.extract_features (doesn't support kwargs)
nearest_neighbors.create (supports kwargs): has own unit test + kwargs param check

Happy to add an unit test if you can provide some pointers on the direction you are envisioning for the unit test.

@@ -63,6 +69,9 @@ def create(
batch_size : int, optional
If you are getting memory errors, try decreasing this value. If you
have a powerful computer, increasing this value may improve performance.

**kwargs : optional
Options for downstream methods like tc.nearest_neighbors.create().
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change tc.nearest_neighbors.create() to :py:func:turicreate.nearest_neighbors.create()`. That will cause the text, in our API docs, to be linked to the appropriate documentation page for that function.

@TobyRoseman
Copy link
Collaborator

TobyRoseman commented Jun 23, 2020

@TobyRoseman I did ponder over a unit test but considering kwargs would be passing through to downstream functions and each downstream function would (or should) have its own checks for params & unit tests - seemed redundant.

Downstream methods:
feature_extractor.extract_features (doesn't support kwargs)
nearest_neighbors.create (supports kwargs): has own unit test + kwargs param check

Happy to add an unit test if you can provide some pointers on the direction you are envisioning for the unit test.

@nishantnath - I agree a unit test here is largely redundant. However I still think it's worth doing since it will ensure we never accidentally remove this functionality.

If I was going to add a unit test for this, I would probably add a new add a new testing class to our image similarity unit test. The setUpClass method could create a model with a small dataset and pass it a kwarg value.
Something like this:

self.model = tc.image_similarity.create(data.head(), distance='squared_euclidean')

Then we could have a single test method that just checks the similarity model has that distance. I think it would be something like:

self.assertEqual(
    self.model.similarity_model.distance[0][1],
    'squared_euclidean'
)

@nishantnath nishantnath changed the title kwargs flow through for Image Similarity -> Nearest Neighbors [WIP] kwargs flow through for Image Similarity -> Nearest Neighbors Jun 29, 2020
Copy link
Collaborator

@TobyRoseman TobyRoseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nishantnath - thanks for adding a unit test.

"""
The setup class method for the basic test case with all default values.
"""
self.feature = "awesome_image"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need the majority of the code in this setUpClass. It looks like the only member variable that's actually getting used is self.model. I think everything else should be removed.

tc_distances = tc_ret.sort("reference_label")["distance"].to_numpy()
psnr_value = get_psnr(coreml_distances, tc_distances)
self.assertTrue(psnr_value > 50)
if self.feature == "awesome_image":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you trying to accomplish by this change?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants