Add model tests #49

Merged: 13 commits merged into master from andrew_branch on Oct 20, 2020
Conversation

@hypercubestart (Collaborator) commented Oct 9, 2020

memoization PR moved to: #59

@gussmith23 (Owner)

@hypercubestart I should have warned you! I honestly didn't think about it until now. A little speedup on these tests is actually to be expected, though the speedup we're seeing is perhaps more than I would have guessed. However, the speedups on interpreting larger workloads (i.e. a whole model) should be significant.

@gussmith23 (Owner)

The reason is that memoization only helps when expressions are interpreted multiple times. In simple tests like these, that's not happening, so memoization is just added overhead. However, in large expressions with repeated subexpressions, we should see big savings (from exponential time to linear).
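
For illustration, here is a minimal sketch of that effect in Rust. The `Expr` type and `interpret` function are hypothetical stand-ins, not Glenside's interpreter; the point is just that memoizing on the identity of shared subexpressions evaluates each distinct node once.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical expression type; repeated subexpressions are shared via Rc.
enum Expr {
    Leaf(i64),
    Add(Rc<Expr>, Rc<Expr>),
}

// Memoize on the address of the shared node: each distinct subexpression
// is evaluated once instead of once per occurrence.
fn interpret(expr: &Rc<Expr>, memo: &mut HashMap<*const Expr, i64>) -> i64 {
    let key = Rc::as_ptr(expr);
    if let Some(&v) = memo.get(&key) {
        return v;
    }
    let v = match expr.as_ref() {
        Expr::Leaf(n) => *n,
        Expr::Add(a, b) => interpret(a, memo) + interpret(b, memo),
    };
    memo.insert(key, v);
    v
}

fn main() {
    // A chain of 50 doublings: x + x, (x + x) + (x + x), ...
    // Without the memo table this is 2^50 leaf visits; with it, 50.
    let mut e = Rc::new(Expr::Leaf(1));
    for _ in 0..50 {
        e = Rc::new(Expr::Add(e.clone(), e.clone()));
    }
    let mut memo = HashMap::new();
    println!("{}", interpret(&e, &mut memo)); // prints 2^50
}
```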

@gussmith23 (Owner)

What we need to do is enable some larger interpreter tests, too. I can write some for you, or you can take a crack at it. Here are my suggestions:

  • Biggest things we could test right now: Resnet and Mobilenet
  • We could also write tests to just interpret the first few layers of Resnet/Mobilenet

@gussmith23 changed the title from "[WIP] Memoize interpreter" to "Memoize interpreter" on Oct 13, 2020
@gussmith23 (Owner) left a comment

This looks good so far; just a few comments. We will merge it with the PR that improves our usage of ndarray, like we discussed!

src/language/interpreter.rs: resolved review thread (outdated)
@@ -1808,4 +1813,126 @@ def @main(%x: Tensor[(3), float32]) -> Tensor[(3), float32] {
(compute softmax (access (compute softmax (access (access-tensor x) 0)) 0))
"#
);

#[bench]
fn mobilenet(b: &mut Bencher) {
@gussmith23 (Owner):

Suggested change:
```diff
-fn mobilenet(b: &mut Bencher) {
+fn mobilenet_shallow(b: &mut Bencher) {
```

src/language/from_relay/mod.rs: two resolved review threads (outdated)
Comment on lines 1873 to 1879
Uniform::new(0f32, 255f32),
&mut tensor_rng,
)
} else {
ndarray::ArrayD::<f32>::random_using(
shape.clone(),
Uniform::new(-1f32, 1f32),
@gussmith23 (Owner):

Just to double-check: is it expected that, in a normal run of Mobilenet (e.g. in Relay), the image data would be in [0, 255] and the weights in [-1, 1]? That's surprising to me, though I'm not sure why.

Perhaps this reflects more on my own understanding of neural networks... unless we're using pretrained weights, I'm actually unsure why it matters what range the input activations have. Is there some operator that specifically expects a [0, 255] range? What effect does changing the range from [-1, 1] to [0, 255] have? Or are we just trying to generate values that mimic real inputs?

FWIW, I'm not actually suggesting any change here, haha. Just wondering out loud mostly.

src/language/interpreter.rs: three resolved review threads (outdated)
@hypercubestart changed the title from "Memoize interpreter" to "Add model tests" on Oct 16, 2020
@gussmith23 (Owner) left a comment

Why did you rename mobilenet-simplified-for-inference.relay? I think I prefer the original name, as it's more descriptive. Specifically, that's not the mobilenet that comes out of Relay; it's the mobilenet that comes out after you run SimplifyInference. Resnet is the same way, as are the shallow versions of the networks you created.

I know it seems pedantic, but I am actually now working with the "real" mobilenet (i.e. mobilenet without the nn.batch_norms simplified out), so I'm probably going to be adding a mobilenet.relay fairly soon! If you think I should name that file differently, though, I'm definitely willing to hear your argument.

As for the other stuff, there are a few code comments to address. One big thing that I didn't put in the comments: I think we should probably move these model tests to their own file in the tests/ directory. They seem more and more like integration tests (i.e. they test multiple parts of Glenside rather heavily). Perhaps tests/ingest-and-interpret-relay-models.rs?

src/language/from_relay/mod.rs: five resolved review threads (outdated)
Comment on lines 1935 to 1937
// TODO: this was a hack because NAN != NAN
// relay_output.mapv_inplace(|x| if x.is_nan() { 0.0 } else { x });
// interpreter_output.mapv_inplace(|x| if x.is_nan() { 0.0 } else { x });
@gussmith23 (Owner):
Should we consider adding a check to assert that there are no NaN values? It seems like there shouldn't be any.

@hypercubestart (Collaborator, Author) commented Oct 19, 2020:

NaN values don't appear in the shallow model tests, but they previously appeared in the full model test (possibly because of weird input data).

@hypercubestart (Collaborator, Author):

I think we can come back to this when I do memo + ndarray, since it's currently hard to tell whether there are NaN values or not :\

@gussmith23 (Owner):

Would something like

interpreter_output.iter().all(|v| !v.is_nan())

work?
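
A minimal sketch of how that check might look in the test body, assuming `interpreter_output` is the `ndarray::ArrayD<f32>` computed in the test above:

```rust
// Fail loudly if the interpreter ever produces a NaN, rather than
// silently mapping NaNs to 0.0 before comparing against Relay's output.
assert!(
    interpreter_output.iter().all(|v| !v.is_nan()),
    "interpreter output contains NaN values"
);
```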

Comment on lines 1972 to 1984
// TODO: TESTS ARE IGNORED BECAUSE INTERPRETER TOO SLOW
// ====================================================
// Creates a Relay-to-Glenside test over a model
// (currently the models we test are convolutional and only image-based)
// The test does the following:
// 1. Reads the $relay_str_path file and parses as relay module
// 2. Converts the relay module to glenside
// 3. Generates random input (uniform [0, 255]) and parameters (uniform [-1, 1])
// 4. Runs relay module through TVM and benchmarks running glenside expr through interpreter
// 5. Compare output of using TVM vs interpreter
// $test_name: the name of the created test
// $relay_str_path: the path of the file containing the Relay code
macro_rules! test_model {
@gussmith23 (Owner):
Is this the same as benchmark_model, except that it doesn't benchmark? Do you think we need both?

@hypercubestart (Collaborator, Author) commented Oct 19, 2020:

I think we need both if we want benchmark tests for the shallow models (which run very fast) and still want full-model testing (which currently takes a few hours).

@gussmith23 (Owner):

I'm not sure I understand why you can't use benchmark_model for the full models, though. Is it because of the #[ignore] that you're putting on the test in test_model? If that's the problem, take a look at how egg uses macros to match on #[..] attributes: you could match them in the macro and attach them to the generated test. Then you could do

#[ignore = "too slow"]
benchmark_model!(...full mobilenet...);
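
For reference, a sketch of that pattern using the common `$(#[$attr:meta])*` trick (this is not egg's or this PR's exact macro; the macro body and file path are placeholders). Note that the attribute has to be written inside the macro invocation for `macro_rules!` to match it:

```rust
// Match any leading #[...] attributes and re-attach them to the generated
// test function, so callers can opt individual models out with #[ignore].
macro_rules! benchmark_model {
    ($(#[$attr:meta])* $test_name:ident, $relay_str_path:expr) => {
        $(#[$attr])*
        #[test]
        fn $test_name() {
            let relay = std::fs::read_to_string($relay_str_path).unwrap();
            // ...parse the Relay module, convert to Glenside, interpret,
            // and compare against TVM's output (elided)...
            let _ = relay;
        }
    };
}

// Usage: the #[ignore] is forwarded onto the generated #[test] fn.
// (The path is a placeholder.)
benchmark_model!(
    #[ignore = "too slow"]
    mobilenet_full,
    "models/mobilenet.relay"
);
```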

@hypercubestart (Collaborator, Author):

Ah, that's what I was looking for, thanks!

src/language/from_relay/mod.rs: two resolved review threads (outdated)
tests/mobilenet-relay-to-glenside.rs: resolved review thread (outdated)
@gussmith23 (Owner) left a comment

Sorry, forgot to mark "request changes"

tests/ingest-and-interpret-relay-models.rs: two resolved review threads (outdated)
@gussmith23 (Owner) left a comment

Looks great!!

@gussmith23 merged commit 52e2539 into master on Oct 20, 2020
@gussmith23 deleted the andrew_branch branch on November 20, 2021