move rag answer correctness metrics tests to test_metrics.py #1131

matanor · 2024-08-11T11:27:55Z

No description provided.

elronbandel

I think this test should be in the preparation card without the if __name__ == "__main__": guard, so it runs every time someone imports this file. The difference is that here it is an activation of an existing class (MetricPipeline and its inner metric). That way it will be tested in the preparation tests for every PR.

elronbandel · 2024-08-11T12:09:32Z

Related: #1132

matanor · 2024-08-11T12:17:15Z

> I think this test should be in the preparation card without the if __name__ == "__main__": guard, so it runs every time someone imports this file. The difference is that here it is an activation of an existing class (MetricPipeline and its inner metric). That way it will be tested in the preparation tests for every PR.

Hmm... for context_correctness we just made the same transition in #1092 (you advised it), and these are the same kind of tests (MetricPipeline over existing metrics).

Also, if you look at https://github.com/IBM/unitxt/blame/main/prepare/metrics/rag_answer_correctness.py#L63, @eladven explicitly added the if __name__ == "__main__": statement 3 weeks ago (I think because these metrics use models).

matanor · 2024-08-14T10:01:38Z

@elronbandel @dafnapension @yoavkatz

So, what should I do with this PR? Close it?

Generally, where should tests that involve models go?

yoavkatz · 2024-08-14T10:47:39Z

There are two issues here: (1) where to place the tests, and (2) when we run the tests.

The rule is that Python metric classes should be tested in test_metrics, while catalog assets should be checked in the prepare file where the catalog assets are added. Since in your case you are testing assets, the test should be in the prepare file.

The second point is that we cannot afford to run tests that require loading large models on each PR, because each PR has to download the models to a new VM. This is why long tests are placed under the if __name__ == "__main__": condition, so they are executed only if explicitly run.
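To illustrate the first half of that rule, here is a minimal sketch of a test for a Python metric class in the style of a test_metrics module. `ToyExactMatch` and `TestToyMetrics` are hypothetical stand-ins written for this example; they are not unitxt classes, and the real tests may be organised differently.

```python
import unittest


class ToyExactMatch:
    """Hypothetical stand-in for a Python metric class (not part of unitxt)."""

    def compute(self, prediction: str, references: list[str]) -> dict:
        # Score 1.0 if the prediction matches any reference after stripping whitespace.
        score = float(any(prediction.strip() == ref.strip() for ref in references))
        return {"exact_match": score}


class TestToyMetrics(unittest.TestCase):
    """Tests the class itself: no catalog asset and no model download involved."""

    def test_exact_match(self):
        metric = ToyExactMatch()
        self.assertEqual(
            metric.compute("Paris", ["Paris", "paris"]), {"exact_match": 1.0}
        )
        self.assertEqual(metric.compute("Rome", ["Paris"]), {"exact_match": 0.0})
```

A test like this is cheap, so it can run on every PR.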
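The guard described above can be sketched roughly as follows. This is an illustration only: `CATALOG`, `add_to_catalog_sketch`, `run_model_based_test`, and the asset name `metrics.toy_example` are hypothetical stand-ins, not the actual unitxt API or the contents of the real prepare file.

```python
CATALOG = {}  # hypothetical in-memory stand-in for the catalog


def add_to_catalog_sketch(asset, name):
    """Hypothetical stand-in for registering an asset; cheap enough to run every time."""
    CATALOG[name] = asset


def run_model_based_test(asset):
    """Hypothetical stand-in for a test that would download and run a large model."""
    return {"asset": asset, "checked": True}


# Runs whenever the file is imported or executed: cheap registration only.
add_to_catalog_sketch({"type": "metric_pipeline"}, "metrics.toy_example")

if __name__ == "__main__":
    # Runs only when the file is executed directly (e.g. `python this_file.py`),
    # so importing the module never triggers the expensive test.
    print(run_model_based_test(CATALOG["metrics.toy_example"]))
```

The trade-off is the one raised in this thread: code under the guard is skipped whenever the file is only imported, so it has to be run by hand.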

@elronbandel - how do you suggest we avoid this and still keep reasonable runtimes?

matanor · 2024-08-15T05:26:19Z

Thanks @yoavkatz. This PR tests with a model, so I will close it (and we keep the test in the prepare script, under the if __name__ == "__main__": condition).

A question about another test, please:
The test_context_correctness test is in test_metrics.py. It tests the RAG context correctness metrics, which are MetricPipeline objects.
Is this considered testing catalog assets? Should I move it to rag_context_correctness.py?

Generally, tests in test_metrics.py and tests placed in the prepare cards (without the if __name__ == "__main__": condition) are both executed with each PR, right?

yoavkatz · 2024-08-15T08:32:42Z

> Thanks @yoavkatz. This PR tests with a model, so I will close it (and we keep the test in the prepare script, under the if __name__ == "__main__": condition).
>
> A question about another test, please: The test_context_correctness test is in test_metrics.py. It tests the RAG context correctness metrics, which are MetricPipeline objects. Is this considered testing catalog assets? Should I move it to rag_context_correctness.py?
>
> Generally, tests in test_metrics.py and tests placed in the prepare cards (without the if __name__ == "__main__": condition) are both executed with each PR, right?

Yes. It makes sense to move test_context_correctness to the prepare file.

And yes, test_metrics is run in the Test Library Code GitHub action, and the prepare cards are run in the Test Catalog Preparation GitHub action. Both run on every PR.

move rag context correctness metrics tests to test_metrics.py

ef6d4e1

matanor requested a review from elronbandel August 11, 2024 11:28

elronbandel requested changes Aug 11, 2024

matanor requested a review from dafnapension August 14, 2024 09:43

matanor changed the title ~~move rag context correctness metrics tests to test_metrics.py~~ move rag answer correctness metrics tests to test_metrics.py Aug 15, 2024

matanor closed this Aug 15, 2024

matanor mentioned this pull request Sep 9, 2024

Move test_context_correctness #1207

Merged
