Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Key terms support #229

Merged
merged 12 commits into from
Dec 6, 2023
Merged

Key terms support #229

merged 12 commits into from
Dec 6, 2023

Conversation

Enkidu93
Copy link
Collaborator

@Enkidu93 Enkidu93 commented Dec 1, 2023

Minor fixes to support changes in the Machine PR. In general, if you could leave the parallel PRs open until both are approved, that might be helpful.


This change is Reviewable

@Enkidu93 Enkidu93 requested a review from johnml1135 December 1, 2023 20:39
@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8c25097) 62.39% compared to head (62c192f) 62.39%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #229   +/-   ##
=======================================
  Coverage   62.39%   62.39%           
=======================================
  Files         134      134           
  Lines        5826     5826           
  Branches      912      912           
=======================================
  Hits         3635     3635           
  Misses       1803     1803           
  Partials      388      388           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)


tests/Serval.E2ETests/data/TestProject.zip line 0 at r1 (raw file):
We don't want to check in binary files. Can you check in the contents, and create the zip file in the test?

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 1, 2023

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)

tests/Serval.E2ETests/data/TestProject.zip line 0 at r1 (raw file): We don't want to check in binary files. Can you check in the contents, and create the zip file in the test?

OK, there was already a zip, but I can do that if you like.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)


tests/Serval.E2ETests/data/TestProject.zip line at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

OK, there was already a zip, but I can do that if you like.

I should have caught it on the PR that added this file. Unfortunately, every time we make changes to this file it adds 2MB to our repo size. I would like to correct that sooner rather than later.

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 1, 2023

tests/Serval.E2ETests/data/TestProject.zip line at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I should have caught it on the PR that added this file. Unfortunately, every time we make changes to this file it adds 2MB to our repo size. I would like to correct that sooner rather than later.

Done.

@johnml1135
Copy link
Collaborator

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 780 at r2 (raw file):

    /// A typical use case would be to set `"options"` to `{"max_steps":10}` in order to configure the maximum
    /// number of training iterations in order to reduce turnaround time for testing purposes. See [this](https://github.com/sillsdev/serval/wiki/Paratext-Key-Terms-Parsing)
    /// for using `"options"` to include Paratext key terms data in training.

It should be enabled by default - and clear from the documentation (it is not).

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 4, 2023

src/Serval.Translation/Controllers/TranslationEnginesController.cs line 780 at r2 (raw file):

    /// A typical use case would be to set `"options"` to `{"max_steps":10}` in order to configure the maximum
    /// number of training iterations in order to reduce turnaround time for testing purposes. See [this](https://github.com/sillsdev/serval/wiki/Paratext-Key-Terms-Parsing)
    /// for using `"options"` to include Paratext key terms data in training.

It should be enabled by default - and clear from the documentation (it is not).

Yep, I'm just getting to that. Just switched it to default in Machine.

@johnml1135
Copy link
Collaborator

tests/Serval.E2ETests/ServalApiTests.cs line 4 at r2 (raw file):

using System.Reflection.PortableExecutable;
using System.Runtime.InteropServices;
using DnsClient.Protocol;

If I am correct, all usings go into the global using document...

@johnml1135
Copy link
Collaborator

tests/Serval.E2ETests/ServalApiTests.cs line 365 at r2 (raw file):

        await _helperClient!.ClearEngines();
        ZipFile.CreateFromDirectory("../../../data/TestProject", "TestProject.zip");
        ZipFile.CreateFromDirectory("../../../data/TestProjectTarget", "TestProjectTarget.zip");

I like how you zip up the files so that changes to USFM can be tracked in git!

@johnml1135
Copy link
Collaborator

tests/Serval.E2ETests/ServalApiTests.cs line 365 at r2 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I like how you zip up the files so that changes to USFM can be tracked in git!

That is, zip them up during the test and leave them uncompressed in the repo.

@johnml1135
Copy link
Collaborator

tests/Serval.E2ETests/ServalApiTests.cs line 400 at r2 (raw file):

            new PretranslateCorpusConfig { CorpusId = corpus.Id, TextIds = new string[] { "JHN", "REV" } }
        );
        _helperClient.TranslationBuildConfig.Options = "{\"max_steps\":10, \"use_key_terms\":true}";

If it is added by default, why include it here? How can you test that it is actually being used?

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r1, 25 of 25 files at r2, 1 of 3 files at r3.
Reviewable status: 26 of 28 files reviewed, 5 unresolved discussions (waiting on @ddaspit and @Enkidu93)

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 4, 2023

tests/Serval.E2ETests/ServalApiTests.cs line 400 at r2 (raw file):

            new PretranslateCorpusConfig { CorpusId = corpus.Id, TextIds = new string[] { "JHN", "REV" } }
        );
        _helperClient.TranslationBuildConfig.Options = "{\"max_steps\":10, \"use_key_terms\":true}";

If it is added by default, why include it here? How can you test that it is actually being used?

I can remove it if you like - it shouldn't make a difference. The purpose of this test isn't really to capture that functionality/logic - that's covered in the NmtPreprocessBuildJobTests. I figured leaving it explicit might be beneficial, and it's also more robust if we were to change the defaults.

Copy link
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 26 of 28 files reviewed, 5 unresolved discussions (waiting on @ddaspit and @johnml1135)


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 780 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Yep, I'm just getting to that. Just switched it to default in Machine.

Done. Also updated the Wiki.


tests/Serval.E2ETests/ServalApiTests.cs line 4 at r2 (raw file):

Previously, johnml1135 (John Lambert) wrote…

If I am correct, all usings go into the global using document...

Done. These were mainly auto-usings added by VSCode inadvertently.


tests/Serval.E2ETests/ServalApiTests.cs line 365 at r2 (raw file):

Previously, johnml1135 (John Lambert) wrote…

That is, zip them up during the test and leave them uncompressed in the repo.

Yes, this is as per Damien's request - trackable. I also paired them down for the sake of space.

@johnml1135
Copy link
Collaborator

tests/Serval.E2ETests/ServalApiTests.cs line 400 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I can remove it if you like - it shouldn't make a difference. The purpose of this test isn't really to capture that functionality/logic - that's covered in the NmtPreprocessBuildJobTests. I figured leaving it explicit might be beneficial, and it's also more robust if we were to change the defaults.

That's fine.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 24 of 25 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 780 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Done. Also updated the Wiki.

It would probably be good to mention that we only include proper names.


tests/Serval.E2ETests/ServalApiTests.cs line 359 at r3 (raw file):

    {
        await _helperClient!.ClearEngines();
        ZipFile.CreateFromDirectory("../../../data/TestProject", "TestProject.zip");

We should wrap all of this in a try/finally so that we ensure that the zip files are deleted.

try
{
    ZipFile.CreateFromDirectory("../../../data/TestProject", "TestProject.zip");
    ZipFile.CreateFromDirectory("../../../data/TestProjectTarget", "TestProjectTarget.zip");
    // upload files
    ...
}
finally
{
    if (File.Exists("TestProject.zip"))
        File.Delete("TestProject.zip");
    if (File.Exists("TestProjectTarget.zip"))
        File.Delete("TestProjectTarget.zip");
}

It would also be good to create these files in a temp directory.

Copy link
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)


src/Serval.Translation/Controllers/TranslationEnginesController.cs line 780 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It would probably be good to mention that we only include proper names.

Done.


tests/Serval.E2ETests/ServalApiTests.cs line 359 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should wrap all of this in a try/finally so that we ensure that the zip files are deleted.

try
{
    ZipFile.CreateFromDirectory("../../../data/TestProject", "TestProject.zip");
    ZipFile.CreateFromDirectory("../../../data/TestProjectTarget", "TestProjectTarget.zip");
    // upload files
    ...
}
finally
{
    if (File.Exists("TestProject.zip"))
        File.Delete("TestProject.zip");
    if (File.Exists("TestProjectTarget.zip"))
        File.Delete("TestProjectTarget.zip");
}

It would also be good to create these files in a temp directory.

Done.

@ddaspit ddaspit requested a review from johnml1135 December 4, 2023 19:45
Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4, all commit messages.
Reviewable status: 27 of 28 files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)


tests/Serval.E2ETests/ServalApiTests.cs line 359 at r3 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Done.

I should have been more clear. I mean in the operating systems' temp folder. You can get it using Path.GetTempPath.

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 4, 2023

tests/Serval.E2ETests/ServalApiTests.cs line 359 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I should have been more clear. I mean in the operating systems' temp folder. You can get it using Path.GetTempPath.

No worries. It seemed odd to me. This makes more sense haha. Done.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)


tests/Serval.E2ETests/ServalApiTests.cs line 364 at r5 (raw file):

        try
        {
            ZipFile.CreateFromDirectory("../../../data/TestProject", $"{tempDirectory}/TestProject.zip");

You should use Path.Combine to construct paths.

@Enkidu93
Copy link
Collaborator Author

Enkidu93 commented Dec 4, 2023

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)

tests/Serval.E2ETests/ServalApiTests.cs line 364 at r5 (raw file):

        try
        {
            ZipFile.CreateFromDirectory("../../../data/TestProject", $"{tempDirectory}/TestProject.zip");

You should use Path.Combine to construct paths.

For all paths? E.g., including "../../somedir/somefile.ext"? Or in regards to the tempDirectory? If you want it everywhere, I should change it a few other places as well.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93 and @johnml1135)


tests/Serval.E2ETests/ServalApiTests.cs line 364 at r5 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

For all paths? E.g., including "../../somedir/somefile.ext"? Or in regards to the tempDirectory? If you want it everywhere, I should change it a few other places as well.

Yes, you should use it everywhere.

Copy link
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 27 of 28 files reviewed, 1 unresolved discussion (waiting on @ddaspit and @johnml1135)


tests/Serval.E2ETests/ServalApiTests.cs line 364 at r5 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Yes, you should use it everywhere.

OK, done. In Machine, I've also changed it except in the uses of shared service OpenRead following your usage of OpenWrite in the build job.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)

@Enkidu93 Enkidu93 merged commit 7ea2794 into main Dec 6, 2023
1 check passed
@Enkidu93 Enkidu93 deleted the key_terms_support branch December 6, 2023 20:28
Enkidu93 added a commit that referenced this pull request Dec 11, 2023
Enkidu93 added a commit that referenced this pull request Dec 11, 2023
Enkidu93 added a commit that referenced this pull request Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants