Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@ references in commemorations can loop infinitely #525

Open
gregordick opened this issue May 15, 2016 · 16 comments
Open

@ references in commemorations can loop infinitely #525

gregordick opened this issue May 15, 2016 · 16 comments
Labels

Comments

@gregordick
Copy link
Contributor

Pasc6-0 had this:

[Commemoration 3]
@:Commemoratio 2

Infinite loop ensued.

@fiapps
Copy link
Contributor

fiapps commented Jun 16, 2024

#3869 shows the importance of addressing this. The usual way to prevent an infinite loop is to count how many references have been followed. When this nesting depth exceeds some value, e.g., 10, the code returns an error like "References too deeply nested" instead of continuing to follow references.

@FAJ-Munich
Copy link
Contributor

Interesting issue; I've never understood what the original problem with the commemoratio3 was about. So thanks for explaining about the nesting issue. I guess the easiest way to avoid too deep nesting is to reference the actual text source as directly as possible, irrespective of the nesting in the breviary books?

@fiapps
Copy link
Contributor

fiapps commented Jun 17, 2024

Interesting issue; I've never understood what the original problem with the commemoratio3 was about. So thanks for explaining about the nesting issue. I guess the easiest way to avoid too deep nesting is to reference the actual text source as directly as possible, irrespective of the nesting in the breviary books?

I would say that nesting should reflect the structure of the data in this repository. Since the repository does contain duplicated texts, it may not always be obvious which location is the definitive one to reference, so deep nesting could be created legitimately or at least in good faith. The problem isn't deep nesting, but circular nesting, and sometimes the circularity is not obvious.

When the issue was reported, [Commemoratio 3] was defined as @:Commemoratio 2, which in turn referenced @Tempora/Pasc5-4:Oratio. That didn't look circular, and I'm not certain what went wrong, but I think that when the code tried to resolve @:Commemoratio 2, it said, "oh, in this case I should be looking for Commemoratio 2 in the [Commemoratio 3] section." It did, found the section's content was @:Commemoratio 2, and tried to look that up, adjusting it yet again to [Commemoratio 3], and so went round and round until someone killed the process.

I ran into another case years ago was when I was trying to add an Italian translation (I never finished, and eventually someone provided a different translation). I saw that a particular common had texts that were duplicated between the Easter file for the common and the regular file for the common: I forget which it was, but let's say C6. So I put references in the Easter file (C6p) to the regular file (C6). An infinite loop ensued when Easter offices were generated. Why? When the program looked up the common, it saw it was Easter, so it opened C6p.txt and tried to retrieve a section. It found a reference to the corresponding section in C6. But it's Easter, so references to C6 should be interpreted as C6p. So it reopened C6p and found in the same place the same reference to C6, interpreted it as C6p, etc. The mechanism for commons in Easter has recently been changed, so the same thing wouldn't happen today, when C6p only contains things that are different from C6.

These infinite loops in references tend to be pretty small: either something is referring to itself because of the way the program adjusts the reference, or perhaps there are two references that refer to each other.

Until this issue is resolved, testing is important to make sure you don't render the server unusable. Even after the issue is resolved, you still have to test to make sure there's no error, but at least users will be able to see the rest of the office or Mass that's not affected by the circular reference, and the error won't cause high CPU usage on the server.

FAJ-Munich added a commit to FAJ-Munich/divinum-officium that referenced this issue Jun 30, 2024
Preventing some cause of infinite loops for Matins Gospel in Monastic
APMarcello3 added a commit that referenced this issue Jun 30, 2024
@APMarcello3
Copy link
Contributor

@d4h0y0sr The refactoring that has taken place over the past month has caused excessive cost overruns on the live site, specifically stemming from missa.pl. Please review all of your PRs that you have submitted and ensure proper references. Thank you.

@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 7, 2024

Do you have any clue to review this for testing?, I created a python script that I use to avoid loops in the references, until know i have not detect any unproper reference.

@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 7, 2024

@fiapps would this code generate a infinity loop?

File: C6bp.txt

@Commune/C6b

[GradualeP]
@Commune/C6

[Evangelium]
@Commune/C6

[Graduale]
@Commune/C6:GradualeP

@APMarcello3
Copy link
Contributor

Are you using divinum_replay.pl?

For the moment I would request the following adjustments be made to all refactored Missa texts:

  1. Replace missing elements, such as deleted [Offertorium], etc.

  2. For the moment, avoid nesting references and make references directly to an explicit text.

@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 7, 2024

Are you using divinum_replay.pl?

For the moment I would request the following adjustments be made to all refactored Missa texts:

1. Replace missing elements, such as deleted [Offertorium], etc.

2. For the moment, avoid nesting references and make references directly to an explicit text.

What is divinum_replay?

@APMarcello3
Copy link
Contributor

@d4h0y0sr See folder admin/divinum_replay.pl. It provides regression testing.

@fiapps
Copy link
Contributor

fiapps commented Jul 7, 2024

@fiapps would this code generate a infinity loop?

File: C6bp.txt

@Commune/C6b

[GradualeP]
@Commune/C6

[Evangelium]
@Commune/C6

[Graduale]
@Commune/C6:GradualeP

I wouldn't expect this to produce an infinite loop, but I only work with the horas part of the database, and as my examples above show, the ways loops are produced aren't always obvious. A script to detect loops is good, but ultimately, you have to actually test missa.pl.

What is divinum_replay?

This is the script used in conjunction with divinum-get.pl for testing. Both are in the admin directory of the repo. divinum-get.pl downloads the text of the Mass or of an hour of the breviary and saves it to a file. The header of this file contains the URL from which it was downloaded. divinum-replay.pl takes such a file and downloads the current text for the URL. A flag can be used to make it download from another server, typically a local server running the code you want to test, e.g, --url='http://localhost:8000/'. It compares the results and reports lines that have been added, removed, or changed.

The most rigorous way to test is by using the Docker container to run a local server, but since the Dockerfile copies the contents of web into the container, this means you have to rebuild the container every time you change the repo. What I do for testing my fork is run python3 -m http.server --bind localhost --cgi 8000 in the web directory of the repo. This uses the version of Perl I have installed locally, which isn't exactly the same as the one used in the Docker container that runs on divinumofficium.com, but any changes to the repo are immediately live on my local server.

As for test cases, I maintain a set of over 1000 cases for testing my variant of officium.pl, and continue to use the same cases week after week, using divinum-replay.pl --update to update them when the differences reported don't represent an error. You could also just do one-off testing, where after refactoring a set of Masses, you create test cases to cover that set and compare the results between the live site (or a local copy of it) and your local changes.

FAJ-Munich added a commit to FAJ-Munich/divinum-officium that referenced this issue Jul 7, 2024
APMarcello3 added a commit that referenced this issue Jul 7, 2024
Brute Force solution to #525 + Missa version names
APMarcello3 added a commit that referenced this issue Jul 7, 2024
Revert "Brute Force solution to #525 + Missa version names"
@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 7, 2024

The most rigorous way to test is by using the Docker container to run a local server, but since the Dockerfile copies the contents of web into the container, this means you have to rebuild the container every time you change the repo. What I do for testing my fork is run python3 -m http.server --bind localhost --cgi 8000 in the web directory of the repo. This uses the version of Perl I have installed locally, which isn't exactly the same as the one used in the Docker container that runs on divinumofficium.com, but any changes to the repo are immediately live on my local server.

I dont use docker, I use WSL with perl installed and python. So if i understood well, this script will compare the content of my local copy vs the content of the website. I have a question about how to use this script, do you have an example with the parameters to run it over a certain mass? thanks

FAJ-Munich added a commit that referenced this issue Jul 8, 2024
Brute force solution to #525 only
@fiapps
Copy link
Contributor

fiapps commented Jul 8, 2024

I dont use docker, I use WSL with perl installed and python. So if i understood well, this script will compare the content of my local copy vs the content of the website. I have a question about how to use this script, do you have an example with the parameters to run it over a certain mass? thanks

As I said @d4h0y0sr, I don't use the missa part of the project, so I don't have an example for you. Someone who uses it to test that part of the project will have to help you. I tried perl admin/divinum-get.pl --prayer=SanctaMissa --version='Rubrics 1960' --from=07-08-2024 --to=07-08-2024, but the file it saved just contains two header lines, with no result content.

@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 8, 2024

I dont use docker, I use WSL with perl installed and python. So if i understood well, this script will compare the content of my local copy vs the content of the website. I have a question about how to use this script, do you have an example with the parameters to run it over a certain mass? thanks

As I said @d4h0y0sr, I don't use the missa part of the project, so I don't have an example for you. Someone who uses it to test that part of the project will have to help you. I tried perl admin/divinum-get.pl --prayer=SanctaMissa --version='Rubrics 1960' --from=07-08-2024 --to=07-08-2024, but the file it saved just contains two header lines, with no result content.

Ok, first i think there could be a bug (or maybe something just with perl version):

this line already do the encode:
query_form(\%encoded)
so this line:
$_ = url_encode($_) for values %encoded;
its encoding twice, making version going from "Rubrics+1960" to "Rubrics%2B1960" which is wrong
But even after that, the file is empty, so i dont know to test using this file.

So if i want to do testing by myself, is this the testing algorithm?:

Compare

  • local content of url response
    vs
  • divinum website response

would be enough for testing?

@APMarcello3 could you help me by provide me examples of how to use replay script for mass? and also how to check the excessive cost overruns? i knew one of my PR cause an issue because a circular reference which i fixed the next day, but i have applied Benchmarking web page load time of the whole 2024 and i dont see any major issue.

@FAJ-Munich
Copy link
Contributor

The usual way to prevent an infinite loop is to count how many references have been followed. When this nesting depth exceeds some value, e.g., 10, the code returns an error like "References too deeply nested" instead of continuing to follow references.

This method described by @fiapps above is now used by means of 8a16867 with a limit of 6 for the loop. Testing it with a self-reference, that still is a lot of CPU work before breaking out of it since there is exponential growth in the process in such a case.

@fiapps
Copy link
Contributor

fiapps commented Jul 9, 2024

The usual way to prevent an infinite loop is to count how many references have been followed. When this nesting depth exceeds some value, e.g., 10, the code returns an error like "References too deeply nested" instead of continuing to follow references.

This method described by @fiapps above is now used by means of 8a16867 with a limit of 6 for the loop. Testing it with a self-reference, that still is a lot of CPU work before breaking out of it since there is exponential growth in the process in such a case.

I wouldn't expect there to be any real need for references nested more than 2 or 3 deep. I can imagine one section (Y) using a reference to X with a substitution, and another section (Z) using a reference to Y because its content should be identical. That's only 2 levels deep. More levels could result from X being replaced with a reference at a later date, but there wouldn't be an inherent need for those extra levels, they would just be a side effect of reorganization that hadn't been thoroughly carried out.

So I think a value less than 6 could work, but testing would be required to make sure nothing has been broken. Even a limit of 6 is probably a major cost savings, because I would expect a web server to let Perl run for something like 30–120 seconds before it decides to kill it automatically.

@d4h0y0sr
Copy link
Contributor

d4h0y0sr commented Jul 9, 2024

I agree with you @fiapps and @FAJ-Munich I already have a list of the explicit references and all of them avoided more than 2 levels deep, I dont see this is an issue right now.

Thats why i asked @APMarcello3, I could not find the issue he is mentioning about the excessive cost overruns, at least in the current state with my PR.

So if there is not any issue reported I will continue with the translations and using references with the testing as i had done with my recents PR.

FAJ-Munich added a commit to FAJ-Munich/divinum-officium that referenced this issue Sep 26, 2024
including some additional safeguarding regarding paschaltide Commune reference for DivinumOfficium#525 pushing it further to be closed;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants