Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Backport GC4 new dump feature to GC3 #131

Draft
wants to merge 1 commit into
base: gcos4gnucobol-3.x
Choose a base branch
from

Conversation

ddeclerck
Copy link
Contributor

@ddeclerck ddeclerck commented Jan 16, 2024

Following #130 , attempt to backport the dump / symtab feature from GC4 to GC3.

Note that I kept the "old" text dump for backward compatibility.

"Mostly" works, but breaks 25 tests, mainly due to subtle divergences betwen the GC3 and GC4 branches.
I could fix some. For others, I'm not sure about the right way to fix them.
Here are a few of the problematic ones and my observations.

990: FUNCTION BYTE-LENGTH
1030: FUNCTION LENGTH

Generates invalid C: missing field variable. This is due to the fact that GC3 inlines calls to LENGTH on fields with a NATIONAL category while GC4 does not inline them, preventing the generation of the field in the C code. Note that LENGTH of fields with an ALPHANUMERIC category are also inlined but do not suffer for this problem, possibly because of the field initialization code (I suppose because initialize_uniform_char is defined for ALPHANUMERIC but not NATIONAL).

679: LOCAL-STORAGE (3)

Generates invalid C: storing the address of a non-static index variable into a static variable. This is due to a fix (bug 794) in GC3, which allowed the index type to be either CB_INT_INDEX or CB_STATIC_INT_INDEX (see parser.y:occurs_index). In GC4, it is always CB_STATIC_INT_INDEX thus we never run into problems.

1163: System routine C$NARG

Generates invalid C : storing the address of a non-static variable into a static variable. This is due to a different handling of procedure parameters (USING...) between GC3 and GC4. In GC3, the C function generated for a procedure with parameters has the field passed as arguments, while in GC4 these parameters are stored in (static) field in the local storage.

808: stack and dump feature
809: dump feature with NULL address

Divergence in the ouput, probably because the old and new dumps don't do exactly the same thing. I yet have to investigate more into this.

@ddeclerck ddeclerck changed the title [WIP] Backport GC4 new dump feature tu GC3 [WIP] Backport GC4 new dump feature to GC3 Jan 21, 2024
Copy link
Collaborator

@GitMensch GitMensch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check the merged code in trunk via git blame or svn annotate, this should get you:

  • the revision numbers (please just write them down for now), which is useful for tracking the manual cherry-pick merge in SVN
  • and by that also the Changelog entries which should be merged, too
  • and the testcases that were added; ideally there is one that verifies that the "not raise a SIGSEGV during dump" code does work (and therefore can be easily tested on several systems) [that security feature is new, the old dump code (which may can be adjusted to use this new feature) would abort if there's a memory issue during the dump]

And of course: add an own ChangeLog entry in both Changelogs "backport new dump code, by keeping the old for backward-compatibility" and then the dates of the merged changelog entries as a textual reference.


I suppose because initialize_uniform_char is defined for ALPHANUMERIC but not NATIONAL

NATIONAL has two chars (we currently only handle it as UTF16, the later possible addition to UTF32 needs changes in a lot of places) so the single character memset won't work, but we could define NATIONAL_SPACES and NATIONAL_ZEROES as constants similar to the existing ones and use them in several places. Something to keep in mind for later (or adding an issue for).

Divergence in the ouput, probably because the old and new dumps don't do exactly the same thing. I yet have to investigate more into this.

Please do so and drop a note once you have more information.

@@ -4198,6 +4204,9 @@ process_command_line (const int argc, char **argv)
cb_flag_trace = 1;
cb_flag_source_location = 1;
}
if (cb_flag_trace) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add || cb_flag_dump != COB_DUMP_NONE here, then you can remove it from cobc_def_dump_opts () completely

@@ -8937,6 +8946,8 @@ finish_setup_compiler_env (void)

if (getenv ("COBC_GEN_DUMP_COMMENTS")) {
cb_wants_dump_comments = 1;
/* Disable "new" symbol table if requesting dump comments */
cb_flag_symbols = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please leave cb_wants_dump_comments outside of that; if it is set then the old dump code should be generated - but only as comments (this should already be in its code)
please add a separate if (getenv ("COBC_GEN_DUMP_AS_CODE")) {cb_flag_symbols = 0;] here, so we can enable it if necessary - and until 4 also to test the backward compatibility of libcob.

@GitMensch
Copy link
Collaborator

GitMensch commented Jan 25, 2024

And just to check: would this already solve the initial needs which were the reasons for #130?

Note; we should take this new code live once it works, as it:

  • creates smaller code and is faster on the C compiler side (by numbers, if the programs have a huge number of copybooks, ähm, variables)
  • handles the SIGSEGV during dump issue, which is important for the call_with_exception stuff (and when being called for example from JAVA)

@ddeclerck
Copy link
Contributor Author

Don't know where I'm going with this, but in essence, the main problem comes from new features in GC3 that were not included yet in GC4 (and not the opposite). For instance, the changes introduced in GC3 by commits 4653 and 4886 actually break the dump feature if we port them to GC4. So that would probably mean it would be best to first forward port from GC3 to GC4 any new commit that might break the dump feature and make the appropriate fixes in GC4, before backporting the whole dump feature to GC3...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants