Test files for LexPascal and some issue with TestLexers #295

Open
HoTschir opened this issue Dec 15, 2024 · 5 comments
Labels: committed (Issue fixed in repository but not in release), pascal (Caused by the Pascal lexer)

Comments

@HoTschir
Contributor

HoTschir commented Dec 15, 2024

Hi,
attached is a patch that includes test files for the LexPascal lexer.
It adds the following files:

 test/examples/pascal/AllStyles.pas          
 test/examples/pascal/AllStyles.pas.folded   
 test/examples/pascal/AllStyles.pas.styled   
 test/examples/pascal/CodeFolding.pas        
 test/examples/pascal/CodeFolding.pas.folded 
 test/examples/pascal/CodeFolding.pas.styled 
 test/examples/pascal/SciTE.properties       
 test/examples/pascal/SomeExample.pas        
 test/examples/pascal/SomeExample.pas.folded 
 test/examples/pascal/SomeExample.pas.styled 

There is an issue while testing that I find hard to explain.
Testing with the attached files makes the LexPascal test fail. The behavior is strange: the resulting style file is correct on the first run, when AllStyles.pas.styled does not exist yet, but it differs on all later runs, after AllStyles.pas.styled.new has been renamed to AllStyles.pas.styled. The same happens for the other files in which SCE_PAS_ASM comes into use.

How to verify the issue:

  • erase *.styled and *.styled.new files
  • run script/RunTest.sh
  • rename AllStyles.pas.styled.new to AllStyles.pas.styled
  • run script/RunTest.sh again (note: no changes have been made to the lexer or anywhere else)
  • compare AllStyles.pas.styled.new vs. AllStyles.pas.styled
    among other things, you will see the following diff:
        **new wrong styled**                **correct style**
        {4}// asm                            {4}// asm
        {9}asm{14}                             {9}asm{14} 
          this is                               this is 
          inside assembler                      inside assembler
        {9}end{0}                            {9}end{0}
        {14} <---wrong styles--+                         
                               v     
        {5}{$if block_defined}{14}           {5}{$if block_defined}{0}
        {0}                            

Note: {14} is SCE_PAS_ASM            
  • erase AllStyles.pas.styled
  • run script/RunTest.sh
  • the content of AllStyles.pas.styled.new will again be the same as on the first run:
      {4}// asm
      {9}asm{14} 
        this is 
        inside assembler
      {9}end{0}
      
      
      {5}{$if block_defined}{0}

which is correct!

So the code styling is only correct if no *.styled file (without .new) exists.

I'm not sure whether the problem is located in TestLexers.

br
HoTschir

lexilla.devLexPascal.ufjn2a7iroa1j70j.patch.zip

@nyamatongwe
Member

This indicates a bug in the lexer with its handling of 'asm' sections. The first run, with no .styled file, will produce a .styled.new file but will only perform limited tests as there is no current .styled file to compare to. When the .styled.new file is copied to .styled, then TestLexers has (hopefully) correct style data that it will test in more ways.

Lexing is incremental: that is, if you modify line 100 of the file then the first 99 lines do not have to be processed again - just from the change to the bottom of the window. A common type of lexer bug is for the lexer to not completely recreate its state at the position it's been asked to start at, perhaps forgetting it is in (or out of) 'asm' sections. To check for these problems, TestLexers runs the lexer once over the whole file then once lexing each line individually and reports any difference.
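
The whole-file versus line-by-line comparison can be sketched with a toy lexer. This is only an illustration under stated assumptions, not TestLexers or Lexilla code: ToyDocument, LexLines and IncrementalLexingIsConsistent are hypothetical names, and the only lexical rule is a Pascal-style { ... } comment that may span lines. The restoreState flag simulates a lexer that forgets its state when it is restarted in the middle of the file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model (hypothetical, not Lexilla): style 1 = inside a { ... } comment, 0 = anything else.
struct ToyDocument {
    std::string text;
    std::vector<int> styles;              // one style per character
    std::vector<int> lineStates;          // 1 if the line ends inside a comment
    std::vector<std::size_t> lineStarts;  // start position of each line

    explicit ToyDocument(std::string t) : text(std::move(t)), styles(text.size(), 0) {
        lineStarts.push_back(0);
        for (std::size_t i = 0; i < text.size(); i++) {
            if (text[i] == '\n' && i + 1 < text.size()) lineStarts.push_back(i + 1);
        }
        lineStates.assign(lineStarts.size(), 0);
    }

    // Position just past the last character of the line (including its '\n').
    std::size_t LineEnd(std::size_t line) const {
        assert(line < lineStarts.size());
        return line + 1 < lineStarts.size() ? lineStarts[line + 1] : text.size();
    }
};

// Lex lines [firstLine, lastLine). When restoreState is true, the starting state is
// recreated from the previous line's saved state; when false, it is (wrongly) forgotten.
void LexLines(ToyDocument &doc, std::size_t firstLine, std::size_t lastLine, bool restoreState) {
    bool inComment = restoreState && firstLine > 0 && doc.lineStates[firstLine - 1] != 0;
    for (std::size_t line = firstLine; line < lastLine; line++) {
        for (std::size_t pos = doc.lineStarts[line]; pos < doc.LineEnd(line); pos++) {
            const char ch = doc.text[pos];
            if (!inComment && ch == '{') inComment = true;
            doc.styles[pos] = inComment ? 1 : 0;
            if (inComment && ch == '}') inComment = false;
        }
        doc.lineStates[line] = inComment ? 1 : 0;
    }
}

// Returns true when one pass over the whole text and a separate pass per line agree.
bool IncrementalLexingIsConsistent(const std::string &text, bool restoreState) {
    ToyDocument whole(text);
    LexLines(whole, 0, whole.lineStarts.size(), restoreState);

    ToyDocument perLine(text);
    for (std::size_t line = 0; line < perLine.lineStarts.size(); line++) {
        LexLines(perLine, line, line + 1, restoreState);
    }
    return whole.styles == perLine.styles && whole.lineStates == perLine.lineStates;
}
```

With the text "a {b\nc} d\ne\n", the check passes when restoreState is true and fails when it is false, because the second line is then lexed as if the comment had never been opened.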

Another cause of problems is with Windows versus Unix line endings '\r\n' versus '\n' as lexer authors commonly only use one operating system and only test for the line ends they use. This looks like the case here: there appears to be different behaviour when the file has '\n' line ends (as it was downloaded) with 'asm' sections not terminating at the 'end'. This can be reproduced interactively with AllStyles.pas in SciTE:

  • make 'asm' sections visible with a background colour style.pascal.14=fore:#804080,back:#FFFF00
  • open the file
  • using the scroll bar down arrow, scroll down line-by-line past the asm block
  • some lines after the 'end' will show the background colour

The problem is likely to be with remembering the curLineState in LineState with the code styler.SetLineState(curLine, curLineState). With a line that ends with 'end\n' the remembering occurs before the SCE_PAS_ASM is terminated and the stateInAsm flag turned off.
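
The ordering problem can be illustrated with a simplified model. This is a sketch under assumptions, not the real LexPascal code: SavedAsmStates is a hypothetical function, words are plain letter runs, and a word is only classified when the following non-letter character (here the '\n') is reached. The saveAtLoopEnd flag chooses whether the line state is saved before or after that classification.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Toy model (hypothetical, not LexPascal): returns, for each line ending in '\n',
// the "inside asm" flag that was saved as that line's state.
std::vector<bool> SavedAsmStates(const std::string &text, bool saveAtLoopEnd) {
    std::vector<bool> lineStates;
    std::string word;
    bool inAsm = false;
    for (const char ch : text) {
        const bool atLineEnd = ch == '\n';
        if (atLineEnd && !saveAtLoopEnd) {
            // Too early: a word ending at this '\n' has not been classified yet.
            lineStates.push_back(inAsm);
        }
        if (std::isalpha(static_cast<unsigned char>(ch))) {
            word += ch;
        } else {
            // A non-letter character terminates the current word.
            if (word == "asm") {
                inAsm = true;
            } else if (word == "end" && inAsm) {
                inAsm = false;
            }
            word.clear();
        }
        if (atLineEnd && saveAtLoopEnd) {
            // Saved after the word was classified, so the next line sees the final state.
            lineStates.push_back(inAsm);
        }
    }
    return lineStates;
}
```

For the text "asm nop\nend\nx\n", saving at the top of the loop records the 'end' line as still inside asm, so a lexer restarted on the following line would continue with the asm style. Saving at the bottom of the loop records it as outside asm.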

nyamatongwe added the pascal label Dec 15, 2024
zufuliu added a commit to zufuliu/notepad4 that referenced this issue Dec 16, 2024
@zufuliu
Contributor

zufuliu commented Dec 16, 2024

The bug can be fixed by moving the if (sc.atLineEnd) block to the end of the for loop:
Pascal-1216.patch

@@ -225,18 +225,14 @@ static void ColourisePascalDoc(Sci_PositionU startPos, Sci_Position length, int
 	CharacterSet setHexNumber(CharacterSet::setDigits, "abcdefABCDEF");
 	CharacterSet setOperator(CharacterSet::setNone, "#$&'()*+,-./:;<=>@[]^{}");
 
-	Sci_Position curLine = styler.GetLine(startPos);
-	int curLineState = curLine > 0 ? styler.GetLineState(curLine - 1) : 0;
+	int curLineState = 0;
 
 	StyleContext sc(startPos, length, initStyle, styler);
+	if (sc.currentLine > 0) {
+		curLineState = styler.GetLineState(sc.currentLine - 1);
+	}
 
 	for (; sc.More(); sc.Forward()) {
-		if (sc.atLineEnd) {
-			// Update the line state, so it can be seen by next line
-			curLine = styler.GetLine(sc.currentPos);
-			styler.SetLineState(curLine, curLineState);
-		}
-
 		// Determine if the current state should terminate.
 		switch (sc.state) {
 			case SCE_PAS_NUMBER:
@@ -335,6 +331,11 @@ static void ColourisePascalDoc(Sci_PositionU startPos, Sci_Position length, int
 				sc.SetState(SCE_PAS_ASM);
 			}
 		}
+
+		if (sc.atLineEnd) {
+			// Update the line state, so it can be seen by next line
+			styler.SetLineState(sc.currentLine, curLineState);
+		}
 	}
 
 	if (sc.state == SCE_PAS_IDENTIFIER && setWord.Contains(sc.chPrev)) {

@HoTschir
Contributor Author

This indicates a bug in the lexer with its handling of 'asm' sections. The first run, with no .styled file, will produce a .styled.new file but will only perform limited tests as there is no current .styled file to compare to. When the .styled.new file is copied to .styled, then TestLexers has (hopefully) correct style data that it will test in more ways. ...

My assumption was that the same tests are executed on every run. Thanks for the excellent research and explanation. As far as I can see, a patch already exists. Thanks and br, HoTschir

@nyamatongwe
Member

There appear to be 3 identical sections in AllStyles.pas that don't add value but do add bulk and thus extra work for anyone examining the cases.

Some other lexers track the preprocessor state and can style inactive code in a distinctive way. The duplication might be useful for such a lexer but the Pascal lexer does not support this feature so it would be better to remove the copies from this test file.

@HoTschir
Contributor Author

AllStyles.pas was already prepared for a feature enhancement to LexPascal in which an inactive preprocessor block is styled differently from an active block.

Attached is a correction patch that removes 2 sections.

lexilla.devLexPascal.20241220.testfile.AllStyles.zip

nyamatongwe pushed a commit that referenced this issue Dec 21, 2024
 Add test files for Pascal from HoTschir.
nyamatongwe added the committed label Dec 21, 2024