Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve instruction documentation #1560

Open
3 of 8 tasks
nummacway opened this issue Nov 20, 2024 · 8 comments · May be fixed by #1561
Open
3 of 8 tasks

Improve instruction documentation #1560

nummacway opened this issue Nov 20, 2024 · 8 comments · May be fixed by #1561
Assignees
Labels
docs This affects the documentation (web-specific issues go to rgbds-www) rgbasm This affects RGBASM
Milestone

Comments

@nummacway
Copy link

nummacway commented Nov 20, 2024

  • STOP
    When reading through gbz80(7)'s documentation, I noticed it says STOP was 2 bytes long, when the instruction itself is only 1, not located in CB and doesn't take useful arguments. People on Discord pointed me to this chart (which BTW misses a "Yes" label at the very bottom). I suggest the documentation say something like this:

    Because the CPU might take the next byte as either a useless argument to the STOP instruction or as a new opcode, rgbds will always add an implicit NOP for safety. Visit Pan Docs for more information.

Some other suggestions:

  • Legend:
    • Exclamation mark is only used once, and that's in the legend. There's no need to explain symbols that are never used.
    • u3 should say that it always refers to bit indices, and if 0 refers to the least or most significant bit.
  • Remove pointless anchors, e.g. any of these that aren't a heading. By using most browsers' default style of <abbr> HTML tag, one would expect the anchors to have a hint or point to the legend (or something like that), when in fact it's just a useless link. The legend of the 8-bit instructions looks really odd as B, D, E and L are underlined, but A, C, and H are not.
  • CPL should mention that the operation is also known as "bitwise NOT". I do programming for 20 years (though in high-level languages) and this was the first time I ever saw this operation being called the complement.
  • It could say that LD r8,r8 doesn't do anything at all if both arguments are the same. I thought that there must be a use if these exist, but LD [HL], [HL] does not — but there's no use, except for BGB using LD B, B for breakpoints and LD D, D for debug.
  • Several commands that work with a constant (literal, value, immediate data) could state that more efficient shortcuts exist if that value is 0, e.g. LD A, 0 could be written as XOR A, and CP 0 could be written as AND A or OR A, although flags are different. I acknowledge that this document is probably the wrong place for notes like that. If so, is there a location for community notes and advice on individual instructions?
@ISSOtm ISSOtm transferred this issue from gbdev/rgbds-www Nov 24, 2024
@ISSOtm
Copy link
Member

ISSOtm commented Nov 24, 2024

I've taken the liberty of reformatting your comment, so that it becomes a checklist for us :)

  • STOP
    When reading through gbz80(7)'s documentation, I noticed it says STOP was 2 bytes long, when the instruction itself is only 1, not located in CB and doesn't take useful arguments. People on Discord pointed me to this chart (which BTW misses a "Yes" label at the very bottom). I suggest the documentation say something like this:
    > Because the CPU might take the next byte as either a useless argument to the STOP instruction or as a new opcode, rgbds will always add an implicit NOP for safety. Visit Pan Docs for more information.

You might be surprised to learn that rgbasm actually doesn't always insert a nop: stop $FF will actually emit 10 FF! It would indeed be a good idea to document this—it's otherwise not documented anywhere.

The issue becomes finding the right wording. First off, since rgbasm always emits two bytes, I'd keep the canonical length at 2, and mention that the CPU can treat the instruction as 1-byte under specific circumstances, linking to Pan Docs. And finally, mentioning that for that purpose, stop <byte> syntax is available, with the default being equivalent to stop $00. How does that sound?

Some other suggestions:

Yes, thank you! These sound like improvements.

  • Legend:

    • Exclamation mark is only used once, and that's in the legend. There's no need to explain symbols that are never used.

I'm not sure what you're talking about here. Can you point at / link to a specific example?

  • u3 should say that it always refers to bit indices, and if 0 refers to the least or most significant bit.

Perhaps we should call it something else, then. Maybe bit_idx?

  • Remove pointless anchors, e.g. any of these that aren't a heading. By using most browsers' default style of <abbr> HTML tag, one would expect the anchors to have a hint or point to the legend (or something like that), when in fact it's just a useless link. The legend of the 8-bit instructions looks really odd as B, D, E and L are underlined, but A, C, and H are not.

Ah, well, we don't really control the HTML generation. mandoc generates it from the man pages. I agree it would be better to not have those spurious anchors (they exist on all pages, FWIW), but I couldn't figure out a way to remove them that wasn't troublesome. Feel free to give it a shot if you'd like, though?

  • CPL should mention that the operation is also known as "bitwise NOT". I do programming for 20 years (though in high-level languages) and this was the first time I ever saw this operation being called the complement.

Interesting. I thought the term was more widespread. Sure, feel free to open a PR against man/gbz80.7!

  • It could say that LD r8,r8 doesn't do anything at all if both arguments are the same. I thought that there must be a use if these exist, but LD [HL], [HL] does not — but there's no use, except for BGB using LD B, B for breakpoints and LD D, D for debug.

That sounds fair. We can, and maybe should, link to https://bgb.bircd.org/manual.html#expressions to explain ld d, d?

  • Several commands that work with a constant (literal, value, immediate data) could state that more efficient shortcuts exist if that value is 0, e.g. LD A, 0 could be written as XOR A, and CP 0 could be written as AND A or OR A, although flags are different. I acknowledge that this document is probably the wrong place for notes like that. If so, is there a location for community notes and advice on individual instructions?

Optimisation tips indeed definitely don't belong in gbz80.7, which is only intended as an instruction syntax (and, somewhat accidentally, behaviour too) reference. Optimisation tips like this are present on the pokecrystal wiki, but I think they should be ported as a guide on gbdev.io. (@Rangi42 may have some opinion there, as a pokecrystal maintainer.)

@Rangi42 Rangi42 added this to the v0.9.0 milestone Nov 24, 2024
@Rangi42 Rangi42 added docs This affects the documentation (web-specific issues go to rgbds-www) rgbasm This affects RGBASM labels Nov 24, 2024
@nummacway
Copy link
Author

nummacway commented Nov 24, 2024

I've taken the liberty of reformatting your comment, so that it becomes a checklist for us :)

  • STOP
    When reading through gbz80(7)'s documentation, I noticed it says STOP was 2 bytes long, when the instruction itself is only 1, not located in CB and doesn't take useful arguments. People on Discord pointed me to this chart (which BTW misses a "Yes" label at the very bottom). I suggest the documentation say something like this:

    Because the CPU might take the next byte as either a useless argument to the STOP instruction or as a new opcode, rgbds will always add an implicit NOP for safety. Visit Pan Docs for more information.

You might be surprised to learn that rgbasm actually doesn't always insert a nop: stop $FF will actually emit 10 FF! It would indeed be a good idea to document this—it's otherwise not documented anywhere.

The issue becomes finding the right wording. First off, since rgbasm always emits two bytes, I'd keep the canonical length at 2, and mention that the CPU can treat the instruction as 1-byte under specific circumstances, linking to Pan Docs. And finally, mentioning that for that purpose, stop <byte> syntax is available, with the default being equivalent to stop $00. How does that sound?

Great!

Some other suggestions:

Yes, thank you! These sound like improvements.

  • Legend:

    • Exclamation mark is only used once, and that's in the legend. There's no need to explain symbols that are never used.

I'm not sure what you're talking about here. Can you point at / link to a specific example?

image

  • u3 should say that it always refers to bit indices, and if 0 refers to the least or most significant bit.

Perhaps we should call it something else, then. Maybe bit_idx?

I'm not a 100% sure. But with this, noone will first think that this is some kind of optimized general-purpose integer, which was my first reaction.

  • Remove pointless anchors, e.g. any of these that aren't a heading. By using most browsers' default style of <abbr> HTML tag, one would expect the anchors to have a hint or point to the legend (or something like that), when in fact it's just a useless link. The legend of the 8-bit instructions looks really odd as B, D, E and L are underlined, but A, C, and H are not.

Ah, well, we don't really control the HTML generation. mandoc generates it from the man pages. I agree it would be better to not have those spurious anchors (they exist on all pages, FWIW), but I couldn't figure out a way to remove them that wasn't troublesome. Feel free to give it a shot if you'd like, though?

Ah! Because the HTML files were just static files in the repository where this issue was originally posted, I thought this was easily changed, though I did wonder how they even got there in the first place.

  • CPL should mention that the operation is also known as "bitwise NOT". I do programming for 20 years (though in high-level languages) and this was the first time I ever saw this operation being called the complement.

Interesting. I thought the term was more widespread. Sure, feel free to open a PR against man/gbz80.7!

At least Wikipedia uses the "bitwise NOT" term first, and then calls "bitwise complement" and alias. I primarily use Delphi (Pascal) for programming, a language that does not distinguish between bitwise and logical operators at all (except for shifts, which have no logical "true/false" equivalent), and also spelling these operators out.

All these suggestions are only about gbz80(7)'s manpage, so I'm not sure why to create a PR for just this? Is this because this is more debatable than the others?

  • It could say that LD r8,r8 doesn't do anything at all if both arguments are the same. I thought that there must be a use if these exist, but LD [HL], [HL] does not — but there's no use, except for BGB using LD B, B for breakpoints and LD D, D for debug.

That sounds fair. We can, and maybe should, link to https://bgb.bircd.org/manual.html#expressions to explain ld d, d?

If that's in scope of the document.

@Rangi42
Copy link
Contributor

Rangi42 commented Nov 24, 2024

Be sure to use the master-branch docs; we've already moved !cc documentation from gbz80.7 to rgbasm.5 (https://rgbds.gbdev.io/docs/master/gbz80.7).

@nummacway
Copy link
Author

I see, this was still cached. I had no idea my browser's cache could live for so long!

@Rangi42
Copy link
Contributor

Rangi42 commented Nov 24, 2024

Re: the STOP instruction, however we document it should agree with the opcode table (which currently also says 2 bytes): https://gbdev.io/gb-opcodes/optables/classic

It appears common for SM83 assemblers to output a two-byte $10 $00 form.

  • RGBASM does (with optional stop $xx for $10 $xx).
  • ASMotor does:
    handleStop(SInstruction* instruction, SAddressingMode* addrMode1, SAddressingMode* addrMode2) {
    	sect_OutputConst8(0x10);
    	sect_OutputConst8(0x00);
    	return true;
    }
  • CA65 with the CA83 macro pack does:
    .macro stop
      .byte $10, $00
    .endmacro
  • SDAS does:
            case S_STOP:    /* 0x10 */
                    /*
                     * 0x10 : STOP
                     */
                    outab(op);
                    outab(0x00);
                    break;
  • The original ISAS assembler did, as inferred from the xtal/pmcsrc/TOOL_00.DMG file of the Pokemon Crystal source code:
    switch_speed:
    	set	0,(hl)
    	xor	a
    	ld	(IFL),a
    	ld	(IE),a
    	ld	a,$30
    	ld	(P1),a
    	stop
    	ret
    (pret's pokecrystal disassembly likewise has stop and then ret in home/double_speed.asm, which assembles as $10 $00 $C9.)

Although it's not universal:

  • WLA-DX does not:

    WLA outputs only "$10" when it decodes "STOP". Often it's necessary to put an extra "NOP" ("$00") after a "STOP", and sometimes something else, but that's left entirely to the user.

@Rangi42
Copy link
Contributor

Rangi42 commented Nov 24, 2024

Going through all the points:

  1. STOP: I think we should keep the canonical length as 2 bytes, and briefly explain in the gbz80.7 docs why it's that way / link to Pan Docs for more info. I'm not sure about the "add an implicit NOP for safety" wording, because (a) other assemblers just treat it as two-byte $10 $00 period, and (b) it's reminiscent of the previous implicit-nop-after-halt implicit-ld-to-ldh behaviors that we deprecated and removed.
  2. Exclamation mark: Already moved in 0.9.0 release candidates / master docs.
  3. u3: Yeah, that's good to clarify somehow.
  4. HTML anchors: I don't know enough man page macro syntax to change this accident of HTML generation either.
  5. CPL: Agreed that mentioning "bitwise NOT" as another phrase for "complement" is good.
  6. LD r8,r8: This does do something if both arguments are the same, it's just a pointless/redundant thing do to. I agree we should mention the special cases of LD D,D and LD B,B in emulators.
  7. Optimizations: gbz80.7 is not the place for that. Maybe after 0.9.0 is out we can move the pokecrystal optimizations guide to gbdev, but it currently has some pret-specific tips (e.g. call _hl_ or call AddNTimes) that would have to be omitted in gbdev.

@ISSOtm
Copy link
Member

ISSOtm commented Nov 25, 2024

  1. I will repost my suggestions for stop:

The issue becomes finding the right wording. First off, since rgbasm always emits two bytes, I'd keep the canonical length at 2, and mention that the CPU can treat the instruction as 1-byte under specific circumstances, linking to Pan Docs. And finally, mentioning that for that purpose, stop <byte> syntax is available, with the default being equivalent to stop $00. How does that sound?

  1. I suggested bit_idx above. What do you think?

  2. HTML anchors (currently) cannot be influenced by man syntax, so we might as well just accept this as a fact of life. If only for now. (We discussed before moving the non-(1) man pages to Markdown documents, to be able to employ more formatting and make them easier to read.

  3. “It does something, that something is nothing” is a counter-intuitive enough concept to enough people that I think it's worth pointing out explicitly. (What about and a, or a, and sub a? What about xor a and sbc a, a?)

  4. I agree that gbz80.7 is not the place, but if you agree the general-purpose tips are good to move to gbdev.io, then we can open an issue there now to serve as a reminder. What do you think?


I see, this was still cached. I had no idea my browser's cache could live for so long!

Oh, you linked to the documentation for v0.8.0, not master. We don't retroactively update the documentation as a form of archival (indeed, not even typos), so that's why you were looking at that.


Ah! Because the HTML files were just static files in the repository where this issue was originally posted, I thought this was easily changed, though I did wonder how they even got there in the first place.

You have a point that we may want to point this out within rgbds-www; either in CONTRIBUTING.md, or in the README? I'm not such which is better.

All these suggestions are only about gbz80(7)'s manpage, so I'm not sure why to create a PR for just this? Is this because this is more debatable than the others?

I think a “improvements to instr docs” draft PR can be created already, and one or more commits pushed with the changes we have agreed upon thus far; this would let their wording be bikeshed in parallel to debating other changes here. Once more changes are agreed upon here, they can be added to that PR.

If [BGB's debug expr doc]'s in scope of the document.

Yeah, I think it's worth linking to it. For discovery purposes, if nothing else.

@Rangi42
Copy link
Contributor

Rangi42 commented Nov 25, 2024

Re: bit_idx, since the "8-bit signed offset" mnemonic is e8 (not offset or off), I'm fine with keeping u3 and documenting it as a "3-bit unsigned bit index".

I'm also fine with opening an issue to discuss moving the pokecrystal optimizations page. I could probably shorten the one on pret to just link to gbdev and explain the Pokemon-specific options.

@Rangi42 Rangi42 linked a pull request Nov 25, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This affects the documentation (web-specific issues go to rgbds-www) rgbasm This affects RGBASM
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants