Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

high-level design changes for Boot/NKRO switching #1305

Closed
tlyu opened this issue Nov 7, 2022 · 37 comments · Fixed by #1361 · May be fixed by keyboardio/KeyboardioHID#100
Closed

high-level design changes for Boot/NKRO switching #1305

tlyu opened this issue Nov 7, 2022 · 37 comments · Fixed by #1361 · May be fixed by keyboardio/KeyboardioHID#100
Labels
design-change Proposals that incur potentially breaking design changes

Comments

@tlyu
Copy link
Collaborator

tlyu commented Nov 7, 2022

What do you want to change?

Clarify the Boot vs NKRO functionality and implementation. This is mostly me thinking out loud and trying to capture a few things before possibly suggesting some more narrowly scoped changes.

  • Implement something similar to Demo of boot protocol working KeyboardioHID#50, but with the state variable maintained in the Kaleidoscope wrappers. This effectively puts the protocol switching entirely under user control.
  • Consider starting up with only the Boot Keyboard sending reports, and only turn on reports on the NKRO Keyboard if we receive an explicit Set Protocol request to turn on Report Protocol. This might help with BIOSes and KVMs that assume that a Boot Keyboard is always in Boot Protocol mode. This needs some investigation about what USB-capable OSes actually do. (Maybe some of them assume Report Protocol, as it is the HID default.)
  • Track key events on both the Boot Keyboard and the NKRO Keyboard, but have Kaleidscope select which ones to send reports on.
  • If we choose to use the host turning on Report Protocol as a signal to stop sending Boot Keyboard reports, we should only stop after all keys currently pressed on the Boot Keyboard have been reported as released. The host might consider them to be two completely independent keyboards that happen to reside at the same device address, so if the protocol switch happens while keys are held down, the key up events would never be sent on the Boot Keyboard, if we didn't make this change.
  • We might want to consider timeouts on sending HID reports, and shut down an interface if its reports are blocked from being sent. (This means the host isn't polling the endpoint, which might happen if it's a BIOS that only talks to Boot Keyboards, and the interface is the NKRO/Multi-report interface.)

Background

The host probably considers the Boot Keyboard and the NKRO Keyboard to be two independent keyboard interfaces (sub-devices) within a single USB device. (Technically, the NKRO Keyboard is a single report within a multi-report HID device interface.)

The Boot Keyboard only produces HID Boot Protocol reports, regardless of what protocol the host has requested with the Set Protocol request. This is fine, and probably best for backwards compatibility. The NKRO Keyboard only produces NKRO reports, regardless of what protocol the host has requested with the Set Protocol request. This is probably wrong from a HID compliance standpoint. (The NKRO Keyboard shouldn't even honor requests to switch to Boot Protocol, because it currently lacks code to produce Boot Protocol reports. Also, being a multi-report HID interface, it couldn't send any of its other reports, because Boot Protocol reports can't have Report IDs.)

Kaleidoscope has quirks that cause the behavior of these two keyboards to be interrelated, by selectively routing higher level method calls to them. Roughly, if the protocol variable in the Boot Keyboard is set to the Boot Protocol, whether by Kaleidoscope or the host, Kaleidoscope will stop updating the NKRO Keyboard with key events, and stop asking the NKRO Keyboard to send reports. Likewise, if the protocol variable in the Boot Keyboard is set to the Report Protocol, Kaleidoscope will stop updating the Boot Keyboard with key events, and stop asking the Boot Keyboard to send reports.

There is some additional background here

How will it make Kaleidoscope better?

It might make Kaleidoscope behave a bit better around host wakeups from sleep or hibernate states. Waking up from deeper sleep or hibernation seems likely to be a situation where the BIOS or preboot environment is active, before restoring OS runtime state to RAM from mass storage.

It might also improve compatibility with KVMs and docking stations, which might expect a Boot Keyboard to actually send key events, instead of all key events going over a multi-report HID interface.

What trouble might users have in adapting to the new functionality?

It's theoretically possible that some users might prefer to always start in NKRO mode, but that seems unlikely. Currently, the state variables to switch between Boot Protocol and Report Protocol also have the effect of deactivating the Consumer Control reports, but that can be changed. (Note that we still want to add timeouts, in case those reports also get blocked because of a BIOS that only talks Boot Protocol and isn't polling the multi-report interface.)

What trouble might developers have in adapting to the new functionality?

It will shift around some abstractions. Hopefully the changes will all improve readability and maintainability of the code.

Specific host behavior

FreeBSD

  • Existing keyboard behavior doesn't work on the default USB HID driver, but does with the non-default one
  • With the boot keyboard always sending reports, but marked as passing, works with both the default and non-default HID driver
  • Works with the hybrid boot keyboard
  • Default HID driver (uhid) only matches on boot keyboards; it might not look into multi-report interfaces, or it might prefer that an interface with both a mouse and a keyboard matches on the mouse. It has a special case for a boot keyboard without any usable keys: it will explicitly set Boot Protocol, and use a hardcoded report descriptor.
  • Non-default HID driver (usbhid) does look into multi-report interfaces
  • Default HID driver doesn't work with existing behavior, because the boot keyboard only sends reports when Boot Protocol is active, and this driver doesn't set Boot Protocol except for specific conditions that aren't met here. The result is that the driver will only listen to the boot keyboard, but the boot keyboard isn't sending anything because Boot Protocol isn't active. This also explains why the workaround of forcing Boot Protocol on FreeBSD succeeds.

OpenBSD

  • Existing keyboard behavior works
  • With the boot keyboard always sending reports that are marked as padding, does listen to the NKRO keyboard, but there are 250ms timeouts, because the driver sees the all-padding report descriptor for the boot keyboard, and decides to not even poll it. The boot keyboard reports time out, because its endpoint isn't being polled.
  • Works with the hybrid boot keyboard

macOS

  • Existing keyboard behavior works
  • Works with boot keyboard always sending reports, but marked as padding
  • Works with hybrid boot keyboard
  • Intel Mac preboot environment seems to work with all the above options

Windows 10

  • Existing keyboard behavior works
  • Works with boot keyboard always sending reports, but marked as passing
  • Works with hybrid boot keyboard
  • UEFI on a Lenovo IdeaPad 110-15ACL works with all of the above
@tlyu tlyu added the design-change Proposals that incur potentially breaking design changes label Nov 7, 2022
@gedankenexperimenter
Copy link
Collaborator

  • Track key events on both the Boot Keyboard and the NKRO Keyboard, but have Kaleidscope select which ones to send reports on.

Possibly worth considering: When I wrote an experimental version of Kaleidoscope a few years ago, I had it store only one NKRO report, regardless of which protocol it was using, and if Boot Protocol was being used, I just had it translate the report on the fly.

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 26, 2022

See #1314 for some Windows-related observations that might influence this design.

@theAlexes
Copy link

Hi all, per suggestion a in this link https://www.devever.net/~hl/usbnkro - it is theoretically possible to always send the boot protocol in the report, but mark it as padding in the HID descriptor, thus completely eliminating the need to switch protocols. Has anyone considered this implementation?

@obra
Copy link
Member

obra commented Nov 20, 2023 via email

@theAlexes
Copy link

We would be glad to test any experimental branches out on OS X (where the Keyboardio works fine) and FreeBSD, which currently seems to want a kernel option set in order to make the Keyboardio switch to HID protocol correctly.

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 20, 2023

One problem with prepending the boot protocol keyboard report to the bitmap report and marking it as padding is that some hosts or KVMs might want a boot keyboard to have a report of exactly 8 bytes. They might also check that the endpoint length is 8 bytes. (edit: this is either not an issue, or is already a problem with Model 01 and Atreus, because those advertise 64-byte packets for the boot keyboard endpoint)

Appendix B of USB HID 1.1 says that a boot device report may not exceed 8 bytes, which I can interpret to mean even in its non-boot report format.

I guess we could take the approach in https://static.wongcornall.com/ibm-capsense-usb-web/ibm-capsense-usb.html#x1-160003.3.2 and have both the boot keyboard interface and the NKRO (part of the multi-report interface) always produce reports, but the boot keyboard report is marked as padding and ignored by HID-aware hosts.

These approaches also increase bandwidth usage, due to sending two HID reports or a larger HID report on each key event. It might not adversely affect the total bus bandwidth, but it could increase key latency (but maybe not by a lot).

@obra
Copy link
Member

obra commented Nov 21, 2023 via email

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 26, 2023

This is an observation about HID standard conformance of NKRO. A reasonable reading of HID 1.1 Appendix C forbids the use of full key bitmaps for keyboards. For context, these requirements are not limited to Boot Keyboards.

Non-modifier keys must be reported in Input (Array, Absolute) items. Reports must contain a list of keys currently pressed and not make/break codes (relative data).

In practice, full bitmap NKRO seems to work in many host implementations, but KVM, etc. vendors who claim that our NKRO keyboard report is non-conforming might be technically correct. I'm not sure why that requirement is in there; maybe there were vendors who didn't want to implement reading a full bitmap keyboard for complexity reasons? There do seem to be enough other bitmap NKRO keyboards out there that this requirement is ignored in practice.

@obra
Copy link
Member

obra commented Nov 26, 2023 via email

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 26, 2023

Are there actual claims that the bitmap report is nonconforming or just KVMs (that themselves are doing possibly noncompliant things) that don’t work right?

I think I remember someone mentioning on Discord that their KVM vendor made claims like that. I don't remember the exact wording.

@obra
Copy link
Member

obra commented Nov 26, 2023 via email

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 26, 2023

@theAlexes I've linked the draft PRs that have a proof of concept fix, if you want to try them out. I'm definitely interested to hear how this behaves on FreeBSD.

Note you'll need to update both your KeyboardioHID and Kaleidoscope repositories. If you would prefer a firmware snapshot, I can probably do that, if you let me know your keyboard model. (I've only tested on Model 100 so far.)

@theAlexes
Copy link

theAlexes commented Nov 26, 2023

Preliminary results with these patches applied are promising! The keyboard is usable without adjusting the loader parameter, making it much easier for us to set the loader parameter in the first place. :) This did change something about how the keyboard identifies to Wayland making our per-keyboard customization not work, though (not a major breaker, though).

For reference, the old firmware appears in dmesg on freebsd 13.2-RELEASE without the tunable set as so:

ugen0.3: <Keyboardio Model 01> at usbus0
umodem0 on uhub0
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
umodem0: data interface 1, has no CM over data, has break
ukbd1 on uhub0
ukbd1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
kbd3 at ukbd1
uhid0 on uhub0
uhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1 on uhub0
ums1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1: 8 buttons and [XYZT] coordinates ID=1

While with the draft PRs checked out it enumerates/attaches as so:

ugen0.3: <Keyboardio Model 01> at usbus0
ukbd1 on uhub0
ukbd1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
kbd3 at ukbd1
uhid0 on uhub0
uhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1 on uhub0
ums1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1: 8 buttons and [XYZT] coordinates ID=1
umodem0 on uhub0
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
umodem0: data interface 1, has no CM over data, has break

we'll update later with what it looks like with the tunable set.

@theAlexes
Copy link

With the tunable set, the old keyboard and new keyboard both show up with the same dmesg:

ugen1.5: <Keyboardio Model 01> at usbus1
umodem0 on uhub3
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
umodem0: data interface 1, has no CM over data, has break
usbhid0 on uhub3
usbhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus0: <HID bus> on usbhid0
hkbd0: <Keyboardio Model 01> on hidbus0
kbd2 at hkbd0
usbhid1 on uhub3
usbhid1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus1: <HID bus> on usbhid1
hms0: <Keyboardio Model 01 Tablet> on hidbus1
hms0: 8 buttons and [XYW] coordinates ID=0
usbhid2 on uhub3
usbhid2: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus2: <HID bus> on usbhid2
hms1: <Keyboardio Model 01 Mouse> on hidbus2
hms1: 8 buttons and [XYWH] coordinates ID=1
hsctrl0: <Keyboardio Model 01 System Control> on hidbus2
hkbd1: <Keyboardio Model 01> on hidbus2
kbd3 at hkbd1
hcons0: <Keyboardio Model 01 Consumer Control> on hidbus2

The other weird corner case that wasn't quite working was trying to run OpenBSD on a macbook pro. Let me try that one too.

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 27, 2023

Thanks for the detailed updates!

With the tunable set, the old keyboard and new keyboard both show up with the same dmesg:

Does this tunable enable some non-default kernel drivers that know how to deal with multi-report HID interfaces?

@theAlexes
Copy link

Yeah, that looks to be what the mentioned tunable does on FreeBSD.

It looks like the previous arrangement, however, works better on OpenBSD — refreshing our memory, it was the built in macbook keyboard that wasn't working there.

OpenBSD seems to prefer the previous report structure — a keyboard flashed with the previous release enumerates as follows (typed by hand, as the machine at hand isn't connected to the network yet):

uhidev0 at uhub0 port 6 configuration 1 interface 2 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/0
uhhid at uhidev1 not configured
uhidev2 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev2: iclass 3/0 8 report ids
uhid at uhidev2 reportid 1 not configured
uhid at uhidev2 reportid 4 not configured
uhid at uhidev2 reportid 5 not configured
ukbd1 at uhidev2 reportid8
wskbd1 at ukbd1 mux 1
wskbd1: connecting to wsdisplay0
"Keyboardio Model 01" rev 2.00/1.00 addr 3 at uhub6 port 6 configuration 1 not configured

Despite the 'not configured' lines, this all works fine.

The new firmware gives the following, similar except for the "no usable key codes array" error leading it to attach to a different keyboard. It also sort of works as a console keyboard but the keystrokes are laggy and typing quickly mixes them up. Macros don't suffer from keystrokes getting mixed up, but they do lag heavily. I'm also unsure if openbsd has a similar tunable to load similar drivers.

uhidev0 at uhub0 port 6 configuration 1 interface 2 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0: no usable key codes array
uhidev1 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/0
uhhid at uhidev1 not configured
uhidev2 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev2: iclass 3/0 8 report ids
uhid at uhidev2 reportid 1 not configured
uhid at uhidev2 reportid 4 not configured
uhid at uhidev2 reportid 5 not configured
ukbd1 at uhidev2 reportid8
wskbd0 at ukbd1: console keyboard, using wsdisplay0
"Keyboardio Model 01" rev 2.00/1.00 addr 3 at uhub6 port 6 configuration 1 not configured

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 27, 2023

Thanks for the detailed updates! The OpenBSD results are very interesting.

It also sort of works as a console keyboard but the keystrokes are laggy and typing quickly mixes them up. Macros don't suffer from keystrokes getting mixed up, but they do lag heavily. I'm also unsure if openbsd has a similar tunable to load similar drivers.

Would you say the delay is about a quarter of a second per keystroke?

My current hypothesis is that OpenBSD sees that the boot keyboard is sending only padding, so is giving up on polling that endpoint entirely, causing the keyboard firmware to time out (250ms each) on sending those reports. This is kind of the opposite of the expected failure mode with these changes. My expectation was that by sending both the multi-report and the boot report, hosts that don't set boot protocol will cause the keyboard to time out sending the non-boot reports.

@theAlexes
Copy link

Would you say the delay is about a quarter of a second per keystroke?

yeah, it's approximately that.

OpenBSD

Looking deeper and comparing hidkbd.c against the dmesg, it looks like it didn't attach the boot keyboard and did attach the multi-report keyboard. Weird.

@tlyu
Copy link
Collaborator Author

tlyu commented Nov 30, 2023

@theAlexes could you please try the new PRs (hybrid boot keyboard) linked above? They use an alternative approach (boot and NKRO reports on the same endpoint) that might work better with OpenBSD. I would appreciate hearing your feedback.

@theAlexes
Copy link

@tlyu preliminary results: this approach works well on the FreeBSD machine without the tunable and the OpenBSD MacBook, both at the loader and once the OS has booted. Going to go grab the FreeBSD machine that has the tunable enabled.

@theAlexes
Copy link

@tlyu FreeBSD with the usbhid tunable enabled also works! Unfortunately we don't have any KVM switches to test with, but this approach at least satisfies our use cases.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 1, 2023

@theAlexes thanks for the testing! That is very helpful. It confirms a few things that I had deduced from source inspection. I'll update the issue description with some details of OS-specific things that we've discovered so far.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 1, 2023

Updated description with host-specific details.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 1, 2023

Updated PRs so they implement switching of whether the boot keyboard sends only Boot Protocol. Appears to work with preliminary testing.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 2, 2023

My summary of the current situation: we have an approach (hybrid boot keyboard) that leads to strict improvement in multiple "hard" platforms (FreeBSD, OpenBSD), and no known regressions yet. How do we want to expose this for more testing? I can clean up the patches to a reviewable condition in maybe a week or less.

@theAlexes
Copy link

just updated our quiet-click model-01. we'll run 'em through their paces on the machines and OSes in our possession — one we haven't tried was Haiku. looking forward to it :)

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 3, 2023

@gedankenexperimenter thank you for your suggestion of making the NKRO report canonical, and constructing the boot report from it as needed. It wound up being simpler than trying to maintain the two reports independently.

@theAlexes
Copy link

theAlexes commented Dec 3, 2023

@tlyu the changes as of yesterday work fine on FreeBSD (default), FreeBSD (usbhid), OpenBSD, and OS X preboot. We weren't able to get Haiku running, but we did try it on the North Coast Synthesis "Gracious Host", a boot-protocol-only implementation: https://northcoastsynthesis.com/products/msk-014-gracious-host.html — it works great there too.

One request in the boot-protocol switch code, though: Can you use B instead of 6 for Boot Protocol, because our layout doesn't have numbers in the top layer (we use TopsyTurvy to make the numbers appear under Shift)

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 4, 2023

@tlyu the changes as of yesterday work fine on FreeBSD (default), FreeBSD (usbhid), OpenBSD, and OS X preboot. We weren't able to get Haiku running, but we did try it on the North Coast Synthesis "Gracious Host", a boot-protocol-only implementation: https://northcoastsynthesis.com/products/msk-014-gracious-host.html — it works great there too.

Thanks for the extensive testing! Did you not get Haiku running at all, or did you have problems getting it to talk to the keyboard?

One request in the boot-protocol switch code, though: Can you use B instead of 6 for Boot Protocol, because our layout doesn't have numbers in the top layer (we use TopsyTurvy to make the numbers appear under Shift)

Yes, I could do that. I could even make it compile-time configurable.

@theAlexes
Copy link

Haiku wouldn't boot on either of the victim machines, and then we realized we had an even more cursed USB host than that :)

Thanks for the willingness to experiment — I realize this is a "creative violation of the letter of the spec", but it definitely solved the FreeBSD pain point we were having with this keyboard.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 4, 2023

Updated the Kaleidoscope patch to default to B instead of 6 for boot protocol. Also made it configurable, and documented the mode indicator behavior.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 4, 2023

"creative violation of the letter of the spec"

I'm not so sure about that. The main thing of the current approach that might violate the letter of the spec is the bitmap NKRO report, which we (and many others) were already doing. A device detaching and re-enumerating with different descriptors is almost indistinguishable (from a protocol level) from it being physically unplugged and replaced with a different device. I guess you could argue that the VID/PID should be different when that happens, but I'm not sure that's explicitly required.

Relatedly, Windows 10 persists cached device descriptors across hibernation and bus resets, at least on some USB3 ports, if the device hasn't been physically removed (or maybe is merely at the same physical port), and has the same device descriptor as before.

@theAlexes
Copy link

Okay, we've tried one more cursed device, the BMOW ADB/USB Wombat, which is closed-source, supposedly contains a HID parser, but still, even with this patch, requires us to manually switch the keyboard to boot protocol only. I think this is because it attaches to the first keyboard it sees, and is thus pulling an OpenBSD, ignoring the "padding" in the boot report device, concluding that it doesn't have any keys, and not trying any of the other endpoints.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 6, 2023

Thanks for the additional test!

thus pulling an OpenBSD, ignoring the "padding" in the boot report device, concluding that it doesn't have any keys, and not trying any of the other endpoints

The new hybrid report has the boot report array marked as padding, with a bitmap report after. A complete HID implementation should know what to do with it. From the vendor page at https://www.bigmessowires.com/usb-wombat/ it looks like it's using an unspecified Microchip HID library with possibly limited report descriptor parsing capabilities. It might not know how to deal with bitmap key reports at all.

@theAlexes
Copy link

ah, right, we were confusing it with the previous patch. I'll poke the BMOW fellow and see if he can release a firmware that requests boot protocol.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 6, 2023

Another possibility is that it could be fully understanding the report descriptor, but only reading the first 8 bytes of the report for some reason, despite it being declared as a longer report.

@tlyu
Copy link
Collaborator Author

tlyu commented Dec 15, 2023

For anyone watching, I updated #1361 so it no longer requires the separate KeyboardioHID change, now that KeyboardioHID is merged into Kaleidoscope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design-change Proposals that incur potentially breaking design changes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants