Autocomplete/autocorrect [feature-request] #710
Comments
I cannot see the onboard memory, or the processor, coping with that. |
Frankly, I am not optimistic about this either. Most likely this will end up on a "feel free to go on and mess with it yourself" basis. Also, I consider this a problem that is better dealt with by software solutions, partly because the keyboard does not have access to the text buffer, so it would be based entirely on scancode-based guesswork, and partly because memory is indeed a bit of a problem. How big are the lookup tables we are talking about? Could you suggest a specific macro syntax that would control this? Do you envision a user-defined typo dictionary, or a predefined English dictionary? |
The idea came to me while I was reading about steno and imagining what parts of it could be made possible with the hardware that I already own. Of course, in practice, to use steno it would be better to just run Plover and configure the UHK for that, but the idea of the hardware itself being able to enhance the user's accuracy is nice. Considering that the 200 most common words cover a bit over 50% of the average text and the most common 1000 words

A possible approach would be to store that dict as a sorted list.

In the 2nd and 3rd cases there would be no more errors to spare the user: either every other key matches, in which case we can fix the typo at the beginning of the word, or not, in which case the input is passed through as the user typed it. The 1st case restricts the search space significantly (to about 1/26) and we would do a recursive search...

That, I believe, would be an approach feasible with the hardware. A thousand words with a separator byte each would consume roughly 6 kB, but we could do better by restricting the scancodes we allow into the dict: if we were to use only the 26 letters, we could fit three letters into two bytes. That aside, of course, we would need to store the indexes for the lookup, that is, where the section of words starting with any two scancodes starts and ends. Those indexes cannot go much deeper than that, since with two scancodes they would already use a large amount of memory, but the search beyond that depth can be done linearly, considering that it would cover around 40 words.

The key part is that the dictionary doesn't need to be that big to be effective. Common words that are rarely mistyped could be omitted, as well as words that are not that common. The search, if we assume the user didn't make more than one error, can be made relatively fast.
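To make the "three letters into two bytes" arithmetic concrete: 26 letters give 26³ = 17,576 combinations, well under the 65,536 values of two bytes. Below is a minimal sketch of such a packing, written in Python for readability rather than as firmware code. The padding code `0`, the use of the spare top bit as an end-of-word flag (replacing the separator byte), and the function names are all assumptions made for illustration.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
BASE = 27            # 26 letters plus 0 as padding; 27**3 = 19683 < 2**15
LAST_GROUP = 0x8000  # assumption: spare top bit marks the final group of a word


def pack_word(word):
    """Pack a lowercase a-z word into two bytes per group of three letters."""
    codes = [ALPHABET.index(ch) + 1 for ch in word]
    codes += [0] * (-len(codes) % 3)  # pad to a multiple of three
    out = bytearray()
    for i in range(0, len(codes), 3):
        a, b, c = codes[i:i + 3]
        value = (a * BASE + b) * BASE + c
        if i + 3 == len(codes):
            value |= LAST_GROUP  # flag the end of the word
        out += value.to_bytes(2, "big")
    return bytes(out)


def unpack_words(data):
    """Decode a concatenation of packed words back into a list of strings."""
    words, letters = [], []
    for i in range(0, len(data), 2):
        value = int.from_bytes(data[i:i + 2], "big")
        last = bool(value & LAST_GROUP)
        value &= LAST_GROUP - 1  # strip the flag bit
        group = [value // (BASE * BASE), value // BASE % BASE, value % BASE]
        letters += [ALPHABET[c - 1] for c in group if c]  # skip padding
        if last:
            words.append("".join(letters))
            letters = []
    return words
```

In this layout a three-letter word takes 2 bytes instead of 4 (three letters plus a separator), and a seven-letter word takes 6 bytes instead of 8.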
It is, for sure, something that could be done more easily in software, but it is something that would be nice to have in the keyboard, especially because it makes it fairly easy to differentiate between words that the user wants spell-checked/completed and words they don't, by using keys that do or don't trigger that process (e.g. the right and the left space). It is possible to emulate this today using gestures, but that is way too clunky to be of any use, requiring the user to register every possible error.

Another possible implementation would be to store the list of sequences as a tree, where the node for "the" would lead to a node terminating the word, a node for "y" followed by the terminator, or a node for "re" followed by a terminator... That approach offers us a way to avoid having to store subsequences, but introduces some overhead in the form of pointers, each node being composed of a chunk of the sequence, the pointer to its first child and the pointer to its next sibling. That way, the search would traverse the nodes where the first code matches, where the first didn't but the second matched, and where the first and the second are swapped.

Lastly, scrapping the idea of an auto-correct feature, implementing only the snippet part would be somewhat more feasible. That is, have the user provide the matching text and the text to be produced by it. Something akin to:
The idea to allow the user to use it so they can have

Lastly, I'd like to take a moment to praise the keyboard. It has earned its name as ultimate. I cannot think of any feature that other keyboards have that the v2 doesn't have or can't emulate. And I don't think I'll ever want to change it for anything short of a UHKv3. |
|
Your thinking is correct. I'd just have a starting/restarting trigger so that the user can signal that what was typed does not need to be corrected and that capturing for completion can start over. The idea is to have only words that are common enough to justify being completed, like using 'th' for "the" and 'the' for "they"... Meaning that most of the time, preferably always, they have a single match.
The idea was to have the dictionary defined and static between configuration updates, like a "big" file uploaded to the keyboard by the Agent, which would be consumed while holding as little as possible in RAM, just what's needed to actually traverse the file. That could be something appended to the firmware; that would make it harder to update, but if it allowed us to use much more memory it would be preferable. If we consider using those 100kB of ROM, that could have a large amount of indexing baked in: have it as an array of words, and a large pre-baked search tree whose leaves point into the array. Having it baked by the Agent frees up much of the processing as well...
Now, that's the question. I have little to no experience with C, so surely the code that I'd produce would be unfit to merge into the mainline. But I think I will take some time and do a proof of concept... And even for that I believe I'll be needing some help... |
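As a rough illustration of the "array of words plus pre-baked search tree" layout, here is a minimal sketch in Python. It assumes one letter per node (not multi-letter chunks), a first-child/next-sibling tree stored in flat parallel arrays, and a hypothetical `NONE` marker. `bake` stands in for what the Agent would do ahead of time, and `lookup` for the traversal the keyboard would run while keeping only a single node index as state. None of these names exist in the firmware.

```python
NONE = -1  # hypothetical "no node / no word" marker


def bake(words):
    """Build a sorted word array and flat first-child/next-sibling tables."""
    words = sorted(set(words))
    # Node 0 is the root; the four lists are indexed by node number.
    letter, child, sibling, word_at = [""], [NONE], [NONE], [NONE]
    for index, word in enumerate(words):
        node = 0
        for ch in word:
            # Walk the sibling chain under `node` looking for `ch`.
            prev, cur = NONE, child[node]
            while cur != NONE and letter[cur] != ch:
                prev, cur = cur, sibling[cur]
            if cur == NONE:
                # Not found: append a new node and link it in.
                cur = len(letter)
                letter.append(ch)
                child.append(NONE)
                sibling.append(NONE)
                word_at.append(NONE)
                if prev == NONE:
                    child[node] = cur
                else:
                    sibling[prev] = cur
            node = cur
        word_at[node] = index  # terminal node points into the word array
    return words, (letter, child, sibling, word_at)


def lookup(tables, typed):
    """Return the index of `typed` in the word array, or NONE if absent."""
    letter, child, sibling, word_at = tables
    node = 0
    for ch in typed:
        node = child[node]
        while node != NONE and letter[node] != ch:
            node = sibling[node]
        if node == NONE:
            return NONE
    return word_at[node]
```

For example, after `words, tables = bake(["they", "the", "there"])`, the call `lookup(tables, "they")` yields the position of "they" in `words`, while `lookup(tables, "th")` yields `NONE` because no word ends at that node.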
I think I can provide that. |
This might not be feasible, but it would be nice if we could implement, via macro, a lookup based on the last few scancodes sent.
The use case would be:
Given a trigger, the scancodes would be stored.
Given the trigger again, those scancodes would be used to look up a table of registered typos. The lookup would produce the desired scancode sequence to be sent instead of the one that was produced by the user's typing.
With the space bar as that trigger, this could be used to autocorrect a set of typos in a somewhat fluid way.
Alternatively, keeping a list of valid scancode sequences and sending the one that is most similar to the one produced by the user would allow dumping in a large dictionary of words without the need to actually register the typos that could happen while typing them.
Likewise, the trigger could be set up so that matching considers only as many scancodes as the user produced, that is, a prefix match. That way the user could autocomplete words.
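The three lookups described above (registered typo table, closest valid sequence, prefix completion) can be sketched as follows. This is a minimal illustration in Python, using characters in place of scancodes. It assumes "most similar" means at most one substitution or one adjacent swap, and that a correction is sent as backspaces followed by the replacement, since the original keys have presumably already reached the host. All function names are hypothetical.

```python
BACKSPACE = "\b"


def one_error_apart(typed, word):
    """True if `typed` differs from `word` by one substitution or one adjacent swap."""
    if len(typed) != len(word):
        return False
    diff = [i for i in range(len(word)) if typed[i] != word[i]]
    if len(diff) == 1:
        return True
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and typed[diff[0]] == word[diff[1]]
            and typed[diff[1]] == word[diff[0]])


def correct(typed, typos, words):
    """Resolve a captured sequence: registered typo first, then a unique near match."""
    if typed in typos:
        return typos[typed]
    if typed in words:
        return typed
    near = [w for w in words if one_error_apart(typed, w)]
    return near[0] if len(near) == 1 else typed  # ambiguous: pass through


def complete(typed, words):
    """Prefix variant: expand only when exactly one word starts with `typed`."""
    matches = [w for w in words if w.startswith(typed)]
    return matches[0] if len(matches) == 1 else typed


def on_trigger(typed, replacement):
    """Keys to send when the trigger fires: erase what was typed, then retype."""
    if replacement == typed:
        return ""
    return BACKSPACE * len(typed) + replacement
```

With `typos = {"teh": "the"}` and `words = ["the", "they", "because"]`, the sequence "teh" resolves through the typo table, "tehy" resolves to "they" through the adjacent-swap rule, and "bec" completes to "because". Passing anything ambiguous through unchanged keeps the behaviour predictable for the user.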