That's an interesting write up, thank you for sharing. Below are my 2c:

It seems to me that it can take quite a while until a personalized dictionary generated from user input is ready for use, and it likely won't ever be definitive because every day we may find ourselves using a few new words instead of some very old ones. Surely, most of the dictionary words won't ever be used by any given person, and using a generic dictionary creates a lot of "word noise", but most people would prefer something functional out-of-the-box over the hassle of creating a personalized dictionary.

Every so often I reset my keyboard dictionary because I don't want personally identifiable information like my bank username/email addresses to be suggested as dictionary words. So there is also a privacy benefit to using a generic dictionary.

Proprietary keyboards like GBoard likely collect telemetry to create a dictionary with sensible word frequencies, which is critical for predictions and gesture typing. HeliBoard gathers no information from its users, so it will be hard to match that level of accuracy.

If you have suggestions for improving the spell checker (as opposed to the dictionary), please let me know; I've been trying to fine-tune it in #613.

P.S. regarding the toolbar: Is the current state of the toolbar really such a big turn-off?
Contents
UNDER CONSTRUCTION!!!
I’m revising this document into one single file with TOC, images where beneficial, and attempting to make it more readable while expanding where necessary.
To developers: I’ve decided to leave this unfinished. It seems a waste of my time & energy. If there are any developers interested then I’ll put the effort in but until then I’ll just leave it as is.
To users: this information can benefit you and help you create your own custom dictionary if you are finding that you’re currently unhappy with the state of text prediction or gesture typing. Best of luck!
Intro
Android keyboards, after all these years, are still, IMO, not very good. This document started because I wanted to make HeliBoard gesture typing function better. Doing the DIY Custom Dictionary has made HeliBoard one of the best gesture typing keyboards, but it is lacking in many ways, just as most Android keyboards are. Therefore I began working on this document, but I honestly think it has fallen on deaf ears.
Dictionary
AOSP Dictionaries
AOSP dictionaries are an overcomplicated word list, not an actual dictionary, that is compiled into a binary .dict file.
The dictionary is without a doubt the most important item that influences typing — the larger the dictionary the worse text prediction & gesture typing become, the smaller the dictionary the better they become! It’s that simple!
Word List vs Dictionary
TT9 dictionary
Text vs Binary
It would be wise if keyboards utilized text based word lists instead of binary files — allowing for user configuration of word lists that could be imported, no developer maintenance, ability to delete words from the dictionary instead of blacklisting, etc…
Word List Structure
Word Ranking
The “word frequency” or “ranking” is set in the dictionary with “f=” followed by a number from 0 to 255. I’ve found that dictionaries that use word frequency lists to set a ranking for each word, such as those of Openboard & HeliBoard, actually create conflict in the prediction system. The AOSP keyboard, and therefore Openboard/HeliBoard, develops an internal word ranking based on the individual user’s input. Once you understand this, it becomes clear that using high ranking numbers for words in the dictionary adds no real long-term benefit to the user. In fact, doing so causes conflict and significantly worse word prediction.
Second, “f=0” signals to the keyboard to not suggest the word unless the word is explicitly typed or gestured to within a certain proximity for the spelling correction to assume you meant that word. This allows a developer to “hide” words and decrease the overall “word noise” of a large dictionary, therefore allowing the internal ranking to have even better clarity for prediction and gesture typing. Furthermore the f=0 still allows the word to be available to the spell checker in the system.
Therefore I would recommend users rank a significant portion of the dictionary at 0, rank more common words (levels 1-5) above f=1, and rank less common words (level 6b,c) at f=1. I see no reason to rank any word in the dictionary above 15. Allow the algorithm to do its job and develop its own internal word list, with proper word ranking based on the user’s actual use instead of what’s been vacuumed off the internet.
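To make the recommendation concrete, here is a minimal sketch of group-based ranking in Python. The level-to-f mapping, the level labels, and the simplified `word=...,f=...` line syntax are all assumptions for illustration; check the exact syntax against an existing word list before compiling anything.

```python
# Hypothetical level -> f mapping following the recommendation above:
# common levels get low ranks above 1, level 6 gets 1, the rest get 0.
LEVEL_RANK = {1: 15, 2: 12, 3: 9, 4: 6, 5: 3, 6: 1}
HIDDEN_RANK = 0  # still known to the spell checker, but not suggested


def rank_for_level(level):
    """Return the f value for a level; unknown levels are hidden at 0."""
    return LEVEL_RANK.get(level, HIDDEN_RANK)


def to_wordlist_lines(words_by_level):
    """Render simplified `word=...,f=...` lines, highest rank first."""
    lines = []
    for level in sorted(words_by_level, key=rank_for_level, reverse=True):
        f = rank_for_level(level)
        for word in sorted(words_by_level[level]):
            lines.append(f"word={word},f={f}")
    return lines


if __name__ == "__main__":
    sample = {1: ["the", "of"], 6: ["zeal"], "rest": ["zymurgy"]}
    print("\n".join(to_wordlist_lines(sample)))
```

Nothing in the mapping goes above 15, so the keyboard's own learned ranking should have room to take over.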
bigram
What is this “bigram=” that’s in these dictionaries? This is a next word suggestion. There can be any number of them for any given word; however, for AOSP based keyboards there is no point in having more than 3, since there are only 3 columns for next word suggestions.
Is tripling the size of the dictionary with these next word suggestions, and complicating the development of dictionaries, worth it? I say no! The reason is, just like word ranking, the keyboard has internal algorithms that develop next word suggestions based on the actual use of each individual user. So why have it in the dictionary? It only offers a “head start” or a “guess” based on popular word combinations, especially with famous people’s names. However, just like word ranking, there’s a problem created when including bigrams in the dictionary. That problem, once again, is more “word noise” based on patterns that have nothing to do with you. Each bigram is seen as a word, and apparently a significantly high ranking word, by the keyboard. Also, this next word suggestion will always fight with your own typing, just like the word rankings in the dictionary do. From my research, and from using Swype & Gboard over the years, this is yet another unnecessary element in the dictionary. If the developers had tons of resources to develop a better contextual engine then MAYBE adding such elements would pay off; however, as it stands it’s better to allow the internal algorithms to do their job and strip the dictionaries of everything possible.
Furthermore the “bigram” is unnecessary as, again, the keyboard does this internally and adding it to the dictionary adds to “word noise” and conflict with the prediction algorithm.
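Stripping the bigram entries from a text word list is a one-line filter. The sketch below assumes one entry per line and that bigram entries begin with `bigram=` after optional indentation; the sample lines and their f values are made up for illustration.

```python
def strip_bigrams(lines):
    """Drop every line that starts with `bigram=` after indentation."""
    return [line for line in lines if not line.lstrip().startswith("bigram=")]


if __name__ == "__main__":
    sample = [
        "word=new,f=120",
        " bigram=york,f=200",
        " bigram=year,f=180",
        "word=keyboard,f=90",
    ]
    for line in strip_bigrams(sample):
        print(line)
```

The word entries pass through untouched, so the same filter can be run before or after adjusting the f values.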
Tags
Finally, all the tags and “possibly offensive” bloat is just that, useless bloat that adds difficulty to creating and maintaining dictionaries.
Further research
Right now the research that needs to be done is testing compiled dictionaries with very low rankings — putting all words at 0 or all at 1, etc., and splitting them between 0 and 1. I’ve put together a few of those tests in the dict folder — the files just need to be compiled to .dict and tested.
Those tests along with testing up to f=15 for highest level words would be wise. That should clarify if there’s any real difference at that level or if it’s just better to stay with 0 & 1 for all ranking — certainly would make maintenance easier.
From my tests using the personal dictionary — remember I almost always gesture type — I’ve found that the ranking is pretty irrelevant. Starting with a clean install and all words set to 0 behaves the same, as far as I can tell, as using 0-15 to rank the words. I know going above 15 from other tests that it’s likely words ranked that high start to be seen as important and I wouldn’t suggest ranking words that high but I could be wrong. 🤷
So far I’m getting significantly better and more accurate gesture typing by using a smaller dictionary set to 0 and/or 1, however, using the personal dictionary for testing has its limits.
The Word lists
I’ve created several word lists based on the idea of ranking words in groups instead of using “word frequency” lists.
These lists will make more sense once you’ve read about word frequency & ranking.
Notice
The intent of these word lists is to be “incomplete”: to make more focused word lists for keyboard text prediction & gesture typing, not comprehensive “spell checking” word lists. A smaller word list leads to better text prediction and significantly better gesture typing; the smaller the word list used, the better the gesture typing & text prediction will be!!! This is an easily provable fact. The issue is, how small is too small, and can we mitigate the need to add “new” words?
Word List Folders
12dicts
This folder contains its original Readme. The word lists contained have been cleansed of artifacts, though there may be some I missed. This is a good dictionary, though a little old (2016).
Levels
The Levels folder contains word lists I’ve created. They are based on the idea of using basic words as higher levels in the creation of text prediction dictionaries.
The naming of the files in the level folder is simple — they are numbered based on priority, lower number equals more common or “base” English words, the ending number followed by “k” is approximately how many words there are in that list.
dict
The dict folder has lists ready to be compiled into .dict files. As I’m doing all of this on an Android phone I’m unable to carry out the compiling. These are for testing the word ranking in dictionaries, to see if there are any peculiar behaviors with the keyboard when ranking is significantly restricted. My testing with the personal dictionary suggested that ranking everything at f=0 may cause some initial issues with gesture typing; however, as the keyboard builds its own internal list, things settle out very well, with exceptional word prediction and gesture typing. Note that I was only able to test 25k words due to import limitations with Multiling O Keyboard into the system personal dictionary.
HeliBoard
The HeliBoard folder contains files from the HeliBoard dictionary repository that I’ve been working with and editing for testing purposes, nothing worth using other than the empty.dict file.
Misc
This is a junk folder, a bunch of lists I used or bits of lists leftover.
Individualized Dictionary
This is the main purpose of these word lists and this repository: to help users build their own individualized dictionary and, hopefully, get developers to see there’s a better way of creating & maintaining word lists.
Let’s first start with a simple experiment.
From a fresh installation of Gboard or HeliBoard I typed for a while ensuring that I didn’t use any words starting with the letter ‘k’ except for the word ‘keyboard’.
After typing the word ‘keyboard’ many times you would think the system would recommend the word ‘keyboard’ as soon as I typed the letter ‘k’ but what actually happened is, I’m consistently given the recommendations of ‘key’, ‘keep’, ‘known’, etc. Sometimes I get some different words starting with ‘k’ but I don’t get ‘keyboard’ until I type ‘keyb’ and only then is it suggested.
Why isn’t the predictive text giving me the word I use, and instead offers me words I haven’t used at all?
Simply put, it’s because the words in the dictionary are given a “ranking” and that ranking drives the text prediction instead of your actual use driving the text prediction.
Here’s the ranking of words that keep getting suggested over keyboard, from the HeliBoard main en_US dictionary:
The ‘f=’ number is the ranking. This ranking is from 0 to 255, higher being “more important” but according to whom?
As you can see from the rankings, ‘keyboard’ is ranked lower than ‘known’, ‘keep’, ‘know’, and ‘key’. And that’s exactly why, regardless of your actual use of each of these words, certain ones are constantly suggested before others. This is why AOSP keyboard and Gboard have always sucked at tap/touch typing and you, the user, have to type so many more letters to get to the words you actually use than you would with other keyboards. 🤦🤦🤦🤦
Where did these developers come up with these “rankings” and how are they related to you in any meaningful way?
Google, and other companies, scrape the internet and collect all the text spewed out by companies, bloggers, governments, etc. and run that text through a word frequency counter. This tells them how often a particular word appears within a particular “data set.” Fancy isn’t it? Then they apply an algorithm that converts a given word’s frequency number into a rank number between 1 & 255. Ta-da!
That’s the simple explanation and it’s good enough for our needs. So, how does this relate to your writing? In actuality it relates very little. Sure, there are a few hundred “base level” words like ‘of’, ‘the’, ‘to’, etc. that will correlate, but if you spend a little time going through any of these frequency lists or dictionaries with word rankings applied you’ll quickly realize, not far down the list, that most of the words are ones you probably never or rarely use, and that much lower ranked words might be words you use frequently. Therefore these “rankings” mean nothing to you, but the developers are forcing them on you! This causes the dictionary to drive word prediction instead of the keyboard’s own internal algorithm. It really is one of the stupidest designs I’ve encountered, and the fact they are still using it shows how blind people can become.
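The frequency-to-rank conversion described above might look roughly like the following. This is only a sketch: the log scaling is my assumption, not the actual algorithm used by Google or AOSP, and the tiny corpus is invented.

```python
import math
from collections import Counter


def frequency_to_rank(count, max_count, lowest=1, highest=255):
    """Map a raw corpus count onto lowest..highest using a log scale."""
    if count <= 0 or max_count <= 0:
        return lowest
    scaled = math.log(count + 1) / math.log(max_count + 1)
    return lowest + round(scaled * (highest - lowest))


# Invented stand-in for "text scraped off the internet".
corpus = "the cat and the dog and the keyboard".split()
counts = Counter(corpus)
top = max(counts.values())
ranks = {word: frequency_to_rank(n, top) for word, n in counts.items()}

if __name__ == "__main__":
    for word, rank in sorted(ranks.items(), key=lambda item: -item[1]):
        print(word, rank)
```

Whatever the real formula is, the point stands: the rank reflects how often a word appears in someone else's corpus, not in your own writing.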
I’ve tested this as thoroughly as I can.
By using Multiling O Keyboard and importing words into the personal dictionary and using HeliBoard, using a language with no installed dictionary, I was able to test how the keyboard behaves when word ranking is “neutralized.” I would set all words to f=0 or f=1 or a range 0 to 15 but all low rankings.
What happened? I started getting the correct word prediction based on my actual use. In the same test using ‘keyboard’ it would appear immediately after typing the letter ‘k’ — I no longer got the recommendations based on the dictionary’s peculiar word rankings.
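The ‘keyboard’ experiment can be mimicked with a toy model. The scoring below (dictionary rank first, learned usage second) is an assumption chosen to reproduce the observed behavior, not HeliBoard's actual scoring, and the f values are invented.

```python
def suggest(prefix, dictionary_rank, user_counts, limit=3):
    """Return up to `limit` words starting with `prefix`.

    Words are ordered by dictionary rank first and learned usage second,
    mimicking a dictionary-driven keyboard in this toy model.
    """
    matches = [w for w in dictionary_rank if w.startswith(prefix)]
    matches.sort(
        key=lambda w: (dictionary_rank[w], user_counts.get(w, 0)),
        reverse=True,
    )
    return matches[:limit]


# Invented f values; only the relative order mirrors the text above.
ranked = {"know": 200, "keep": 190, "key": 180, "known": 170, "keyboard": 120}
flat = {word: 1 for word in ranked}  # ranking "neutralized"
usage = {"keyboard": 40}             # the only 'k' word actually typed

if __name__ == "__main__":
    print(suggest("k", ranked, usage))  # dictionary rank dominates
    print(suggest("k", flat, usage))    # actual use decides
```

With the ranked dictionary, ‘keyboard’ never reaches the three suggestion slots; with the flat one, it comes first.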
So, one problem solved. And it’s an easy fix but will developers do it?
Dictionary Size Matters
The second problem that was obvious to me, for years, is that most keyboard dictionaries are far larger than needed for text prediction — one reason Swype is so much better at gesture typing is because it uses a smaller dictionary, and I painfully proved this by going through a portion of the HeliBoard dictionary and seeing if each word was in Swype, a great deal were not, but I stopped because doing the whole dictionary would have taken a month! ~160,000 words!!! And ~400,000 words in the experimental dictionary!!! 🤷🏻♂️ 🤦🏻♂️
These large dictionaries don’t appear to be such a problem when you are touch typing; however, they become VERY PROBLEMATIC WHEN GESTURE TYPING! Using HeliBoard “out of the box,” so to speak, with its ~160,000 word dictionary, for gesture typing proved to be THE WORST KEYBOARD FOR GESTURE TYPING I’VE EVER USED. The number of typos is significantly higher, the word suggestions and predictions are often extremely irrelevant, and it quickly became an exercise in frustration using HeliBoard for gesture typing.
Why was it so bad? The large number of words in the dictionary, combined with the word rankings in the dictionary, does not play well with the gesture input method. While touch typing, each letter acts as a filter, slowly filtering out words based on rank and spelling. Gesture typing isn’t able to be as accurate, and it doesn’t seem to use the word ranking as any type of filter. The gesture is interpreted by the keyboard very literally: it guesses what letters you meant to “type” in your gesture and then applies text prediction and spell checking algorithms to guess at your intended word, but it just isn’t that accurate at guessing the letters you meant to gesture in the first place. This is seen when you drastically limit the available number of words: the literal gesture interpretation will appear as a suggestion, and boy is it off.
Here’s a screenshot of HeliBoard with a reasonably accurate gesture interpretation:
I gestured for the word ‘have’, as seen in the text field, and you can see the far right word suggestion is ‘habe’, which is NOT in my dictionary; that’s the literal interpretation of the gesture. Often the interpretation is wildly off and yet you’ll get the correct word. However, this points us to why a large dictionary is a problem: there are so many words that can be considered close enough to the gesture interpretation that you get a lot more errors. It’s a simple probability issue. A gesture is going to match more words when there are more words to match against, especially words that are similar. This is also why gesturing for longer words is less problematic than short words: the longer gesture acts like the “filter” you usually get letter by letter when touch typing.
It’s actually amazing seeing this, and the fact that the keyboard can get anything correct. The guess at the gesture isn’t the whole problem. It’s this “guesswork” combined with the combination spell checking (Gboard no longer uses this in gesture typing) and a very wide or “liberal” guessing by the predictive text algorithm.
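The probability argument can be illustrated with a small sketch. It treats a dictionary word as a plausible match when it differs from the literal gesture reading by at most one letter; that one-substitution rule is a stand-in I chose for illustration, not how a real gesture decoder measures closeness, and both word lists are invented.

```python
def within_one_substitution(a, b):
    """True if a and b have equal length and differ in at most one letter."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1


def candidates(interpretation, word_list):
    """Words that a sloppy gesture reading could plausibly stand for."""
    return [w for w in word_list if within_one_substitution(interpretation, w)]


small = ["have", "give", "live", "love"]
large = small + ["hare", "hate", "haze", "hake", "nave", "wave", "halve"]

if __name__ == "__main__":
    print(candidates("habe", small))
    print(candidates("habe", large))
```

The same reading ‘habe’ has a single plausible match in the small list and five in the large one, so the keyboard has many more chances to pick the wrong word.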
If you take a moment and read my comparison of HeliBoard & Gboard you’ll see how Google has made changes to improve gesture typing and that’ll help this all make sense. In fact, I’m going to include that comparison now. 🤷🏻♂️
HeliBoard vs Gboard
This is a comparison between the current versions, May 18, 2024, of HeliBoard & Gboard, focused on the differences in typing — what changes Gboard has made to improve typing and their consequences.
3 ways of spell checking
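First we need to understand some basics about computer spell checking. There are three basic approaches to spell checking: matching from the beginning of the word, matching from the end of the word, and a combination of the two.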
In other words, a spell checker will match from either the beginning of the word or from the end of the word, thus assuming you correctly spelled that part of the word. So you can see having a spell checker that works in both directions has a better chance of correcting the word than one that only works from one direction.
Back in the early days of personal computers it was normal to have a spell checker that only worked in one direction. What this means is that the comparison algorithm between what you typed and the dictionary word list compared the words starting from only the beginning or only from the end of the word, therefore, all suggestions for correct spelling were based on that part of the word and if you got that wrong the spell checker failed to find the correct spelling. Obviously the best spell checker would combine these two methods, thus, giving you the best probability of getting a correctly spelled word.
This is easy to test. By intentionally misspelling a word either at its beginning or end you can see which type of spell checker is being used.
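Here is a toy version of the three approaches, scored by how many letters match from the beginning, from the end, or both. Real spell checkers use more elaborate measures, so treat this purely as an illustration of the idea; the misspelling ‘jelous’ (for ‘jealous’) and the three-word list are just test data.

```python
def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def forward_score(typed, word):
    """Match from the beginning of the word only."""
    return common_prefix_len(typed, word)


def backward_score(typed, word):
    """Match from the end of the word only."""
    return common_prefix_len(typed[::-1], word[::-1])


def combined_score(typed, word):
    """Match from both directions."""
    return forward_score(typed, word) + backward_score(typed, word)


def best_match(typed, word_list, score):
    """Pick the word with the highest score for the typed input."""
    return max(word_list, key=lambda w: score(typed, w))


words = ["jelly", "jealous", "famous"]

if __name__ == "__main__":
    for name, score in [("forward", forward_score),
                        ("backward", backward_score),
                        ("combined", combined_score)]:
        print(name, best_match("jelous", words, score))
```

Because the error sits near the start of the word, the forward-only checker drifts to ‘jelly’, while the backward and combined checkers recover ‘jealous’.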
AOSP keyboard spell checker
AOSP keyboard, Openboard, and HeliBoard are clearly using a combination spell checker. This is great for correcting spelling errors but it has a down side. The negative is that it’ll give a broader range of suggestions, and this seems to directly impact the text prediction algorithm — I don’t know how or if the spell checker & text prediction tie together, I’m sure they do somehow.
The important thing to learn here is how using these different spell checking algorithms can impact text prediction and therefore typing input.
The test is simple, “jelous” is incorrectly spelled at the beginning, it’s missing the ‘a’. Understand that a spell checker that only runs from the beginning of the word to its end will not be able to correct the spelling error. HeliBoard correctly corrects this word whether you tap type it or gesture type it. The current Gboard does not catch this spelling mistake when gesture typed! It only corrects it when tap typing. Why would Gboard do this?
Gboard’s new gesture typing
Gboard recently became a lot faster at text prediction when gesture typing. Now Gboard will predict your gestured word within a few gestured letters. Previously, as HeliBoard still does, you needed to gesture the whole word, as it would guess at other words along the gesture path.
How did they achieve this improvement? Isn’t it obvious? They changed the “spell checking” to assume that you are correctly gesture typing the word — you are assumed to be a perfect speller. 🤣 This works great for text prediction when gesture typing but completely fails when you misspell a word.
Gboard still has the better spelling correction when you tap type the word. Apparently they want us gesture typing users to become better spellers 😂
In all fairness this is probably the best decision. The improvement in gesture typing is obvious!
Furthermore it seems that Gboard has changed its predictive text algorithm.
Predictive text
It’s easy to see predictive text at work when tap typing — each letter acts like a filter narrowing the list of possible words. HeliBoard currently allows the user to swipe up on the suggestion view to access more suggestions. This allows us a window into HeliBoard’s predictive text algorithm.
Just start by tapping ‘wha’ then look into the extra suggestions. You will see suggestions starting with other letters than ‘w’. Why? Because the predictive text algorithm assumes you might have hit the wrong key thus searches for possible matches using assumed typos, essentially.
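A rough sketch of how assumed typos widen the candidate list: each typed letter is swapped for a neighboring key and the dictionary is searched with every variant. The neighbor map is a small hand-written subset of a QWERTY layout and the whole approach is an assumption about the general idea, not HeliBoard's actual algorithm.

```python
# Partial, hand-written QWERTY neighbor map (an assumption, not exhaustive).
NEIGHBORS = {
    "w": "qeas",
    "h": "gjyubn",
    "a": "qwsz",
}


def prefix_variants(typed):
    """Return the typed prefix plus every single-key substitution of it."""
    variants = {typed}
    for i, ch in enumerate(typed):
        for alt in NEIGHBORS.get(ch, ""):
            variants.add(typed[:i] + alt + typed[i + 1:])
    return variants


def broad_suggestions(typed, word_list):
    """Words that start with the typed prefix or a near-miss of it."""
    variants = prefix_variants(typed)
    return [w for w in word_list if any(w.startswith(v) for v in variants)]


words = ["what", "whale", "wharf", "shall", "zebra"]

if __name__ == "__main__":
    print(broad_suggestions("wha", words))
```

Typing ‘wha’ pulls in ‘shall’ because ‘s’ sits next to ‘w’, which is the kind of suggestion starting with another letter that shows up in the extra suggestions.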
With Gboard it’s hard to tell what’s going on because you only get to see what you are typing and two other suggestions, and gesture typing gives you only the singular preview in Gboard where HeliBoard gives 3 or 4 depending on the settings.
What can we learn here? Really the only thing we can take away is how broad HeliBoard predictive text is or is not.
From what I’ve seen, when using the full English US dictionary, HeliBoard is very broad in its predictive text algorithm. VERY BROAD! This would account for the need to type nearly the entire word, whereas the current Gboard gets you to the desired word a lot faster. Thus, it’s my opinion that Gboard has narrowed its predictive text algorithm just as it has narrowed things in the gesture typing. Gboard now assumes, more than the algorithm previously did, that you, the user, are accurate in your typing.
In other words, it seems Gboard has “tightened” up the algorithm significantly by assuming correct spelling in the gesture typing and assuming accurate tapping when touch typing or gesture typing.
Comparing these two versions of the keyboards gives us insight into how certain adjustments can affect typing.
There are other things that Google has broken in the current Gboard. As is always the case with Google: one step forward, one step right, one step backward and one step to the left 😂
Individualized Dictionary continued
So what’s the fix? If a user or developer spends any time fully contemplating the use of the dictionary in spell checking vs predictive text typing they ought to come to the “holy grail” for predictive text typing, and that is a dictionary (word list) that is tailored to each individual user — meaning the dictionary would contain only the words that individual actually uses and ranked ONLY according to how that individual user actually types.
At first thought you might think this is impossible, but the reality is, all good modern soft keyboards are already doing exactly this. They just aren’t decoupling it from the main dictionary, and they continue to use their arbitrary word rankings!
On one hand this is a very simple thing but because of certain design choices by Google there are hurdles to achieving this.
You may ask yourself at this point, if this is the holy grail then why haven’t the big tech companies done it? Maybe they haven’t thought of it? But most likely, doing so doesn’t fit their goals. Their goal is obtaining your data — they only improve the keyboard because if they didn’t you’ll use a different keyboard and your data will go elsewhere. Lock-in is always the goal for all big tech companies, and this is contrary to lock-in. Or they’re just lazy 🤷🏻♂️
Developing the Individualized Dictionary
This is where the real work for developers resides. Being able to effectively implement this will create a significant improvement in the user’s overall typing experience, especially over the long term. That’s obvious once you realize their individualized dictionary would be less than 20,000 words, and more likely ~5,000 words for most users, but that’s a guess.
How many words does the average person use?
That’s a key question! Sources put the average English speaking person’s vocabulary between 20 and 50 thousand words; however, that includes both passive and active vocabulary. Active is usually half that, and that’s broken down into active spoken vs active written, with active written being lower. Therefore, you can see that for a smartphone user having a text prediction dictionary of ~160,000 words is complete insanity. If they’re only using ~10k words then what are the other 150,000 words doing? Creating confusion! That’s all the extra words do! Create confusion for the gesture algorithm. Create confusion for the text prediction algorithm. This is what I call “word noise”, and the more we can reduce “word noise” the more clearly the algorithms can differentiate the user’s input and the faster the algorithms can accurately guess the desired word. This is also true for next word prediction. This leads to fewer errors, fewer typos, fewer inaccurate suggestions, and much faster prediction! I actually get the words I use suggested within a few taps, … but I mostly gesture type so 🤷 at least my typos decreased significantly.
I’ve done a significant amount of testing on this, and much of what I’m saying should be blatantly obvious: the fewer unnecessary words in the dictionary, the better the typing & prediction. In my tests with HeliBoard the improvement is significant, actually placing HeliBoard on par with Swype, and this isn’t even close to what could be achieved over time! However, this lies solely in the hands of developers! I can only point it out.
So, how can this be achieved?
Text Prediction vs Spell Checker
First we need to understand what the dictionary is used for. It has two primary purposes in Android: text prediction and spell checking.
The spell checker needs as large a dictionary as possible, but this is contrary to the goals of the text prediction dictionary.
Secondly, we need a text prediction dictionary from which the user can “pull” words into their individualized dictionary — you can’t expect them to build from nothing, especially the spelling challenged. 🤣
Both these dictionaries are going to be significantly larger than the user needs.
First, let’s deal with the spell checker dictionary. This only needs two main things:
That’s it! We don’t need to worry about word frequency, possibly offensive, etc. Why? Because everything is given an f=0 rank. But how is this implemented without totally killing the text prediction of the keyboard?
stopped here
There are many ways to fulfil this idea — breaking the spell check dictionary off from the text prediction dictionary, only using a smaller dictionary that words can be deleted from, a learning mode to build a dictionary, etc. However I doubt any developer will bother and users that want as small a dictionary as possible to improve their typing performance will need to build their own custom dictionary.
DIY Custom Dictionary
Why should you build your own dictionary? The smaller the dictionary, the more accurate gesture typing becomes and the fewer errors or typos you’ll get. Furthermore, the smaller the dictionary, the better text prediction & next word suggestion work, though the next word suggestions take time to build up that learned profile. This custom dictionary will also benefit any keyboard that reads the system personal dictionary, like Gboard. Keyboard developers have always gone, and continue to go, the route of increasing the size of the dictionaries in some insane attempt at “completeness”, thereby degrading the typing experience of the user. Building your own is really the only route currently available to a user who can’t compile a binary AOSP dictionary. In my experience, going this route is the only way to have decent gesture typing!
Pros
Cons
A user can build their own dictionary over time or quickly if they have a significant amount of their own writing to get a quick start, or use a generic word list.
The first thing to decide is, do you want the slimmest possible dictionary — a dictionary containing only words you ACTUALLY use — or a reasonably small generic dictionary created from word lists, or a combination of the two?
Using a tool like this online Word Frequency Counter, a user can paste a large body of their writings and get a list of words to build their dictionary. Clean the list of numbers, commas, and other artifacts and spell check it. Then use User Dictionary Manager (UDM) or Multiling O Keyboard to import the words into the user dictionary. You’ll need to install the empty.dict file into whatever language you decide to use.
That will block the installed dictionary and allow gesture typing to still work, once you have some words in the personal dictionary.
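If you would rather not paste your writing into an online tool, the word-extraction step described above can be sketched locally in a few lines of Python. The function name and the cleaning rules are my own assumptions; the output still needs a spell check, and the exact file format the import tools expect is something to verify in the tools themselves.

```python
import re
from collections import Counter


def personal_word_list(text, min_count=1):
    """Return the distinct words in `text`, most frequent first."""
    # Keep letters and in-word straight apostrophes; digits, commas and
    # other punctuation are dropped. Everything is lowercased, so proper
    # nouns need their capitals restored by hand afterwards.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter(words)
    return [w for w, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    sample = "The keyboard's fine. I use the keyboard daily, 24/7!"
    print("\n".join(personal_word_list(sample)))
```

Raising `min_count` trims one-off words, which is one way to keep the dictionary as slim as possible.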
I have settled on using ‘en_US’ as my main language; however, I installed the main en_US dictionary from the HeliBoard dictionary repository into English United Kingdom (en_GB). This way I can switch from US to GB whenever I need to spell check something that’s not in my DIY custom dictionary and I’ve forgotten how to spell 🤣 😂 The keyboard layouts & symbols are close enough to not be a problem.
By using a different language you can use the alternative language when you need to spell check just by switching language. This makes “building” the dictionary a little less painful.
For gesture typing to work you must have some words either in an installed dictionary or the personal dictionary.
If you use User Dictionary Manager (UDM) to import the words, all words will apparently be ranked 255 🤷 However, because they are all ranked the same they are effectively equal. It seems to work reasonably well, a lot better than the pre-installed dictionary.
If you use Multiling O Keyboard to import the words, be aware that it deletes everything from all personal dictionaries upon import, it cannot do shortcuts, and it cannot import into the ‘All Languages Dictionary.’ You can, however, set the word ranking on import, but if you allow the keyboard to automatically add words to the dictionary they’ll be added at 250 🤦🤦🤦 Therefore I’ve found it best to build the dictionary using User Dictionary Manager (UDM), leaving everything at the 255 ranking. 🤷 Once you’ve built the dictionary to a reasonable size you can then decide if you want to import it with rankings via Multiling O Keyboard. By using Multiling you could import a large number of words as a “base dictionary” all set to 0, and then use User Dictionary Manager (UDM) to import words you have actually used at the high ranking. This would give you a reasonable “spell checking dictionary”, and those lower ranked words won’t be suggested as much unless you explicitly type them, at which point they’ll become part of the internal word list and internal ranking via the algorithms.
DO NOT USE THE NO LANGUAGE (ALPHABET) as the personal dictionary because this “language” personal dictionary is not accessible from the settings, cannot be imported to properly, and has many other bugs related to it!
DIY dictionary tips
Add a shortcut for “conflicting” words such as its vs it’s or good vs God. You can use spaces, commas, or periods in the gesture shortcut to differentiate between two such words.
Example: shortcut for it’s = ‘it s’
Gesturing quickly from the ‘t’ to the space bar, with a slight pause on the space bar, gives a good clear indication and seems to work well. The period key and comma key are great as well, just harder to hit, but they are very specific. You can also combine the space, comma, and period keys for more specificity. I personally get tired of correcting certain words every time I type them!
Predict Numbers
Add numbers from 0 to 999 into your dictionary and HeliBoard will show number suggestions when you type letters from the top row of the qwerty keyboard.
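Typing a thousand numbers by hand is no fun, so here is a minimal sketch that generates the list. It assumes your import app accepts a plain text file with one word per line, which is an assumption and not something confirmed for UDM or Multiling; the function names and the file name are made up for illustration.

```python
from pathlib import Path


def number_words(start: int = 0, stop: int = 999) -> list[str]:
    """Return the numbers start..stop (inclusive) as strings."""
    return [str(n) for n in range(start, stop + 1)]


def write_word_list(words: list[str], path: str) -> None:
    """Write one word per line (assumed import format, not confirmed)."""
    Path(path).write_text("\n".join(words) + "\n", encoding="utf-8")
```

Calling `write_word_list(number_words(), "numbers_0_999.txt")` would write a 1,000-line file that you can then adapt to whatever layout your import tool expects.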
Predict Emojis
Add the emoji as the word and a shortcut.
Example: shortcut = puke, word = 🤮
See the import & export tutorial for more tips on using User Dictionary Manager (UDM) & Multiling O Keyboard for importing & exporting words.
Improved Gboard typing!
Since I switched to using the en_US personal dictionary, the same one Gboard pulls from but doesn’t add words to, I noticed that Gboard’s typing improved as well. The reason is simple and obvious, and it can be applied to HeliBoard’s full installed dictionary. Once you have built up a significant list of words you actually use in the personal dictionary, they act as a “weight” on the keyboard’s regular dictionary: since I’ve been leaving the word ranking set to its default for most words in the personal dictionary, the keyboards see these words as “more important” than any other words. And since these really are my “most used words,” they seem to help clarify suggestions and gesture typing. Gboard has become nearly typo free. HeliBoard still has issues with short words like ‘as’: it behaves as if the gesture is slightly off from the keys, so I need to exaggerate or understate the gesture depending on the word, which is somewhat annoying!
I tested this same idea with HeliBoard in some of my initial tests: load the personal dictionary with more commonly used words, 5,000-30,000 of them, while keeping the main installed dictionary enabled. This helped HeliBoard gesture type better. However, HeliBoard has such a massive dictionary (160,000+ words) and bigrams (next word recommendations) for every word that, while it was more usable for gesture typing, it wasn’t close to Gboard. Those tests used generalized words, though, not a dictionary I built from my own typing. Personally, I don’t see myself using HeliBoard with a regular dictionary from this point forward. If I were to do that, I would switch fully to Gboard, as GrapheneOS allows me to block the internet connection per app, so Google privacy invasion isn’t an issue for me. 🤷🏻♂️
Really, the only benefit of using HeliBoard at this point in its development, other than privacy (and there are other keyboards just as private), is the ability to build your own dictionary, IMO.
Nonetheless it’s truly impressive how much better typing is on both HeliBoard and Gboard after implementing this “individualized dictionary” and it really saddens me that developers, for over a decade, have missed or blatantly ignored this and therefore given all users substandard typing on Android 😡
HeliBoard Setting Adjustments
Some settings in HeliBoard, when adjusted, seem to improve gesture typing accuracy.
The optimum settings I’ve landed on are:
This gives a slightly large keyboard; however, with the background color set just slightly different from the key background, it’s proving to be dead-on accurate for gesture typing... but that’s for my phone, so 🤷🏻♂️ I don’t know if some of it is visual cues or due to how the glide library has been hacked into HeliBoard, but it seems better to me 🤷🏻♂️ whereas Gboard works perfectly fine at whatever size I set. Just an FYI.
Fully Functional Toolbar
I hear you say, but we have that and even better we have editing layouts! No, you don’t have a fully functional toolbar. There is not one keyboard with a well designed & well executed toolbar. NOT A SINGLE ONE!
Now this is something painfully obvious, but here we are in 2024 and developers haven’t pulled this simple task off.
By far Yandex has the most functional of the toolbars I’ve tested, however, it leaves out a lot of functionality simply because the developers, apparently, aren’t capable of using a single element for multiple functions or making a toolbar that can be scrolled 🤷 hopefully we can improve on it.
Screenshot of the Yandex Keyboard toolbar:
Toolbar vs Editing Layout
Which is best? Swype went with an editing layout, and those that have included any type of text manipulation have done the same, except Yandex, and I have to say, I think Yandex is on the right path. The only real downside of a toolbar is the sizing of the elements used: developers often forget that people with fingers much larger than the mouse pointer they’re using onscreen will be using these features! Otherwise, everything that is in an editing layout can be fitted into a scrollable toolbar; just a little thought will get us there. Furthermore, the toolbar can be available across every layout the keyboard has, always available and therefore always usable, whereas an editing layout can often be very clumsy to use: having to switch to symbols to add that punctuation, then to clipboard, and then back, etc. Personally, I’ve always been somewhat frustrated using Gboard’s editing layout and clipboard layout because I’m always having to go to another layout and then back. Therefore, IMO, it’s best to make such tools “always available” so that they require less transitioning.
Toolbar Design
Primary Elements (5 TTL):
Secondary Elements (2 TTL):
Tertiary Elements (4 TTL):
Ancillary Elements
Primary + secondary + tertiary = 11 elements. That’s one more than the top keyboard row and a rather tight fit if all these elements were to be forced into that width and not scrollable.
Note: items like clipboard & microphone may be better placed in an always available area.
Ancillary elements can turn the toolbar into a junk drawer — I’m looking at you Openboard/HeliBoard. Therefore the developer has to ask, is this element really needed in the toolbar or can it be placed in a more suitable place? Most ancillary items can be placed in the long press of the enter key or comma. Is it really something you need fast access to? Do you need it frequently?
Of course the developer can always add everything and allow the user to turn off elements but the toolbar needs to be scrollable!
Toolbar Features
Voice-Input
SayBoard recently released an update that included an “editable keyboard” demonstrating something I’ve always complained about — why can’t the user have access to the keyboard, at least punctuation, while doing voice input?
SayBoard with its editable layout, customized to my liking:
As you can see, my SayBoard layout has most of the elements I would use when doing voice input. How much better would it be with a fully functional keyboard?
Why don’t the keyboard and voice input work as one?
Because Google did it the other way: each is its own input method, and you switch between the two. And everyone else keeps on doing it the same way.
But SayBoard with its new layout demonstrates that you can have continuous voice input while entering punctuation, numbers, etc. So why aren’t keyboard developers integrating voice to text in a way that allows the user to keep using the full keyboard during voice input? 🤷
There are a few things one needs to consider about this integration.
First, the voice to text must be a fast streaming method that outputs the text quickly, not the current trend of AI modules that need to think about what the user has said, format the text, add in their own hallucinations, etc. No, this is voice to text input with the user knowing they need to add punctuation, symbols, numbers, etc. This is interactive, not a secretary. This is for speed: one time through and you are done, no need to go back and edit everything!
Second, the user cannot be transitioning to a different input method. I’d hope this would be obvious, but I’m not assuming anything anymore. The microphone should be the only thing changing: gray for offline, white outline for getting ready, solid white for ready for input, etc. Very simple! But has it ever been done? NO!
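To make the “only the microphone changes” idea concrete, here is a toy model in Python. It is a sketch under my own assumptions, not how SayBoard, HeliBoard, or any real keyboard is implemented: the class names, the `engine_ready` callback, and the single `qwerty` layout are all hypothetical. The point is that typed keys and recognized speech go into the same text stream, and the layout never switches.

```python
from enum import Enum


class MicState(Enum):
    OFFLINE = "gray"
    PREPARING = "white outline"
    READY = "solid white"


class Keyboard:
    """Toy model: the layout never changes, only the mic indicator does."""

    def __init__(self):
        self.layout = "qwerty"  # hypothetical layout name
        self.mic = MicState.OFFLINE
        self.text = ""

    def tap_mic(self):
        # Tapping starts the recognizer, or switches it off if it is running.
        if self.mic is MicState.OFFLINE:
            self.mic = MicState.PREPARING
        else:
            self.mic = MicState.OFFLINE

    def engine_ready(self):
        # Hypothetical callback fired once the recognizer has loaded.
        if self.mic is MicState.PREPARING:
            self.mic = MicState.READY

    def type_key(self, ch):
        # Keys work in every mic state.
        self.text += ch

    def hear(self, words):
        # Speech is only accepted while the mic shows "ready".
        if self.mic is MicState.READY:
            self.text += words
```

Saying “hello”, tapping the comma key, and then saying “ world” all land in the same text field without ever leaving the keyboard.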
What has been forced on the users for years is to switch to voice input, talk, switch back to keyboard and edit the input, because so much of it is wrong or missing punctuation or new paragraph, etc…
This is such a simple and obvious thing that open-source keyboards could implement today to give themselves an edge.
Of course there will be those users that want the AI secretary approach, which is fine as there are modules that do not integrate into the keyboard that are available, lots of them. These types add no benefit to the keyboard, they are a fully separate input method.
Keyboard Layouts
Compact Layout
T9 Layout
Import Gboard Words
Gboard uses its own internal personal/user dictionary. While it looks like the system’s personal dictionary, it is not, and therefore no other keyboard can see your Gboard words. This tutorial will show you how to get your words out of Gboard and import them into the system’s personal dictionary, and I’ll show you how you can manipulate the words’ “rank” en masse.
Step one: export from Gboard
Step two: import
Step three: alt import & word “rank”
HeliBoard gives a “rank” (I don’t know what it’s actually called) between 0-255 to every word in the dictionary and in the personal dictionary it gives all words the maximum weight of 255. This 255 rank can be problematic.
The numbers from 0-255 correspond as:
This means a word with 255 will be highly recommended in the suggestions. This may not be what you want, and you may want to lower the weight; however, the only way to adjust this number in HeliBoard is one word at a time via the personal dictionary. Furthermore, there is no way to back up these words with their ranking number.
The solution, partially, is “Multiling O Keyboard” in the Google Play Store here.
Multiling O Keyboard hasn’t been updated in a long time, and its user dictionary can’t be installed on my Pixel 6a running GrapheneOS (Android 14), but the keyboard itself does install. Another reason keyboard developers need to ensure quality backups.
Once you have it installed and enabled, in the settings for Multiling O Keyboard you’ll see:
If you have imported your words and set up Multiling O Keyboard so it’s using the user dictionary, you’ll see all your words listed with their weight number and the dictionary they are stored in:
Notice the first and last line? Those lines are required by Multiling O Keyboard to import words. This list can’t be used by User Dictionary Manager (UDM)!
From this settings screen you can copy all your words with their corresponding ranking and save them in a txt file = export. This makes “batch” changing of the weight number easy (find & replace) and gives you an easy way to back up those numbers with their words.
Remember 0-255 are the numbers you can use, the higher the more important the word is. The number 0 doesn’t seem to do anything in the personal dictionary — in the main dictionary it’s used to signify a word that is NOT to be shown.
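If find & replace gets tedious, the same batch change can be scripted. The sketch below assumes a made-up layout of one `word<TAB>rank` pair per line. That is NOT Multiling’s actual export format, so you would need to adapt the parsing to what your exported file really looks like. Any line that doesn’t match the assumed layout (such as required first and last lines) is passed through untouched. The function name `set_rank` is hypothetical.

```python
def set_rank(lines, new_rank, only_if_rank=None):
    """Rewrite the rank on every parsable 'word<TAB>rank' line.

    Assumed, hypothetical layout: one 'word<TAB>rank' pair per line.
    Lines that do not match (for example header or footer lines) are
    passed through unchanged.
    """
    if not 0 <= new_rank <= 255:
        raise ValueError("rank must be between 0 and 255")
    out = []
    for line in lines:
        word, sep, rank = line.rpartition("\t")
        if sep and rank.strip().isdecimal():
            # Optionally touch only the words that currently have a given rank.
            if only_if_rank is None or int(rank) == only_if_rank:
                line = f"{word}\t{new_rank}"
        out.append(line)
    return out
```

For instance, `set_rank(lines, 200, only_if_rank=255)` would drop every word currently at 255 down to 200 and leave everything else alone.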
The “null” dictionary (what you get if you used “all languages” when importing with User Dictionary Manager (UDM)) doesn’t seem to work very well as an import target in Multiling; it’s best to use a language-specific dictionary (en_US).
See the section on DIY Custom Dictionary for tips to build an ultralight dictionary to improve gesture typing & text prediction.
Resources
These are the Android apps, online tools, and source materials that I used in developing the word lists and doing this research.
Android apps
Online Tools
Source Material
Tutorials