I've had an interesting meeting with Jens Petersen yesterday about input methods. Jens is one of the i18n guys working for Red Hat.
Input methods are a way of merging several typed symbols into one actual symbols. Western languages rarely use them (the compose key isn't quite the same), but many eastern languages rely on them. To give one (made up) example, an IM setup allows you to type "qqq" and converts it into the chinese symbol for tree.
Unfortunately, IM implementations are somewhat broken and rely on a multitude of hacks. Right now, IM implementations often need to hook onto keycodes instead of keysyms. Keycodes are a numerical value that is usually the same for a key (except when it isn't). So "q" will always be the same keycode (except when it isn't). In X, a keycode has no meaning other than being an index into the keysym table.
Keysyms are the actual symbols that are to be displayed. So while the "q" key may have a keycode of 24, it will have the keysym for "q" in qwerty and the keysym for "a" in azerty.
And here's where everything goes wrong for IM. If you listen for keycodes, and you switch drivers, then keycode 24 isn't the same key anymore. If you listen for keysyms and you switch layout, keysym "q" isn't the same key anymore. Oops.
During a previous meeting and the one yesterday, we came up with a solution to fix them properly.
Let's take a step back and look at keyboard input. The user hits a physical key, usually because of what is printed on that key. That key generates a keycode, which represents a keysym. That keysym is usually the same symbol as what is printed on the keyboard. (Of course, there are exceptions to that with the prime example being dvorak layout on a qwerty physical keyboard)
In the end, IM should aim to provide the same functionality, with the added step of combining multiple symbols into one.
For IM implementations, we can differ between two approaches:
In the first approach, a set of keysyms should combine to a final symbol. For example, typing "tree" should result in a tree symbol. This case can be fixed easily by the IM implementation only ever dealing with keysyms. Where the key is located doesn't matter and it works equally well with us(qwerty) and fr(dvorak). As a mental bridge: if the symbols come in via morse code and you can convert to the correct final symbol, then your IM is in this category. This approach is easy to deal with, so we can close the case on it.
In the second approach, a set of key presses should combine to a final symbol. For example, typing the top left key four times should result in a tree symbol. In this case, we can't hook onto keysyms because they may change with the layout. But we can't hook onto keycodes either because they are essentially random.
Wait. What? Why does the keysym change with the layout?
Because we have the wrong layout selected. If you're trying to type Chinese, you shouldn't have a us layout. If you're trying to type Japanese, you shouldn't have a french layout. Because these keysyms don't represent what the key is supposed to do. The keysyms are supposed to represent what is printed on the keyboard, and those symbols are Chinese, Japanese, Indic, etc. So the solution is to fix up the keysyms. Instead of trying to listen for a "q", the keyboard layout should generate a "tree" keysym. The IM implementation can then listen for this symbol and combine to the final symbol as required.
This essentially means that for each keyboard with intermediate symbols there should be an appropriate keyboard layout - just as there is for western languages. And once these keysyms are available, the second approach becomes identical to the first approach and it doesn't matter anymore where the physical key is located.
The good thing about this approach are that users and developers can leverage existing tools for selecting and changing between different layouts. (bonus points for using the word "leverage") It also means that a more unified configuration between standard DE tools and IM tools is possible.
For the IM implementation, this simplifies things by a bit. First of all, it can listen to the XKB group state to adjust automatically whether IM is needed or not. For example, if us(qwerty) and traditional chinese are configured as layouts, the IM implementation can kick in whenever the group toggles to chinese. As long as it is on us(qwerty), it can slumber in the background.
Second, no layout-specific hacks are required. The physical location of the key, the driver, they all don't matter anymore. Even morse-code is supported now ;)
Talking to Jens, his main concern is that XKB limits to 4 groups at a time. This restriction is built into the protocol and won't disappear completely anytime soon. Though XI2 and XKB2 address this issue, it will take a while to get a meaningful adoption rate. Nonetheless, the approach above should make IM for the large majority of users more robust and predictable, without the issues coming up whenever hacks are involved.
I think this is the right approach, Jens agrees and Sergey Udaltsov, the xkeyboard-config maintainer too. So now we just need to get this implemented, but it will take a while to sort out all the details and move all languages over.