Monday, August 31, 2020

User-specific XKB configuration - part 3

This is the continuation from these posts: part 1, part 2

Let's talk about everyone's favourite [1] keyboard configuration system again: XKB. If you recall the goal is to make it simple for users to configure their own custom layouts. Now, as described earlier, XKB-the-implementation doesn't actually have a concept of a "layout" as such, it has "components" and something converts your layout desires into the combination of components. RMLVO (rules, model, layout, variant, options) is what you specify and gets converted to KcCGST (keycodes, compat, geometry, symbols, types). This is a one-way conversion, the resulting keymaps no longer has references to the RMLVO arguments. Today's post is about that conversion, and we're only talking about libxkbcommon as XKB parser because anything else is no longer maintained.

The peculiar thing about XKB data files (provided by xkeyboard-config [3]) is that the filename is part of the API. You say layout "us" variant "dvorak", the rules file translates this to symbols 'us(dvorak)' and the parser will understand this as "load file 'symbols/us' and find the dvorak section in that file". [4] The default "us" keyboard layout will require these components:

xkb_keymap {
 xkb_keycodes  { include "evdev+aliases(qwerty)" };
 xkb_types     { include "complete" };
 xkb_compat    { include "complete" };
 xkb_symbols   { include "pc+us+inet(evdev)" };
 xkb_geometry  { include "pc(pc105)" };
};
So the symbols are really: file symbols/pc, add symbols/us and then the section named 'evdev' from symbols/inet [5]. Types are loaded from types/complete, etc. The lookup paths for libxkbcommon are $XDG_CONFIG_HOME/xkb, /etc/xkb, and /usr/share/X11/xkb, in that order.

Most of the symbols sections aren't actually full configurations. The 'us' default section only sets the alphanumeric rows, everything else comes from the 'pc' default section (hence: include "pc+us+inet(evdev)"). And most variant sections just include the default one, usually called 'basic'. For example, this is the 'euro' variant of the 'us' layout which merely combines a few other sections:

partial alphanumeric_keys
xkb_symbols "euro" {

    include "us(basic)"
    name[Group1]= "English (US, euro on 5)";

    include "eurosign(5)"

    include "level3(ralt_switch)"
};
Including things works as you'd expect: include "foo(bar)" loads section 'bar' from file 'foo' and this works for 'symbols/', 'compat/', etc., it'll just load the file in the same subdirectory. So yay, the directory is kinda also part of the API.

Alright, now you understand how KcCGST files are loaded, much to your despair.

For user-specific configuration, we could already load a 'custom' layout from the user's home directory. But it'd be nice if we could just add a variant to an existing layout. Like "us(banana)", because potassium is important or somesuch. This wasn't possible because the filename is part of the API. So our banana variant had to be in $XDG_CONFIG_HOME/xkb/symbols/us and once we found that "us" file, we could no longer include the system one.

So as of two days ago, libxkbcommon now extends the parser to have merged KcCGST files, or in other words: it'll load the symbols/us file in the lookup path order until it finds the section needed. With that, you can now copy this into your $XDG_CONFIG_HOME/xkb/symbols/us file and have it work as variant:

partial alphanumeric_keys
xkb_symbols "banana" {

    include "us(basic)"
    name[Group1]= "English (Banana)";

    // let's assume there are some keymappings here
};
And voila, you now have a banana variant that can combine with the system-level "us" layout.

And because there must be millions [6] of admins out there that maintain custom XKB layouts for a set of machines, the aforementioned /etc/xkb lookup path was also recently added to libxkbcommon. So we truly now have the typical triplet of lookup paths:

  • vendor-provided ones in /usr/share/X11/xkb,
  • host-specific ones in /etc/xkb, and
  • user-specific ones in $XDG_CONFIG_HOME/xkb [7].
Good times, good times.

[1] Much in the same way everyone's favourite Model T colour was black
[2] This all follows the UNIX philosophy, there are of course multiple tools involved and none of them know what the other one is doing
[3] And I don't think Sergey gets enough credit for maintaining that pile of language oddities
[4] Note that the names don't have to match, we could map layout 'us' to the symbols in 'banana' but life's difficult enough as it is
[5] I say "add" when it's sort of a merge-and-overwrite and, yes, of course there are multiple ways to combine layouts, thanks for asking
[6] Actual number may be less
[7] Notice how "X11" is missing in the latter two? If that's not proof that we want to get rid of X, I don't know what is!

Tuesday, August 25, 2020

libei - a library to support emulated input

Let's talk about eggs. X has always supported XSendEvent() which allows anyone to send any event to any client [1]. However, this event had a magic bit to make it detectable, so clients detect and subsequently ignore it. Spoofing input that just gets ignored is of course not productive, so in the year 13 BG [2] the XTest extension was conceived. XTest has a few requests that allow you to trigger a keyboard event (press and release, imagine the possibilities), buttons and pointer motion. The name may seem odd until someone explains to you that it was primarily written to support automated testing of X servers. But no-one has the time to explain that.

Having a separate extension worked around the issue of detectability and thus any client could spoof input events. Security concerns were addressed with "well, just ifdef out that extension then" which worked great until other applications started using it for input emulation. Since around ~2008 XTest events are emulated through special XTest devices in the server but that is solely to make the implementation less insane. Technically this means that XTest events are detectable again, except that no-one bothers to actually do that. Having said that, these devices only make it possible to detect an XTest event, but not which client sent that event. And, due to how the device hierarchy works, it's really hard to filter out those events anyway.

Now it's 2020 and we still have an X server that basically allows access to anything and any client to spoof input. This level of security is industry standard for IoT devices but we are trying to be more restrictive than that on your desktop, lest the stuff you type actually matters. The accepted replacement for X is of course Wayland, but Wayland-the-protocol does not provide an extension to emulate input. This makes sense, emulating input is not exactly a display server job [3] but it does leaves us with a bunch of things no longer working.

libei

So let's talk about libei. This new library for Emulated Input consists of two components: libei, the client library to be used in applications, and libeis, the part to be used by an Emulated Input Server, to be used in the compositors. libei provides the API to connect to an EIS implementation and send events. libeis provides the API to receive those events and pass them on to the compositor. Let's see how this looks like in the stack:

    +--------------------+             +------------------+
    | Wayland compositor |---wayland---| Wayland client B |
    +--------------------+\            +------------------+
    | libinput | libeis  | \_wayland______
    +----------+---------+                \
        |          |           +-------+------------------+
 /dev/input/       +---brei----| libei | Wayland client A |
                               +-------+------------------+
     
"brei" is the communication "Bridge for EI" and is a private protocol between libei and libeis.

Emulated input is not particularly difficult, mostly because when you're emulating input, you know exactly what you are trying to do. There is no interpretation of bad hardware data like libinput has to do, it's all straightforward. Hence the communication between libei and libeis is relatively trivial, it's all button, keyboard, pointer and touch events. Nothing fancy here and I won't go into details because that part will work as you expect it to. The selling point of libei is elsewhere, primarily in separation, distinction and control.

libei is a separate library to libinput or libwayland or Xlib or whatever else may have. By definition, it is thus a separate input stream in both the compositor and the client. This means the compositor can do things like display a warning message while emulated input is active. Or it can re-route the input automatically, e.g. assign a separate wl_seat to the emulated input devices so they can be independently focused, etc. Having said that, the libeis API is very similar to libinput's API so integration into compositors is quite trivial because most of the hooks to process incoming events are already in place.

libei distinguishes between different clients, so the compositor is always aware of who is currently emulating input. You have synergy, xdotool and a test suite running at the same time? The compositor is aware of which client is sending which events and can thus handle those accordingly.

Finally, libei provides control over the emulated input. This works on multiple levels. A libei client has to request device capabilities (keyboard, touch, pointer) and the compositor can restrict to a subset of those (e.g. "no keyboard for you"). Second, the compositor can suspend/resume a device at any time. And finally, since the input events go through the compositor anyway, it can discard events it doesn't like. For example, even where the compositor allowed keyboards and the device is not suspended, it may still filter out Escape key presses. Or rewrite those to be Caps Lock because we all like a bit of fun, don't we?

The above isn't technically difficult either, libei itself is not overly complex. The interaction between an EI client and an EIS server is usually the following:

  • client connects to server and says hello
  • server disconnects the client, or accepts it
  • client requests one or more devices with a set of capabilities
  • server rejects those devices or allows those devices and/or limits their capabilities
  • server resumes the device (because they are suspended by default)
  • client sends events
  • client disconnects, server disconnects the client, server suspends the device, etc.
Again, nothing earth-shattering here. There is one draw-back though: the server must approve of a client and its devices, so a client is not able to connect, send events and exit. It must wait until the server has approved the devices before it can send events. This means tools like xdotool need to be handled in a special way, more on that below.

Flatpaks and portals

With libei we still have the usual difficulties: a client may claim it's synergy when really it's bad-hacker-tool. There's not that much we can do outside a sandbox, but once we are guarded by a portal, things look different:

    +--------------------+
    | Wayland compositor |_
    +--------------------+  \
    | libinput | libeis  |   \_wayland______
    +----------+---------+                  \
        |     [eis-0.socket]                 \
 /dev/input/     /   \\       +-------+------------------+
                |      ======>| libei | Wayland client A |
                |      after    +-------+------------------+
         initial|     handover   /
      connection|               / initial request
                |              /  dbus[org.freedesktop.portal.EmulatedInput]
        +--------------------+
        | xdg-desktop-portal |
        +--------------------+
The above shows the interaction when a client is run inside a sandbox with access to the org.freedesktop.portal.Desktop bus. The libei client connects to the portal (it won't see the EIS server otherwise), the portal hands it back a file descriptor to communicate on and from then on it's like a normal EI session. What the portal implementation can do though is write several Restrictions for EIS on the server side of the file descriptor using libreis. Usually, the portal would write the app-id of the client (thus guaranteeing a reliable name) and maybe things like "don't allow keyboard capabilities for any device". Once the fd is handed to the libei client, the restrictions cannot be loosened anymore so a client cannot overwrite its own name.

So the full interaction here becomes:

  • Client connects to org.freedesktop.portal.EmulatedInput
  • Portal implementation verifies that the client is allowed to emulate input
  • Portal implementation obtains a socket to the EIS server
  • Portal implementation sends the app id and any other restrictions to the EIS server
  • Portal implementation returns the socket to the client
  • Client creates devices, sends events, etc.
For a client to connect to the portal instead of the EIS server directly is currently a one line change.

Note that the portal implementation is still in its very early stages and there are changes likely to happen. The general principle is unlikely to change though.

Xwayland

Turns out we still have a lot of X clients around so somehow we want to be able to use those. Under Wayland, those clients connect to Xwayland which translates X requests to Wayland requests. X clients will use XTest to emulate input which currently goes to where the dodos went. But we can add libei support to Xwayland and connect XTest, at which point the diagram looks like this:

    +--------------------+             +------------------+
    | Wayland compositor |---wayland---| Wayland client B |
    +--------------------+\            +------------------+
    | libinput | libeis  | \_wayland______
    +----------+---------+                \
        |          |           +-------+------------------+
 /dev/input/       +---brei----| libei |     XWayland     |
                               +-------+------------------+
                                                |
                                                | XTEST
                                                |
                                         +-----------+
                                         |  X client |
                                         +-----------+
XWayland is basically just another Wayland client but it knows about XTest requests, and it knows about the X clients that request those. So it can create a separate libei context for each X client, with pointer and keyboard devices that match the XTest abilities. The end result of that is you can have a xdotool libei context, a synergy libei context, etc. And of course, where XWayland runs in a sandbox it could use the libei portal backend. At which point we have a sandboxed X client using a portal to emulate input in the Wayland compositor. Which is pretty close to what we want.

Where to go from here?

So, at this point the libei repo is still sitting in my personal gitlab namespace. Olivier Fourdan has done most of the Xwayland integration work and there's a basic Weston implementation. The portal work (tracked here) is the one needing the most attention right now, followed by the implementations in the real compositors. I think I have tentative agreement from the GNOME and KDE developers that this all this is a good idea. Given that the goal of libei is to provide a single approach to emulate input that works on all(-ish) compositors [4], that's a good start.

Meanwhile, if you want to try it, grab libei from git, build it, and run the eis-demo-server and ei-demo-client tools. And for portal support, run the eis-fake-portal tool, just so you don't need to mess with the real one. At the moment, those demo tools will have a client connecting a keyboard and pointer and sending motion, button and 'qwerty' keyboard events every few seconds. The latter with client and/or server-set keymaps because that's possible too.

Eggs

What does all this have to do with eggs? "Ei", "Eis", "Brei", and "Reis" are, respectively, the German words for "egg", "ice" or "ice cream", "mush" (think: porridge) and "rice". There you go, now you can go on holidays to a German speaking country and sustain yourself on a nutritionally imbalanced diet.

[1] The whole "any to any" is a big thing in X and just shows that in the olden days you could apparently trust, well, apparently anyone
[2] "before git", i.e. too bothersome to track down so let's go with the Copyright notice in the protocol specification
[3] XTest the protocol extension exists presumably because we had a hammer and a protocol extension thus looked like a nail
[4] Let's not assume that every compositor having their own different approach to emulation is a good idea