Who-T

udev-hid-bpf: quickstart tooling to fix your HID devices with eBPF

2024-04-18T14:17:00.000+10:00

For the last few months, Benjamin Tissoires and I have been working on and polishing a little tool called udev-hid-bpf [1]. This is the scaffolding required quickly and easily write, test and eventually fix your HID input devices (mouse, keyboard, etc.) via a BPF program instead of a full-blown custom kernel driver or a semi-full-blown kernel patch. To understand how it works, you need to know two things: HID and BPF [2].

Why BPF for HID?

HID is the Human Interface Device standard and the most common way input devices communicate with the host (HID over USB, HID over Bluetooth, etc.). It has two core components: the "report descriptor" and "reports", both of which are byte arrays. The report descriptor is a fixed burnt-in-ROM byte array that (in rather convoluted terms) tells us what we'll find in the reports. Things like "bits 16 through to 24 is the delta x coordinate" or "bit 5 is the binary button state for button 3 in degrees celcius". The reports themselves are sent at (usually) regular intervals and contain the data in the described format, as the devices perceives reality. If you're interested in more details, see Understanding HID report descriptors.

BPF or more correctly eBPF is a Linux kernel technology to write programs in a subset of C, compile it and load it into the kernel. The magic thing here is that the kernel will verify it, so once loaded, the program is "safe". And because it's safe it can be run in kernel space which means it's fast. eBPF was originally written for network packet filters but as of kernel v6.3 and thanks to Benjamin, we have BPF in the HID subsystem. HID actually lends itself really well to BPF because, well, we have a byte array and to fix our devices we need to do complicated things like "toggle that bit to zero" or "swap those two values".

If we want to fix our devices we usually need to do one of two things: fix the report descriptor to enable/disable/change some of the values the device pretends to support. For example, we can say we support 5 buttons instead of the supposed 8. Or we need to fix the report by e.g. inverting the y value for the device. This can be done in a custom kernel driver but a HID BPF program is quite a lot more convenient.

HID-BPF programs

For illustration purposes, here's the example program to flip the y coordinate. HID BPF programs are usually device specific, we need to know that the e.g. the y coordinate is 16 bits and sits in bytes 3 and 4 (little endian):

SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(hid_y_event, struct hid_bpf_ctx *hctx)
{
	s16 y;
	__u8 *data = hid_bpf_get_data(hctx, 0 /* offset */, 9 /* size */);

	if (!data)
		return 0; /* EPERM check */

	y = data[3] | (data[4] << 8);
	y = -y;

	data[3] = y & 0xFF;
	data[4] = (y >> 8) & 0xFF;

	return 0;
}

That's it. HID-BPF is invoked before the kernel handles the HID report/report descriptor so to the kernel the modified report looks as if it came from the device.

As said above, this is device specific because where the coordinates is in the report depends on the device (the report descriptor will tell us). In this example we want to ensure the BPF program is only loaded for our device (vid/pid of 04d9/a09f), and for extra safety we also double-check that the report descriptor matches.

// The bpf.o will only be loaded for devices in this list
HID_BPF_CONFIG(
	HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, 0x04D9, 0xA09F)
);

SEC("syscall")
int probe(struct hid_bpf_probe_args *ctx)
{
	/*
	* The device exports 3 interfaces.
	* The mouse interface has a report descriptor of length 71.
	* So if report descriptor size is not 71, mark as -EINVAL
	*/
	ctx->retval = ctx->rdesc_size != 71;
	if (ctx->retval)
		ctx->retval = -EINVAL;

	return 0;
}

Obviously the check in probe() can be as complicated as you want.

This is pretty much it, the full working program only has a few extra includes and boilerplate. So it mostly comes down to compiling and running it, and this is where udev-hid-bpf comes in.

udev-hid-bpf as loader

udev-hid-bpf is a tool to make the development and testing of HID BPF programs simple, and collect HID BPF programs. You basically run meson compile and meson install and voila, whatever BPF program applies to your devices will be auto-loaded next time you plug those in. If you just want to test a single bpf.o file you can udev-hid-bpf install /path/to/foo.bpf.o and it will install the required udev rule for it to get loaded whenever the device is plugged in. If you don't know how to compile, you can grab a tarball from our CI and test the pre-compiled bpf.o. Hooray, even simpler.

udev-hid-bpf is written in Rust but you don't need to know Rust, it's just the scaffolding. The BPF programs are all in C. Rust just gives us a relatively easy way to provide a static binary that will work on most tester's machines.

The documentation for udev-hid-bpf is here. So if you have a device that needs a hardware quirk or just has an annoying behaviour that you always wanted to fix, well, now's the time. Fixing your device has never been easier! [3].

[1] Yes, the name is meh but you're welcome to come up with a better one and go back in time to suggest it a few months ago.
[2] Because I'm lazy the terms eBPF and BPF will be used interchangeably in this article. Because the difference doesn't really matter in this context, it's all eBPF anyway but nobody has the time to type that extra "e".
[3] Citation needed

Enforcing a touchscreen mapping in GNOME

2024-03-12T14:33:00.002+10:00

Touchscreens are quite prevalent by now but one of the not-so-hidden secrets is that they're actually two devices: the monitor and the actual touch input device. Surprisingly, users want the touch input device to work on the underlying monitor which means your desktop environment needs to somehow figure out which of the monitors belongs to which touch input device. Often these two devices come from two different vendors, so mutter needs to use ... */me holds torch under face* .... HEURISTICS! :scary face:

Those heuristics are actually quite simple: same vendor/product ID? same dimensions? is one of the monitors a built-in one? [1] But unfortunately in some cases those heuristics don't produce the correct result. In particular external touchscreens seem to be getting more common again and plugging those into a (non-touch) laptop means you usually get that external screen mapped to the internal display.

Luckily mutter does have a configuration to it though it is not exposed in the GNOME Settings (yet). But you, my $age $jedirank, can access this via a commandline interface to at least work around the immediate issue. But first: we need to know the monitor details and you need to know about gsettings relocatable schemas.

Finding the right monitor information is relatively trivial: look at $HOME/.config/monitors.xml and get your monitor's vendor, product and serial from there. e.g. in my case this is:

  <monitors version="2">
   <configuration>
    <logicalmonitor>
      <x>0</x>
      <y>0</y>
      <scale>1</scale>
      <monitor>
        <monitorspec>
          <connector>DP-2</connector>
          <vendor>DEL</vendor>              <--- this one
          <product>DELL S2722QC</product>   <--- this one
          <serial>59PKLD3</serial>          <--- and this one
        </monitorspec>
        <mode>
          <width>3840</width>
          <height>2160</height>
          <rate>59.997</rate>
        </mode>
      </monitor>
    </logicalmonitor>
    <logicalmonitor>
      <x>928</x>
      <y>2160</y>
      <scale>1</scale>
      <primary>yes</primary>
      <monitor>
        <monitorspec>
          <connector>eDP-1</connector>
          <vendor>IVO</vendor>
          <product>0x057d</product>
          <serial>0x00000000</serial>
        </monitorspec>
        <mode>
          <width>1920</width>
          <height>1080</height>
          <rate>60.010</rate>
        </mode>
      </monitor>
    </logicalmonitor>
  </configuration>
</monitors>

Well, so we know the monitor details we want. Note there are two monitors listed here, in this case I want to map the touchscreen to the external Dell monitor. Let's move on to gsettings.

gsettings is of course the configuration storage wrapper GNOME uses (and the CLI tool with the same name). GSettings follow a specific schema, i.e. a description of a schema name and possible keys and values for each key. You can list all those, set them, look up the available values, etc.:


    $ gsettings list-recursively
    ... lots of output ...
    $ gsettings set org.gnome.desktop.peripherals.touchpad click-method 'areas'
    $ gsettings range org.gnome.desktop.peripherals.touchpad click-method
    enum
    'default'
    'none'
    'areas'
    'fingers'

Now, schemas work fine as-is as long as there is only one instance. Where the same schema is used for different devices (like touchscreens) we use a so-called "relocatable schema" and that requires also specifying a path - and this is where it gets tricky. I'm not aware of any functionality to get the specific path for a relocatable schema so often it's down to reading the source. In the case of touchscreens, the path includes the USB vendor and product ID (in lowercase), e.g. in my case the path is:

  /org/gnome/desktop/peripherals/touchscreens/04f3:2d4a/

In your case you can get the touchscreen details from lsusb, libinput record, /proc/bus/input/devices, etc. Once you have it, gsettings takes a schema:path argument like this:

  $ gsettings list-recursively org.gnome.desktop.peripherals.touchscreen:/org/gnome/desktop/peripherals/touchscreens/04f3:2d4a/
  org.gnome.desktop.peripherals.touchscreen output ['', '', '']

Looks like the touchscreen is bound to no monitor. Let's bind it with the data from above:

 
   $ gsettings set org.gnome.desktop.peripherals.touchscreen:/org/gnome/desktop/peripherals/touchscreens/04f3:2d4a/ output "['DEL', 'DELL S2722QC', '59PKLD3']"

Note the quotes so your shell doesn't misinterpret things.

And that's it. Now I have my internal touchscreen mapped to my external monitor which makes no sense at all but shows that you can map a touchscreen to any screen if you want to.

[1] Probably the one that most commonly takes effect since it's the vast vast majority of devices

New gitlab.freedesktop.org 🚯 emoji-based spamfighting abilities

2024-01-29T17:58:00.003+10:00

This is a follow-up from our Spam-label approach, but this time with MOAR EMOJIS because that's what the world is turning into.

Since March 2023 projects could apply the "Spam" label on any new issue and have a magic bot come in and purge the user account plus all issues they've filed, see the earlier post for details. This works quite well and gives every project member the ability to quickly purge spam. Alas, pesky spammers are using other approaches to trick google into indexing their pork [1] (because at this point I think all this crap is just SEO spam anyway). Such as commenting on issues and merge requests. We can't apply labels to comments, so we found a way to work around that: emojis!

In GitLab you can add "reactions" to issue/merge request/snippet comments and in recent GitLab versions you can register for a webhook to be notified when that happens. So what we've added to the gitlab.freedesktop.org instance is support for the :do_not_litter: (🚯) emoji [2] - if you set that on an comment the author of said comment will be blocked and the comment content will be removed. After some safety checks of course, so you can't just go around blocking everyone by shotgunning emojis into gitlab. Unlike the "Spam" label this does not currently work recursively so it's best to report the user so admins can purge them properly - ideally before setting the emoji so the abuse report contains the actual spam comment instead of the redacted one. Also note that there is a 30 second grace period to quickly undo the emoji if you happen to set it accidentally.

Note that for purging issues, the "Spam" label is still required, the emojis only work for comments.

Happy cleanup!

[1] or pork-ish
[2] Benjamin wanted to use :poop: but there's a chance that may get used for expressing disagreement with the comment in question

Xorg being removed. What does this mean?

2023-12-14T14:13:00.002+10:00

You may have seen the news that Red Hat Enterprise Linux 10 plans to remove Xorg. But Xwayland will stay around, and given the name overloading and them sharing a git repository there's some confusion over what is Xorg. So here's a very simple "picture". This is the xserver git repository:

$ tree -d -L 2 xserver
xserver
├── composite
├── config
├── damageext
├── dbe
├── dix
├── doc
│   └── dtrace
├── dri3
├── exa
├── fb
├── glamor
├── glx
├── hw
│   ├── kdrive
│   ├── vfb
│   ├── xfree86              <- this one is Xorg
│   ├── xnest
│   ├── xquartz
│   ├── xwayland
│   └── xwin
├── include
├── m4
├── man
├── mi
├── miext
│   ├── damage
│   ├── rootless
│   ├── shadow
│   └── sync
├── os
├── present
├── pseudoramiX
├── randr
├── record
├── render
├── test
│   ├── bigreq
│   ├── bugs
│   ├── damage
│   ├── scripts
│   ├── sync
│   ├── xi1
│   └── xi2
├── Xext
├── xfixes
├── Xi
└── xkb

The git repo produces several X servers, including the one designed to run on bare metal: Xorg (in hw/xfree86 for historical reasons). The other hw directories are the other X servers including Xwayland. All the other directories are core X server functionality that's shared between all X servers [1]. Removing Xorg from a distro but keeping Xwayland means building with --disable-xfree86 -enable-xwayland [1]. That's simply it (plus the resulting distro packaging work of course).

Removing Xorg means you need something else that runs on bare metal and that is your favourite Wayland compositor. Xwayland then talks to that while presenting an X11-compatible socket to existing X11 applications.

Of course all this means that the X server repo will continue to see patches and many of those will also affect Xorg. For those who are running git master anyway. Don't get your hopes up for more Xorg releases beyond the security update background noise [2].

Xwayland on the other hand is actively maintained and will continue to see releases. But those releases are a sequence [1] of

$ git new-branch xwayland-23.x.y
$ git rm hw/{kdrive/vfb/xfree86/xnest,xquartz,xwin}
$ git tag xwayland-23.x.y

In other words, an Xwayland release is the xserver git master branch with all X servers but Xwayland removed. That's how Xwayland can see new updates and releases without Xorg ever seeing those (except on git master of course). And that's how your installed Xwayland has code from 2023 while your installed Xorg is still stuck on the branch created and barely updated after 2021.

I hope this helps a bit with the confusion of the seemingly mixed messages sent when you see headlines like "Xorg is unmaintained", "X server patches to fix blah", "Xorg is abandoned", "new Xwayland release.

[1] not 100% accurate but close enough
[2] historically an Xorg release included all other X servers (Xquartz, Xwin, Xvfb, ...) too so this applies to those servers too unless they adopt the Xwayland release model

PSA: For Xorg GNOME sessions, use the xf86-input-wacom driver for your tablets

2023-11-10T13:22:00.001+10:00

TLDR: see the title of this blog post, it's really that trivial.

Now that ~~Godot~~Wayland has been coming for ages and all new development focuses on a pile of software that steams significantly less, we're seeing cracks appear in the old Xorg support. Not intentionally, but there's only so much time that can be spent on testing and things that are more niche fall through. One of these was a bug I just had the pleasure of debugging and was triggered by GNOME on Xorg user using the xf86-input-libinput driver for tablet devices.

On the surface of it, this should be fine because libinput (and thus xf86-input-libinput) handles tablets just fine. But libinput is the new kid on the block. The old kid on said block is the xf86-input-wacom driver, older than libinput by slightly over a decade. And oh man, history has baked things into the driver that are worse than raisins in apple strudel [1].

The xf86-input-libinput driver was written as a wrapper around libinput and makes use of fancy things that (from libinput's POV) have always been around: things like input device hotplugging. Fancy, I know. For tablet devices the driver creates an X device for each new tool as it comes into proximity first. Future events from that tool will go through that device. A second tool, be it a new pen or the eraser on the original pen, will create a second X device and events from that tool will go through that X device. Configuration on any device will thus only affect that particular pen. Almost like the whole thing makes sense.

The wacom driver of course doesn't do this. It pre-creates X devices for some possible types of tools (pen, eraser, and cursor [2] but not airbrush or artpen). When a tool goes into proximity the events are sent through the respective device, i.e. all pens go through the pen tool, all erasers through the eraser tool. To actually track pens there is the "Wacom Serial IDs" property that contains the current tool's serial number. If you want to track multiple tools you need to query the property on proximity in [4]. At the time this was within a reasonable error margin of a good idea.

Of course and because MOAR CONFIGURATION! will save us all from the great filter you can specify the "ToolSerials" xorg.conf option as e.g. "airbrush;12345;artpen" and get some extra X devices pre-created, in this case a airbrush and artpen X device and an X device just for the tool with the serial number 12345. All other tools multiplex through the default devices. Again, at the time this was a great improvement. [5]

Anyway, where was I? Oh, right. The above should serve as a good approximation of a reason why the xf86-input-libinput driver does not try to be fullly compatible to the xf86-input-wacom driver. In everyday use these things barely matter [6] but for the desktop environment which needs to configure these devices all these differences mean multiple code paths. Those paths need to be tested but they aren't, so things fall through the cracks.

So quite a while ago, we made the decision that until Xorg goes dodo, the xf86-input-wacom driver is the tablet driver to use in GNOME. So if you're using a GNOME on Xorg session [7], do make sure the xf86-input-wacom driver is installed. It will make both of us happier and that's a good aim to strive for.

[1] It's just a joke. Put the pitchforks down already.
[2] The cursor is the mouse-like thing Wacom sells. Which is called cursor [3] because the English language has a limited vocabulary and we need to re-use words as much as possible lest we run out of them.
[3] It's also called puck. Because [2].
[4] And by "query" I mean "wait for the XI2 event notifying you of a property change". Because of lolz the driver cannot update the property on proximity in but needs to schedule that as idle func so the property update for the serial always arrives at some unspecified time after the proximity in but hopefully before more motion events happen. Or not, and that's how hope dies.
[5] Think about this next time someone says they long for some unspecified good old days.
[6] Except the strip axis which on the wacom driver is actually a bit happily moving left/right as your finger moves up/down on the touch strip and any X client needs to know this. libinput normalizes this to...well, a normal value but now the X client needs to know which driver is running so, oh deary deary.
[7] e.g because your'e stockholmed into it by your graphics hardware

gitlab.freedesktop.org now has a bugbot for automatic issue/merge request processing

2023-07-03T16:34:00.000+10:00

As of today, gitlab.freedesktop.org provides easy hooks to invoke the gitlab-triage tool for your project. gitlab-triage allows for the automation of recurring tasks, for example something like

If the label FOO is set, close the issue and add a comment containing ".... blah ..."

Many project have recurring tasks like this, e.g. the wayland project gets a lot of issues that are compositor (not protocol) issues. Being able to just set a label and have things happen is much more convenient than having to type out the same explanations over and over again.

The goal for us was to provide automated handling for these with as little friction as possible. And of course each project must be able to decide what actions should be taken. Usually gitlab-triage is run as part of project-specific scheduled pipelines but since we already have webhook-based spam-fighting tools we figured we could make this even easier.

So, bugbot was born. Any project registered with bugbot can use labels prefixed with "bugbot::" to have gitlab-triage invoked against the project's policies file. These labels thus serve as mini-commands for bugbot, though each project decides what happens for any particular label. bugbot effectively works like this:

sleep 30
for label in {issue|merge_request}.current_labels:
  if label.startswith("bugbot::"):
     wget https://gitlab.freedesktop.org/foo/bar/-/raw/{main|master}/.triage-policies.yml
     run-gitlab-triage --as-user @bugbot --use-file .triage-policies.yml
     break

And this is triggered on every issue/merge request update for any registered project which means that all you need to do is set the label and you're done. The things of note here:

bugbot delays by 30 seconds, giving you time to unset an accidentally applied label before it takes effect
bugbot doesn't care about the label beyond the (hard-coded) "bugbot::" prefix
bugbot always runs your project's triage policies, from your main or master branch (whichever succeeds first)
The actions are performed as the bugbot user, not your user

The full documentation of what you can do in a policies file is available at the gitlab-triage documentation but let's look at a simple example that shouldn't even need explanation:

resource_rules:
  issues:
    rules:
      - name: convert bugbot label to other label
        conditions:
          labels:
            - "bugbot::foo"
        actions:
          labels:
            - "foo"
          remove_labels:
            - "bugbot::foo"
          comment: |
            Nice label you have there. Would be a shame 
            if someone removed it
          status: "close"
  merge_requests:
    rules:
      []

And the effect of this file can be seen in this issue here.

Registering a project

Bugbot is part of the damspam project and registering a project can be done with a single command. Note: this can only be done by someone with the Maintainer role or above.

Create a personal access token with API access and save the token value as $XDG_CONFIG_HOME/bugbot/user.token Then run the following commands with your project's full path (e.g. mesa/mesa, pipewire/wireplumber, xorg/lib/libX11):

$ pip install git+https://gitlab.freedesktop.org/freedesktop/damspam
$ bugbot request-webhook foo/bar

After this you may remove the token file and the package

$ pip uninstall damspam
$ rm $XDG_CONFIG_HOME/bugbot/user.token

The bugbot command will file an issue in the freedesktop/fdo-bots repository. This issue will be automatically processed and should be done by the time you finish the above commands, see this issue for an example. Note: the issue processing requires a git push to an internal repo - if you script this for multiple repos please put a sleep(30) in to avoid conflicts.

Adding triage policies

Once registered, the .triage-policies.yml file must be added to the root directory of your project. What bugbot commands you want to respond to (and the actions to take) is up to you, though there are two things of note: you should always remove the bugbot label you are reacting to to avoid duplicate processing and gitlab-triage does not create new labels. So any label in your actions must be manually created in the project first. Beyond that - the sky's your limit.

Remember you can test your policies file with

 $ gitlab-triage --dry-run --token $GITLAB_TOKEN \
   --source-id foo/bar  --resource-reference 1234

As usual, many thanks to Benjamin Tissoires for reviews and the magic of integrating this into infrastructure.

snegg - Python bindings for libei

2023-06-06T19:36:00.001+10:00

After what was basically a flurry of typing, the snegg Python bindings for libei are now available. This is a Python package that provides bindings to the libei/libeis/liboeffis C libraries with a little bit of API improvement to make it not completely terrible. The main goal of these bindings (at least for now) is to provide some quick and easy way to experiment with what could possibly be done using libei - both server-side and client-side. [1] The examples directory has a minimal EI client (with portal support via liboeffis) and a minimal EIS implementation. The bindings are still quite rough and the API is nowhere near stable.

A proper way to support EI in Python would be to implement the protocol directly - there's no need for the C API quirkiness this way and you can make full use of things like async and whatnot. If you're interested in that, get in touch! Meanwhile, writing something roughly resemling xdotool is probably only a few hundred lines of python code. [2]

[1] writing these also exposed a few bugs in libei itself so I'm happy 1.0 wasn't out just yet
[2] at least the input emulation parts of xdotool

libei and a fancy protocol

2023-05-09T10:51:00.003+10:00

libei is the library for Emulated Input - see this post for an introduction. Like many projects, libei was started when it was still unclear if it could be the right solution to the problem. In the years (!) since, we've upgraded the answer to that question from "hopefully" to "yeah, I reckon" - doubly so since we added support for receiver contexts and got InputLeap working through the various portal changes.

Emulating or capturing input needs two processes to communicate for obvious reasons so the communication protocol is a core part of it. But initially, libei was a quickly written prototype and the protocol was hacked up on an as-needed let's-get-this-working basis. The rest of the C API got stable enough but the protocol was the missing bit. Long-term the protocol must be stable - without a stable protocol updating your compositor may break all flatpaks still shipping an older libei. Or updating a flatpak may not work with an older compositor. So in the last weeks/months, a lot of work as gone into making the protocol stable. This consisted of two parts: drop protobuf and make the variuos features interface-dependent, unashamedly quite like the Wayland protocol which is also split into a number of interfaces that can be independently versioned. Initially, I attempted to make the protocol binary compatible with Wayland but dropped that goal eventually - the benefits were minimal and the effort and limitations (due to different requirements) were quite significant.

The protocol is defined in a single XML file and can be used directly from language bindings (if any). The protocol documentation is quite extensive but it's relatively trivial in principal: the first 8 bytes of each message are the object ID, then we have 4 bytes for the message length in bytes, then 4 for the object-specific opcode. That opcode is one of the requests or events in the object's interface - which is defined at object creation time. Unlike Wayland, the majority of objects in libei are created in server-side (the EIS implementation decides which seats are available and which devices in those seats). The remainder of the message are the arguments. Note that unlike other protocols the message does not carry a signature - prior knowledge of the message is required to parse the arguments. This is a direct effect of initially making it wayland-compatible and I didn't really find it worth the effort to add this.

Anyway, long story short: swapping the protocol out didn't initially have any effect on the C library but with the changes came some minor updates to remove some of the warts in the API. Perhaps the biggest change is that the previous capabilities of a device are now split across several interfaces. Your average mouse-like emulated device will have the "pointer", "button" and "scroll" interfaces, or maybe the "pointer_absolute", "button" and "scroll" interface. The touch and keyboard interfaces were left as-is. Future interfaces will likely include gestures and tablet tools, I have done some rough prototyping locally and it will fit in nicely enough with the current protocol.

At the time of writing, the protocol is not officialy stable but I have no intention of changing it short of some bug we may discover. Expect libei 1.0 very soon.

New gitlab.freedesktop.org spamfighting abilities

2023-03-28T19:30:00.001+10:00

As of today, gitlab.freedesktop.org allows anyone with a GitLab Developer role or above to remove spam issues. If you are reading this article a while after it's published, it's best to refer to the damspam README for up-to-date details. I'm going to start with the TLDR first.

For Maintainers

Create a personal access token with API access and save the token value as $XDG_CONFIG_HOME/damspam/user.token Then run the following commands with your project's full path (e.g. mesa/mesa, pipewire/wireplumber, xorg/lib/libX11):

$ pip install git+https://gitlab.freedesktop.org/freedesktop/damspam
$ damspam request-webhook foo/bar
# clean up, no longer needed.
$ pip uninstall damspam
$ rm $XDG_CONFIG_HOME/damspam/user.token

The damspam command will file an issue in the freedesktop/fdo-bots repository. This issue will be automatically processed by a bot and should be done by the time you finish the above commands, see this issue for an example. Note: the issue processing requires a git push to an internal repo - if you script this for multiple repos please put a sleep(30) in to avoid conflicts.

Once the request has been processed (and again, this should be instant), any issue in your project that gets assigned the label Spam will be processed automatically by damspam. See the next section for details.

For Developers

Once the maintainer for your project has requested the webhook, simply assign the Spam label to any issue that is spam. The issue creator will be blocked (i.e. cannot login), this issue and any other issue filed by the same user will be closed and made confidential (i.e. they are no longer visible to the public). In the future, one of the GitLab admins can remove that user completely but meanwhile, they and their spam are gone from the public eye and they're blocked from producing more. This should happen within seconds of assigning the Spam label.

For GitLab Admins

Create a personal access token with API access for the @spambot user and save the token value as $XDG_CONFIG_HOME/damspam/spambot.token. This is so you can operate as spambot instead of your own user. Then run the following command to remove all tagged spammers:

$ pip install git+https://gitlab.freedesktop.org/freedesktop/damspam
$ damspam purge-spammers

The last command will list any users that are spammers (together with an issue that should make it simple to check whether it is indeed spam) and after interactive confirmation purge them as requested. At the time of writing, the output looks like this:

$ damspam purge-spammers
0: naughtyuser              : https://gitlab.freedesktop.org/somenamespace/project/-/issues/1234: [STREAMING@TV]!* LOOK AT ME
1: abcuseless               : https://gitlab.freedesktop.org/somenamespace/project/-/issues/4567: ((@))THIS STREAM IS IMPORTANT
2: anothergit               : https://gitlab.freedesktop.org/somenamespace/project/-/issues/8778: Buy something, really
3: whatawasteofalife        : https://gitlab.freedesktop.org/somenamespace/project/-/issues/9889: What a waste of oxygen I am
Purging a user means a full delete including all issues, MRs, etc. This is nonrecoverable!
Please select the users to purge:
[q]uit, purge [a]ll, or the index:

Purging the spammers will hard-delete them and remove anything they ever did on gitlab. This is irreversible.

How it works

There are two components at play here: hookiedookie, a generic webhook dispatcher, and damspam which handles the actual spam issues. Hookiedookie provides an HTTP server and "does things" with JSON data on request. What it does is relatively generic (see the Settings.yaml example file) but it's set up to be triggered by a GitLab webhook and thus receives this payload. For damspam the rules we have for hookiedookie come down to something like this: if the URL is "webhooks/namespace/project" and damspam is set up for this project and the payload is an issue event and it has the "Spam" label in the issue labels, call out to damspam and pass the payload on. Other rules we currently use are automatic reload on push events or the rule to trigger the webhook request processing bot as above.

This is also the reason a maintainer has to request the webhook. When the request is processed, the spambot installs a webhook with a secret token (a uuid) in the project. That token will be sent as header (a standard GitLab feature). The project/token pair is also added to hookiedookie and any webhook data must contain the project name and matching token, otherwise it is discarded. Since the token is write-only, no-one (not even the maintainers of the project) can see it.

damspam gets the payload forwarded but is otherwise unaware of how it is invoked. It checks the issue, fetches the data needed, does some safety check and if it determines that yes, this is spam, then it closes the issue, makes it confidential, blocks the user and then recurses into every issue this user ever filed. Not necessarily in that order. There are some safety checks, so you don't have to worry about it suddenly blocking every project member.

Why?

For a while now, we've suffered from a deluge of spam (and worse) that makes it through the spam filters. GitLab has a Report Abuse feature for this but it's... woefully incomplete. The UI guides users to do the right thing - as reporter you can tick "the user is sending spam" and it automatically adds a link to the reported issue. But: none of this useful data is visible to admins. Seriously, look at the official screenshots. There is no link to the issue, all you get is a username, the user that reported it and the content of a textbox that almost never has any useful information. The link to the issue? Not there. The selection that the user is a spammer? Not there.

For an admin, this is frustrating at best. To verify that the user is indeed sending spam, you have to find the issue first. Which, at best, requires several clicks and digging through the profile activities. At worst you know that the user is a spammer because you trust the reporter but you just can't find the issue for whatever reason.

But even worse: reporting spam does nothing immediately. The spam stays up until an admin wakes up, reviews the abuse reports and removes that user. Meanwhile, the spammer can happily keep filing issues against the project. Overall, it is not a particularly great situation.

With hookiedookie and damspam, we're now better equipped to stand against the tide of spam. Anyone who can assign labels can help fight spam and the effect is immediate. And it's - for our use-cases - safe enough: if you trust someone to be a developer on your project, we can trust them to not willy-nilly remove issues pretending they're spam. In fact, they probably could've deleted issues beforehand already anyway if they wanted to make them disappear.

Other instances

While we're definitely aiming at gitlab.freedesktop.org, there's nothing in particular that requires this instance. If you're the admin for a public gitlab instance feel free to talk to Benjamin Tissoires or me to check whether this could be useful for you too, and what changes would be necessary.

libinput and the custom pointer acceleration function

2023-01-17T14:47:00.001+10:00

After 8 months of work by Yinon Burgansky, libinput now has a new pointer acceleration profile: the "custom" profile. This profile allows users to tweak the exact response of their device based on their input speed.

A short primer: the pointer acceleration profile is a function that multiplies the incoming deltas with a given factor F, so that your input delta (x, y) becomes (Fx, Fy). How this is done is specific to the profile, libinput's existing profiles had either a flat factor or an adaptive factor that roughly resembles what Xorg used to have, see the libinput documentation for the details. The adaptive curve however has a fixed behaviour, all a user could do was scale the curve up/down, but not actually adjust the curve.

Input speed to output speed

The new custom filter allows exactly that: it allows a user to configure a completely custom ratio between input speed and output speed. That ratio will then influence the current delta. There is a whole new API to do this but simplified: the profile is defined via a series of points of (x, f(x)) that are linearly interpolated. Each point is defined as input speed in device units/ms to output speed in device units/ms. For example, to provide a flat acceleration equivalent, specify [(0.0, 0.0), (1.0, 1.0)]. With the linear interpolation this is of course a 45-degree function, and any incoming speed will result in the equivalent output speed.

Noteworthy: we are talking about the speed here, not any individual delta. This is not exactly the same as the flat acceleration profile (which merely multiplies the deltas by a constant factor) - it does take the speed of the device into account, i.e. device units moved per ms. For most use-cases this is the same but for particularly slow motion, the speed may be calculated across multiple deltas (e.g. "user moved 1 unit over 21ms"). This avoids some jumpyness at low speeds.

But because the curve is speed-based, it allows for some interesting features too: the curve [(0.0, 1.0), (1.0, 1.0)] is a horizontal function at 1.0. Which means that any input speed results in an output speed of 1 unit/ms. So regardless how fast the user moves the mouse, the output speed is always constant. I'm not immediately sure of a real-world use case for this particular case (some accessibility needs maybe) but I'm sure it's a good prank to play on someone.

Because libinput is written in C, the API is not necessarily immediately obvious but: to configure you pass an array of (what will be) y-values and set the step-size. The curve then becomes: [(0 * step-size, array[0]), (1 * step-size, array[1]), (2 * step-size, array[2]), ...]. There are some limitations on the number of points but they're high enough that they should not matter.

Note that any curve is still device-resolution dependent, so the same curve will not behave the same on two devices with different resolution (DPI). And since the curves uploaded by the user are hand-polished, the speed setting has no effect - we cannot possibly know how a custom curve is supposed to scale. The setting will simply update with the provided value and return that but the behaviour of the device won't change in response.

Motion types

Finally, there's another feature in this PR - the so-called "movement type" which must be set when defining a curve. Right now, we have two types, "fallback" and "motion". The "motion" type applies to, you guessed it, pointer motion. The only other type available is fallback which applies to everything but pointer motion. The idea here is of course that we can apply custom acceleration curves for various different device behaviours - in the future this could be scrolling, gesture motion, etc. And since those will have a different requirements, they can be configure separately.

How to use this?

As usual, the availability of this feature depends on your Wayland compositor and how this is exposed. For the Xorg + xf86-input-libinput case however, the merge request adds a few properties so that you can play with this using the xinput tool:

  # Set the flat-equivalent function described above
  $ xinput set-prop "devname" "libinput Accel Custom Motion Points" 0.0 1.0
  # Set the step, i.e. the above points are on 0 u/ms, 1 u/ms, ...
  # Can be skipped, 1.0 is the default anyway
  $ xinput set-prop "devname" "libinput Accel Custom Motion Points" 1.0 
  # Now enable the custom profile
  $ xinput set-prop "devname" "libinput Accel Profile Enabled" 0 0 1

The above sets a custom pointer accel for the "motion" type. Setting it for fallback is left as an exercise to the reader (though right now, I think the fallback curve is pretty much only used if there is no motion curve defined).

Happy playing around (and no longer filing bug reports if you don't like the default pointer acceleration ;)

Availability

This custom profile will be available in libinput 1.23 and xf86-input-libinput-1.3.0. No release dates have been set yet for either of those.

X servers no longer allow byte-swapped clients (by default)

2023-01-06T12:15:00.003+10:00

In the beginning, there was the egg. Then fictional people started eating that from different ends, and the terms of "little endians" and "Big Endians" was born.

Computer architectures (mostly) come with one of either byte order: MSB first or LSB first. The two are incompatible of course, and many a bug was introduced trying to convert between the two (or, more common: failing to do so). The two byte orders were termed Big Endian and little endian, because that hilarious naming scheme at least gives us something to laugh about while contemplating throwing it all away and considering a future as, I don't know, a strawberry plant.

Back in the mullet-infested 80s when the X11 protocol was designed both little endian and big endian were common enough. And back then running the X server on a different host than the client was common too - the X terminals back then had less processing power than a smart toilet seat today so the cpu-intensive clients were running on some mainfraime. To avoid overtaxing the poor mainframe already running dozens of clients for multiple users, the job of converting between the two byte orders was punted to the X server. So to this day whenever a client connects, the first byte it sends is a literal "l" or "B" to inform the server of the client's byte order. Where the byte order doesn't match the X server's byte order, the client is a "swapped client" in X server terminology and all 16, 32, and 64-bit values must be "byte-swapped" into the server's byte order. All of those values in all requests, and then again back to the client's byte order in all outgoing replies and events. Forever, till a crash do them part.

If you get one of those wrong, the number is no longer correct. And it's properly wrong too, the difference between 0x1 and 0x01000000 is rather significant. [0] Which has the hilarious side-effect of... well, pretty much anything. But usually it ranges from crashing the server (thus taking all other clients down in commiseration) to leaking random memory locations. The list of security issues affecting the various SProcFoo implementations (X server naming scheme for Swapped Procedure for request Foo) is so long that I'm too lazy to pull out the various security advisories and link to them. Just believe me, ok? *jedi handwave*

These days, encountering a Big Endian host is increasingly niche, letting it run an X client that connects to your local little-endian X server is even more niche [1]. I think the only regular real-world use-case for this is running X clients on an s390x, connecting to your local intel-ish (and thus little endian) workstation. Not something most users do on a regular basis. So right now, the byte-swapping code is mainly a free attack surface that 99% of users never actually use for anything real. So... let's not do that?

I just merged a PR into the X server repo that prohibits byte-swapped clients by default. A Big Endian client connecting to an X server will fail the connection with an error message of "Prohibited client endianess, see the Xserver man page". [2] Thus, a whole class of future security issues avoided - yay!

For the use-cases where you do need to let Big Endian clients connect to your little endian X server, you have two options: start your X server (Xorg, Xwayland, Xnest, ...) with the +byteswappedclients commandline option. Alternatively, and this only applies for Xorg: add Option "AllowByteSwappedClients" "on" to the xorg.conf ServerFlags section. Both of these will change the default back to the original setting. Both are documented in the Xserver(1) and xorg.conf(5) man pages, respectively.

Now, there's a drawback: in the Wayland stack, the compositor is in charge of starting Xwayland which means the compositor needs to expose a way of passing +byteswappedclients to Xwayland. This is compositor-specific, bugs are filed for mutter (merged for GNOME 44), kwin and wlroots. Until those are addressed, you cannot easily change this default (short of changing /usr/bin/Xwayland into a wrapper script that passes the option through).

There's no specific plan yet which X releases this will end up in, primarily because the release cycle for X is...undefined. Probably xserver-23.0 if and when that happens. It'll probably find its way into the xwayland-23.0 release, if and when that happens. Meanwhile, distributions interested in this particular change should consider backporting it to their X server version. This has been accepted as a Fedora 38 change.

[0] Also, it doesn't help that much of the X server's protocol handling code was written with the attitude of "surely the client wouldn't lie about that length value"
[1] little-endian client to Big Endian X server is so rare that it's barely worth talking about. But suffice to say, the exact same applies, just with little and big swapped around.
[2] That message is unceremoniously dumped to stderr, but that bit is unfortunately a libxcb issue.

libei - opening the portal doors

2022-12-13T13:24:00.000+10:00

Time for another status update on libei, the transport layer for bouncing emulated input events between applications and Wayland compositors [1]. And this time it's all about portals and how we're about to use them for libei communication. I've hinted at this in the last post, but of course you're forgiven if you forgot about this in the... uhm.. "interesting" year that was 2022. So, let's recap first:

Our basic premise is that we want to emulate and/or capture input events in the glorious new world that is Wayland (read: where applications can't do whatever they want, whenever they want). libei is a C library [0] that aims to provide this functionality. libei supports "sender" and "receiver" contexts and that just specifies which way the events will flow. A sender context (e.g. xdotool) will send emulated input events to the compositor, a "receiver" context will - you'll never guess! - receive events from the compositor. If you have the InputLeap [2] use-case, the server-side will be a receiver context, the client side a sender context. But libei is really just the transport layer and hasn't had that many changes since the last post - most of the effort was spent on trying to figure out how to exchange the socket between different applications. And for that, we have portals!

RemoteDesktop

In particular, we have a PR for the RemoteDesktop portal to add that socket exchange. In particular, once a RemoteDesktop session starts your application can request an EIS socket and send input events over that. This socket supersedes the current NotifyButton and similar DBus calls and removes the need for the portal to stay in the middle - the application and compositor now talk directly to each other. The compositor/portal can still close the session at any time though, so all the benefits of a portal stay there. The big advantage of integrating this into RemoteDesktop is that the infrastructucture for that is already mostly in place - once your compositor adds the bits for the new ConnectToEIS method you get all the other pieces for free. In GNOME this includes a visual indication that your screen is currently being remote-controlled, same as from a real RemoteDesktop session.

Now, talking to the RemoteDesktop portal is nontrivial simply because using DBus is nontrivial, doubly so for the way how sessions and requests work in the portals. To make this easier, libei 0.4.1 now includes a new library "liboeffis" that enables your application to catch the DBus. This library has a very small API and can easily be integrated with your mainloop (it's very similar to libei). We have patches for Xwayland to use that and it's really trivial to use. And of course, with the other Xwayland work we already had this means we can really run xdotool through Xwayland to connect through the XDG Desktop Portal as a RemoteDesktop session and move the pointer around. Because, kids, remember, uhm, Unix is all about lots of separate pieces.

InputCapture

On to the second mode of libei - the receiver context. For this, we also use a portal but a brand new one: the InputCapture portal. The InputCapture portal is the one to use to decide when input events should be captured. The actual events are then sent over the EIS socket.

Right now, the InputCapture portal supports PointerBarriers - virtual lines on the screen edges that, once crossed, trigger input capture for a capability (e.g. pointer + keyboard). And an application's basic approach is to request a (logical) representation of the available desktop areas ("Zones") and then set up pointer barriers at the edge(s) of those Zones. Get the EIS connection, Enable() the session and voila - the compositor will (hopefully) send input events when the pointer crosses one of those barriers. Once that happens you'll get a DBus signal in InputCapture and the events will start flowing on the EIS socket. The portal itself doesn't need to sit in the middle, events go straight to the application. The portal can still close the session anytime though. And the compositor can decide to stop capturing events at any time.

There is actually zero Wayland-y code in all this, it's display-system acgnostic. So anyone with too much motivation could add this to the X server too. Because that's what the world needs...

The (currently) bad news is that this needs to be pulled into a lot of different repositories. And everything needs to get ready before it can be pulled into anything to make sure we don't add broken API to any of those components. But thanks to a lot of work by Olivier Fourdan, we have this mostly working in InputLeap (tbh the remaining pieces are largely XKB related, not libei-related). Together with the client implementation (through RemoteDesktop) we can move pointers around like in the InputLeap of old (read: X11).

Our current goal is for this to be ready for GNOME 45/Fedora 39.

[0] eventually a protocol but we're not there yet
[1] It doesn't actually have to be a compositor but that's the prime use-case, so...
[2] or barrier or synergy. I'll stick with InputLeap for this post

The new XWAYLAND extension is available

2022-08-11T16:50:00.004+10:00

As of xorgproto 2022.2, we have a new X11 protocol extension. First, you may rightly say "whaaaat? why add new extensions to the X protocol?" in a rather unnecessarily accusing way, followed up by "that's like adding lipstick to a dodo!". And that's not completely wrong, but nevertheless, we have a new protocol extension to the ... [checks calendar] almost 40 year old X protocol. And that extension is, ever creatively, named "XWAYLAND".

If you recall, Xwayland is a different X server than Xorg. It doesn't try to render directly to the hardware, instead it's a translation layer between the X protocol and the Wayland protocol so that X clients can continue to function on a Wayland compositor. The X application is generally unaware that it isn't running on Xorg and Xwayland (and the compositor) will do their best to accommodate for all the quirks that the application expects because it only speaks X. In a way, it's like calling a restaurant and ordering a burger because the person answering speaks American English. Without realising that you just called the local fancy French joint and now the chefs will have to make a burger for you, totally without avec.

Anyway, sometimes it is necessary for a client (or a user) to know whether the X server is indeed Xwayland. Previously, this was done through heuristics: the xisxwayland tool checks for XRandR properties, the xinput tool checks for input device names, and so on. These heuristics are just that, though, so they can become unreliable as Xwayland gets closer to emulating Xorg or things just change. And properties in general are problematic since they could be set by other clients. To solve this, we now have a new extension.

The XWAYLAND extension doesn't actually do anything, it's the bare minimum required for an extension. It just needs to exist and clients only need to XQueryExtension or check for it in XListExtensions (the equivalent to xdpyinfo | grep XWAYLAND). Hence, no support for Xlib or libxcb is planned. So of all the nightmares you've had in the last 2 years, the one of misidentifying Xwayland will soon be in the past.

libei - adding support for passive contexts

2022-03-04T14:30:00.002+10:00

A quick reminder: libei is the library for emulated input. It comes as a pair of C libraries, libei for the client side and libeis for the server side.

libei has been sitting mostly untouched since the last status update. There are two use-cases we need to solve for input emulation in Wayland - the ability to emulate input (think xdotool, or Synergy/Barrier/InputLeap client) and the ability to capture input (think Synergy/Barrier/InputLeap server). The latter effectively blocked development in libei [1], until that use-case was sorted there wasn't much point investing too much into libei - after all it may get thrown out as a bad idea. And epiphanies were as elusive like toilet paper and RATs, so nothing much get done. This changed about a week or two ago when the required lightbulb finally arrived, pre-lit from the factory.

So, the solution to the input capturing use-case is going to be a so-called "passive context" for libei. In the traditional [2] "active context" approach for libei we have the EIS implementation in the compositor and a client using libei to connect to that. The compositor sets up a seat or more, then some devices within that seat that typically represent the available screens. libei then sends events through these devices, causing input to be appear in the compositor which moves the cursor around. In a typical and simple use-case you'd get a 1920x1080 absolute pointer device and a keyboard with a $layout keymap, libei then sends events to position the cursor and or happily type away on-screen.

In the "passive context" <deja-vu> approach for libei we have the EIS implementation in the compositor and a client using libei to connect to that. The compositor sets up a seat or more, then some devices within that seat </deja-vu> that typically represent the physical devices connected to the host computer. libei then receives events from these devices, causing input to be generated in the libei client. In a typical and simple use-case you'd get a relative pointer device and a keyboard device with a $layout keymap, the compositor then sends events matching the relative input of the connected mouse or touchpad.

The two notable differences are thus: events flow from EIS to libei and the devices don't represent the screen but rather the physical [3] input devices.

This changes libei from a library for emulated input to an input event transport layer between two processes. On a much higher level than e.g. evdev or HID and with more contextual information (seats, devices are logically abstracted, etc.). And of course, the EIS implementation is always in control of the events, regardless which direction they flow. A compositor can implement an event filter or designate key to break the connection to the libei client. In pseudocode, the compositor's input event processing function will look like this:

  function handle_input_events():
     real_events = libinput.get_events()
     for e in real_events:
       if input_capture_active:
         send_event_to_passive_libei_client(e)
       else:
         process_event(e)
         
     emulated_events = eis.get_events_from_active_clients()
     for e in emulated_events:
          process_event(e)

Not shown here are the various appropriate filters and conversions in between (e.g. all relative events from libinput devices would likely be sent through the single relative device exposed on the EIS context). Again, the compositor is in control so it would be trivial to implement e.g. capturing of the touchpad only but not the mouse.

In the current design, a libei context can only be active or passive, not both. The EIS context is both, it's up to the implementation to disconnect active or passive clients if it doesn't support those.

Notably, the above only caters for the transport of input events, it doesn't actually make any decision on when to capture events. This handled by the CaptureInput XDG Desktop Portal [4]. The idea here is that an application like Synergy/Barrier/InputLeap server connects to the CaptureInput portal and requests a CaptureInput session. In that session it can define pointer barriers (left edge, right edge, etc.) and, in the future, maybe other triggers. In return it gets a libei socket that it can initialize a libei context from. When the compositor decides that the pointer barrier has been crossed, it re-routes the input events through the EIS context so they pop out in the application. Synergy/Barrier/InputLeap then converts that to the global position, passes it to the right remote Synergy/Barrier/InputLeap client and replays it there through an active libei context where it feeds into the local compositor.

Because the management of when to capture input is handled by the portal and the respective backends, it can be natively integrated into the UI. Because the actual input events are a direct flow between compositor and application, the latency should be minimal. Because it's a high-level event library, you don't need to care about hardware-specific details (unlike, say, the inputfd proposal from 2017). Because the negotiation of when to capture input is through the portal, the application itself can run inside a sandbox. And because libei only handles the transport layer, compositors that don't want to support sandboxes can set up their own negotiation protocol.

So overall, right now this seems like a workable solution.

[1] "blocked" is probably overstating it a bit but no-one else tried to push it forward, so..
[2] "traditional" is probably overstating it for a project that's barely out of alpha development
[3] "physical" is probably overstating it since it's likely to be a logical representation of the types of inputs, e.g. one relative device for all mice/touchpads/trackpoints
[4] "handled by" is probably overstating it since at the time of writing the portal is merely a draft of an XML file

The xf86-input-wacom driver hits 1.0

2022-02-15T15:24:00.002+10:00

After roughly 20 years and counting up to 0.40 in release numbers, I've decided to call the next version of the xf86-input-wacom driver the 1.0 release. [1] This cycle has seen a bulk of development (>180 patches) which is roughly as much as the last 12 releases together. None of these patches actually added user-visible features, so let's talk about technical dept and what turned out to be an interesting way of reducing it.

The wacom driver's git history goes back to 2002 and the current batch of maintainers (Ping, Jason and I) have all been working on it for one to two decades. It used to be a Wacom-only driver but with the improvements made to the kernel over the years the driver should work with most tablets that have a kernel driver, albeit some of the more quirky niche features will be more limited (but your non-Wacom devices probably don't have those features anyway).

The one constant was always: the driver was extremely difficult to test, something common to all X input drivers. Development is a cycle of restarting the X server a billion times, testing is mostly plugging hardware in and moving things around in the hope that you can spot the bugs. On a driver that doesn't move much, this isn't necessarily a problem. Until a bug comes along, that requires some core rework of the event handling - in the kernel, libinput and, yes, the wacom driver.

After years of libinput development, I wasn't really in the mood for the whole "plug every tablet in and test it, for every commit". In a rather caffeine-driven development cycle [2], the driver was separated into two logical entities: the core driver and the "frontend". The default frontend is the X11 one which is now a relatively thin layer around the core driver parts, primarily to translate events into the X Server's API. So, not unlike libinput + xf86-input-libinput in terms of architecture. In ascii-art:

                                             |
                     +--------------------+  |   big giant 
  /dev/input/event0->|  core driver | x11 |->|    X server
                     +--------------------+  |    process
                                             |

Now, that logical separation means we can have another frontend which I implemented as a relatively light GObject wrapper and is now a library creatively called libgwacom:

                                            
                     +-----------------------+  |
  /dev/input/event0->|  core driver | gwacom |--| tools or test suites
                     +-----------------------+  |

This isn't a public library or API and it's very much focused on the needs of the X driver so there are some peculiarities in there. What it allows us though is a new wacom-record tool that can hook onto event nodes and print the events as they come out of the driver. So instead of having to restart X and move and click things, you get this:

$ ./builddir/wacom-record
wacom-record:
  version: 0.99.2
  git: xf86-input-wacom-0.99.2-17-g404dfd5a
  device:
    path: /dev/input/event6
    name: "Wacom Intuos Pro M Pen"
  events:
    - source: 0
      event: new-device
      name: "Wacom Intuos Pro M Pen"
      type: stylus
      capabilities:
        keys: true
        is-absolute: true
        is-direct-touch: false
        ntouches: 0
        naxes: 6
        axes:
          - {type: x           , range: [    0, 44800], resolution: 200000}
          - {type: y           , range: [    0, 29600], resolution: 200000}
          - {type: pressure    , range: [    0, 65536], resolution:     0}
          - {type: tilt_x      , range: [  -64,    63], resolution:    57}
          - {type: tilt_y      , range: [  -64,    63], resolution:    57}
          - {type: wheel       , range: [ -900,   899], resolution:     0}
    ...
    - source: 0
      mode: absolute
      event: motion
      mask: [ "x", "y", "pressure", "tilt-x", "tilt-y", "wheel" ]
      axes: { x: 28066, y: 17643, pressure:    0, tilt: [ -4, 56], rotation:   0, throttle:   0, wheel: -108, rings: [  0,   0]

This is YAML which means we can process the output for comparison or just to search for things.

A tool to quickly analyse data makes for faster development iterations but it's still a far cry from reliable regression testing (and writing a test suite is a daunting task at best). But one nice thing about GObject is that it's accessible from other languages, including Python. So our test suite can be in Python, using pytest and all its capabilities, plus all the advantages Python has over C. Most of driver testing comes down to: create a uinput device, set up the driver with some options, push events through that device and verify they come out of the driver in the right sequence and format. I don't need C for that. So there's pull request sitting out there doing exactly that - adding a pytest test suite for a 20-year old X driver written in C. That this is a) possible and b) a lot less work than expected got me quite unreasonably excited. If you do have to maintain an old C library, maybe consider whether's possible doing the same because there's nothing like the warm fuzzy feeling a green tick on a CI pipeline gives you.

[1] As scholars of version numbers know, they make as much sense as your stereotypical uncle's facebook opinion, so why not.
[2] The Colombian GDP probably went up a bit

What's new in XI 2.4 - touchpad gestures

2021-09-23T15:26:00.003+10:00

After a nine year hiatus, a new version of the X Input Protocol is out. Credit for the work goes to Povilas Kanapickas, who also implemented support for XI 2.4 in the various pieces of the stack [0]. So let's have a look.

X has had touch events since XI 2.2 (2012) but those were only really useful for direct touch devices (read: touchscreens). There were accommodations for indirect touch devices like touchpads but they were never used. The synaptics driver set the required bits for a while but it was dropped in 2015 because ... it was complicated to make use of and no-one seemed to actually use it anyway. Meanwhile, the rest of the world moved on and touchpad gestures are now prevalent. They've been standard in MacOS for ages, in Windows for almost ages and - with recent GNOME releases - now feature prominently on the Linux desktop as well. They have been part of libinput and the Wayland protocol for years (and even recently gained a new set of "hold" gestures). Meanwhile, X was left behind in the dust or mud, depending on your local climate.

XI 2.4 fixes this, it adds pinch and swipe gestures to the XI2 protocol and makes those available to supporting clients [2]. Notably here is that the interpretation of gestures is left to the driver [1]. The server takes the gestures and does the required state handling but otherwise has no decision into what constitutes a gesture. This is of course no different to e.g. 2-finger scrolling on a touchpad where the server just receives scroll events and passes them on accordingly.

XI 2.4 gesture events are quite similar to touch events in that they are processed as a sequence of begin/update/end with both types having their own event types. So the events you will receive are e.g. XIGesturePinchBegin or XIGestureSwipeUpdate. As with touch events, a client must select for all three (begin/update/end) on a window. Only one gesture can exist at any time, so if you are a multi-tasking octopus prepare to be disappointed.

Because gestures are tied to an indirect-touch device, the location they apply at is wherever the cursor is currently positioned. In that, they work similar to button presses, and passive grabs apply as expected too. So long-term the window manager will likely want a passive grab on the root window for swipe gestures while applications will implement pinch-to-zoom as you'd expect.

In terms of API there are no suprises. libXi 1.8 is the version to implement the new features and there we have a new XIGestureClassInfo returned by XIQueryDevice and of course the two events: XIGesturePinchEvent and XIGestureSwipeEvent. Grabbing is done via e.g. XIGrabSwipeGestureBegin, so for those of you with XI2 experience this will all look familiar. For those of you without - it's probably no longer worth investing time into becoming an XI2 expert.

Overall, it's a nice addition to the protocol and it will help getting the X server slightly closer to Wayland for a widely-used feature. Once GTK, mutter and all the other pieces in the stack are in place, it will just work for any (GTK) application that supports gestures under Wayland already. The same will be true for Qt I expect.

X server 21.1 will be out in a few weeks, xf86-input-libinput 1.2.0 is already out and so are xorgproto 2021.5 and libXi 1.8.

[0] In addition to taking on the Xorg release, so clearly there are no limits here
[1] More specifically: it's done by libinput since neither xf86-input-evdev nor xf86-input-synaptics will ever see gestures being implemented
[2] Hold gestures missed out on the various deadlines

An Xorg release without Xwayland

2021-09-22T13:16:00.001+10:00

Xorg is about to released.

And it's a release without Xwayland.

And... wait, what?

Let's unwind this a bit, and ideally you should come away with a better understanding of Xorg vs Xwayland, and possibly even Wayland itself.

Heads up: if you are familiar with X, the below is simplified to the point it hurts. Sorry about that, but as an X developer you're probably good at coping with pain.

Let's go back to the 1980s, when fashion was weird and there were still reasons to be optimistic about the future. Because this is a thought exercise, we go back with full hindsight 20/20 vision and, ideally, the winning Lotto numbers in case we have some time for some self-indulgence.

If we were to implement an X server from scratch, we'd come away with a set of components. libxprotocol that handles the actual protocol wire format parsing and provides a C api to access that (quite like libxcb, actually). That one will just be the protocol-to-code conversion layer.

We'd have a libxserver component which handles all the state management required for an X server to actually behave like an X server (nothing in the X protocol require an X server to display anything). That library has a few entry points for abstract input events (pointer and keyboard, because this is the 80s after all) and a few exit points for rendered output.

libxserver uses libxprotocol but that's an implementation detail, we can ignore the protocol for the rest of the post.

Let's create a github organisation and host those two libraries. We now have: http://github.com/x/libxserver and http://github.com/x/libxprotocol [1].

Now, to actually implement a working functional X server, our new project would link against libxserver hook into this library's API points. For input, you'd use libinput and pass those events through, for output you'd use the modesetting driver that knows how to scream at the hardware until something finally shows up. This is somewhere between outrageously simplified and unacceptably wrong but it'll do for this post.

Your X server has to handle a lot of the hardware-specifics but other than that it's a wrapper around libxserver which does the work of ... well, being an X server.

Our stack looks like this:

+------------------------+
|  xserver   [libxserver]|--------[ X client ]
|                        |
|[libinput] [modesetting]|
+------------------------+
|       kernel           |
+------------------------+

Hooray, we have re-implemented Xorg. Or rather, XFree86 because we're 20 years from all the pent-up frustratrion that caused the Xorg fork. Let's host this project on http://github.com/x/xorg

Now, let's say instead of physical display devices, we want to render into an framebuffer, and we have no input devices.

+------------------------+
|  xserver   [libxserver]|--------[ X client ]
|                        |
|            [write()]   |
+------------------------+
|       some buffer      |
+------------------------+

This is basically Xvfb or, if you are writing out PostScript, Xprint. Let's host those on github too, we're accumulating quite a set of projects here.

Now, let's say those buffers are allocated elsewhere and we're just rendering to them. And those buffer are passed to us via an IPC protocol, like... Wayland!

+------------------------+
|  xserver   [libxserver]|--------[ X client ]
|                        |
|input events    [render]|
+------------------------+
      |            |
+------------------------+
|   Wayland compositor   |
+------------------------+

And voila, we have Xwayland. If you swap out the protocol you can have Xquartz (X on Macos) or Xwin (X on Windows) or Xnext/Xephyr (X on X) or Xvnc (X over VNC). The principle is always the same.

Fun fact: the Wayland compositor doesn't need to run on the hardware, you can play display server matryoshka until you run out of turtles.

In our glorious revisioned past all these are distinct projects, re-using libxserver and some external libraries where needed. Depending on the projects things may be very simple or get very complex, it depends on how we render things.

But in the end, we have several independent projects all providing us with an X server process - the specific X bits are done in libxserver though. We can release Xwayland without having to release Xorg or Xvfb.

libxserver won't need a lot of releases, the behaviour is largely specified by the protocol requirements and once you're done implementing it, it'll be quite a slow-moving project.

Ok, now, fast forward to 2021, lose some hindsight, hope, and attitude and - oh, we have exactly the above structure. Except that it's not spread across multiple independent repos on github, it's all sitting in the same git directory: our Xorg, Xwayland, Xvfb, etc. are all sitting in hw/$name, and libxserver is basically the rest of the repo.

A traditional X server release was a tag in that git directory. An XWayland-only release is basically an rm -rf hw/*-but-not-xwayland followed by a tag, an Xorg-only release is basically an rm -rf hw/*-but-not-xfree86 [2].

In theory, we could've moved all these out into separate projects a while ago but the benefits are small and no-one has the time for that anyway.

So there you have it - you can have Xorg-only or XWayland-only releases without the world coming to an end.

Now, for the "Xorg is dead" claims - it's very likely that the current release will be the last Xorg release. [3] There is little interest in an X server that runs on hardware, or rather: there's little interest in the effort required to push out releases. Povilas did a great job in getting this one out but again, it's likely this is the last release. [4]

Xwayland - very different, it'll hang around for a long time because it's "just" a protocol translation layer. And of course the interest is there, so we have volunteers to do the releases.

So basically: expecting Xwayland releases, be surprised (but not confused) by Xorg releases.

[1] Github of course doesn't exist yet because we're in the 80s. Time-travelling is complicated.
[2] Historical directory name, just accept it.
[3] Just like the previous release...
[4] At least until the next volunteer steps ups. Turns out the problem "no-one wants to work on this" is easily fixed by "me! me! I want to work on this". A concept that is apparently quite hard to understand in the peanut gallery.

libinput and high-resolution wheel scrolling

2021-08-31T17:50:00.000+10:00

Gut Ding braucht Weile. Almost three years ago, we added high-resolution wheel scrolling to the kernel (v5.0). The desktop stack however was first lagging and eventually left behind (except for an update a year ago or so, see here). However, I'm happy to announce that thanks to José Expósito's efforts, we now pushed it across the line. So - in a socially distanced manner and masked up to your eyebrows - gather round children, for it is storytime.

Historical History

In the beginning, there was the wheel detent. Or rather there were 24 of them, dividing a 360 degree [1] movement of a wheel into a neat set of 15 clicks. libinput exposed those wheel clicks as part of the "pointer axis" namespace and you could get the click count with libinput_event_pointer_get_axis_discrete() (announced here). The degree value is exposed as libinput_event_pointer_get_axis_value(). Other scroll backends (finger-scrolling or button-based scrolling) expose the pixel-precise value via that same function.

In a "recent" Microsoft Windows version (Vista!), MS added the ability for wheels to trigger more than 24 clicks per rotation. The MS Windows API now treats one "traditional" wheel click as a value of 120, anything finer-grained will be a fraction thereof. You may have a mouse that triggers quarter-wheel clicks, each sending a value of 30. This makes for smoother scrolling and is supported(-ish) by a lot of mice introduced in the last 10 years [2]. Obviously, three small scrolls are nicer than one large scroll, so the UX is less bad than before.

Now it's time for libinput to catch up with Windows Vista! For $reasons, the existing pointer axis API could get changed to accommodate for the high-res values, so a new API was added for scroll events. Read on for the details, you will believe what happens next.

Out with the old, in with the new

As of libinput 1.19, libinput has three new events: LIBINPUT_EVENT_POINTER_SCROLL_WHEEL, LIBINPUT_EVENT_POINTER_SCROLL_FINGER, and LIBINPUT_EVENT_POINTER_SCROLL_CONTINUOUS. These events reflect, perhaps unsuprisingly, scroll movements of a wheel, a finger or along a continuous axis (e.g. button scrolling). And they replace the old event LIBINPUT_EVENT_POINTER_AXIS. Those familiar with libinput will notice that the new event names encode the scroll source in the event name now. This makes them slightly more flexible and saves callers an extra call.

In terms of actual API, the new events have two new functions: libinput_event_pointer_get_scroll_value(). For the FINGER and CONTINUOUS events, the value returned is in "pixels" [3]. For the new WHEEL events, the value is in degrees. IOW this is a drop-in replacement for the old libinput_event_pointer_get_axis_value() function. The second call is libinput_event_pointer_get_scroll_value_v120() which, for WHEEL events, returns the 120-based logical units the kernel uses as well. libinput_event_pointer_has_axis() returns true if the given axis has a value, just as before. With those three calls you now get the data for the new events.

Backwards compatibility

To ensure backwards compatibility, libinput generates both old and new events so the rule for callers is: if you want to support the new events, just ignore the old ones completely. libinput also guarantees new events even on pre-5.0 kernels. This makes the old and new code easy to ifdef out, and once you get past the immediate event handling the code paths are virtually identical.

When, oh when?

These changes have been merged into the libinput main branch and will be part of libinput 1.19. Which is due to be released over the next month or so, so feel free to work backwards from that for your favourite distribution.

Having said that, libinput is merely the lowest block in the Jenga tower that is the desktop stack. José linked to the various MRs in the upstream libinput MR, so if you're on your seat's edge waiting for e.g. GTK to get this, well, there's an MR for that.

[1] That's degrees of an angle, not Fahrenheit
[2] As usual, on a significant number of those you'll need to know whatever proprietary protocol the vendor deemed to be important IP. Older MS mice stand out here because they use straight HID.
[3] libinput doesn't really have a concept of pixels, but it has a normalized pixel that movements are defined as. Most callers take that as real pixels except for the high-resolution displays where it's appropriately scaled.

Flatpak portals - how do they work?

2021-08-31T16:29:00.002+10:00

I've been working on portals recently and one of the issues for me was that the documentation just didn't quite hit the sweet spot. At least the bits I found were either too high-level or too implementation-specific. So here's a set of notes on how a portal works, in the hope that this is actually correct.

First, Portals are supposed to be a way for sandboxed applications (flatpaks) to trigger functionality they don't have direct access too. The prime example: opening a file without the application having access to $HOME. This is done by the applications talking to portals instead of doing the functionality themselves.

There is really only one portal process: /usr/libexec/xdg-desktop-portal, started as a systemd user service. That process owns a DBus bus name (org.freedesktop.portal.Desktop) and an object on that name (/org/freedesktop/portal/desktop). You can see that bus name and object with D-Feet, from DBus' POV there's nothing special about it. What makes it the portal is simply that the application running inside the sandbox can talk to that DBus name and thus call the various methods. Obviously the xdg-desktop-portal needs to run outside the sandbox to do its things.

There are multiple portal interfaces, all available on that one object. Those interfaces have names like org.freedesktop.portal.FileChooser (to open/save files). The xdg-desktop-portal implements those interfaces and thus handles any method calls on those interfaces. So where an application is sandboxed, it doesn't implement the functionality itself, it instead calls e.g. the OpenFile() method on the org.freedesktop.portal.FileChooser interface. Then it gets an fd back and can read the content of that file without needing full access to the file system.

Some interfaces are fully handled within xdg-desktop-portal. For example, the Camera portal checks a few things internally, pops up a dialog for the user to confirm access to if needed [1] but otherwise there's nothing else involved with this specific method call.

Other interfaces have a backend "implementation" DBus interface. For example, the org.freedesktop.portal.FileChooser interface has a org.freedesktop.impl.portal.FileChooser (notice the "impl") counterpart. xdg-desktop-portal does not implement those impl.portals. xdg-desktop-portal instead routes the DBus calls to the respective "impl.portal". Your sandboxed application calls OpenFile(), xdg-desktop-portal now calls OpenFile() on org.freedesktop.impl.portal.FileChooser. That interface returns a value, xdg-desktop-portal extracts it and returns it back to the application in respones to the original OpenFile() call.

What provides those impl.portals doesn't matter to xdg-desktop-portal, and this is where things are hot-swappable. ~~GTK and Qt both provide (some of) those impl portals,~~ There are GTK and Qt-specific portals with xdg-desktop-portal-gtk and xdg-desktop-portal-kde but another one is provided by GNOME Shell directly. You can check the files in /usr/share/xdg-desktop-portal/portals/ and see which impl portal is provided on which bus name. The reason those impl.portals exist is so they can be native to the desktop environment - regardless what application you're running and with a generic xdg-desktop-portal, you see the native file chooser dialog for your desktop environment.

So the full call sequence is:

At startup, xdg-desktop-portal parses the /usr/libexec/xdg-desktop-portal/*.portal files to know which impl.portal interface is provided on which bus name
The application calls OpenFile() on the org.freedesktop.portal.FileChooser interface on the object path /org/freedesktop/portal/desktop. It can do so because the bus name this object sits on is not restricted by the sandbox
xdg-desktop-portal receives that call. This is portal with an impl.portal so xdg-desktop-portal calls OpenFile() on the bus name that provides the org.freedesktop.impl.portal.FileChooser interface (as previously established by reading the *.portal files)
Assuming xdg-desktop-portal-gtk provides that portal at the moment, that process now pops up a GTK FileChooser dialog that runs outside the sandbox. User selects a file
xdg-desktop-portal-gtk sends back the fd for the file to the xdg-desktop-portal, and the impl.portal parts are done
xdg-desktop-portal receives that fd and sends it back as reply to the OpenFile() method in the normal portal
The application receives the fd and can read the file now

A few details here aren't fully correct, but it's correct enough to understand the sequence - the exact details depend on the method call anyway.

Finally: because of DBus restrictions, the various methods in the portal interfaces don't just reply with values. Instead, the xdg-desktop-portal creates a new org.freedesktop.portal.Request object and returns the object path for that. Once that's done the method is complete from DBus' POV. When the actual return value arrives (e.g. the fd), that value is passed via a signal on that Request object, which is then destroyed. This roundabout way is done for purely technical reasons, regular DBus methods would time out while the user picks a file path.

Anyway. Maybe this helps someone understanding how the portal bits fit together.

[1] it does so using another portal but let's ignore that
[2] not really hot-swappable though. You need to restart xdg-desktop-portal but not your host. So luke-warm-swappable only

Edit Sep 01: clarify that it's not GTK/Qt providing the portals, but xdg-desktop-portal-gtk and -kde

libei - a status update

2021-08-25T15:29:00.001+10:00

A year ago, I first announced libei - a library to support emulated input. After an initial spurt of development, it was left mostly untouched until a few weeks ago. Since then, another flurry of changes have been added, including some initial integration into GNOME's mutter. So, let's see what has changed.

A Recap

First, a short recap of what libei is: it's a transport layer for emulated input events to allow for any application to control the pointer, type, etc. But, unlike the XTEST extension in X, libei allows the compositor to be in control over clients, the devices they can emulate and the input events as well. So it's safer than XTEST but also a lot more flexible. libei already supports touch and smooth scrolling events, something XTest doesn't have or is struggling with.

Terminology refresher: libei is the client library (used by an application wanting to emulate input), EIS is the Emulated Input Server, i.e. the part that typically runs in the compositor.

Server-side Devices

So what has changed recently: first, the whole approach has flipped on its head - now a libei client connects to the EIS implementation and "binds" to the seats the EIS implementation provides. The EIS implementation then provides input devices to the client. In the simplest case, that's just a relative pointer but we have capabilities for absolute pointers, keyboards and touch as well. Plans for the future is to add gestures and tablet support too. Possibly joysticks, but I haven't really thought about that in detail yet.

So basically, the initial conversation with an EIS implementation goes like this:

Client: Hello, I am $NAME
Server: Hello, I have "seat0" and "seat1"
Client: Bind to "seat0" for pointer, keyboard and touch
Server: Here is a pointer device
Server: Here is a keyboard device
Client: Send relative motion event 10/2 through the pointer device

Notice how the touch device is missing? The capabilities the client binds to are just what the client wants, the server doesn't need to actually give the client a device for that capability.

One of the design choices for libei is that devices are effectively static. If something changes on the EIS side, the device is removed and a new device is created with the new data. This applies for example to regions and keymaps (see below), so libei clients need to be able to re-create their internal states whenever the screen or the keymap changes.

Device Regions

Devices can now have regions attached to them, also provided by the EIS implementation. These regions define areas reachable by the device and are required for clients such as Barrier. On a dual-monitor setup you may have one device with two regions or two devices with one region (representing one monitor), it depends on the EIS implementation. But either way, as libei client you will know that there is an area and you will know how to reach any given pixel on that area. Since the EIS implementation decides the regions, it's possible to have areas that are unreachable by emulated input (though I'm struggling a bit for a real-world use-case).

So basically, the conversation with an EIS implementation goes like this:

Client: Hello, I am $NAME
Server: Hello, I have "seat0" and "seat1"
Client: Bind to "seat0" for absolute pointer
Server: Here is an abs pointer device with regions 1920x1080@0,0, 1080x1920@1920,0
Server: Here is an abs pointer device with regions 1920x1080@0,0
Server: Here is an abs pointer device with regions 1080x1920@1920,0
Client: Send abs position 100/100 through the second device

Notice how we have three absolute devices? A client emulating a tablet that is mapped to a screen could just use the third device. As with everything, the server decides what devices are created and the clients have to figure out what they want to do and how to do it.

Perhaps unsurprisingly, the use of regions make libei clients windowing-system independent. The Barrier EI support WIP no longer has any Wayland-specific code in it. In theory, we could implement EIS in the X server and libei clients would work against that unmodified.

Keymap handling

The keymap handling has been changed so the keymap too is provided by the EIS implementation now, effectively in the same way as the Wayland compositor provides the keymap to Wayland clients. This means a client knows what keycodes to send, it can handle the state to keep track of things, etc. Using Barrier as an example again - if you want to generate an "a", you need to look up the keymap to figure out which keycode generates an A, then you can send that through libei to actually press the key.

Admittedly, this is quite messy. XKB (and specifically libxkbcommon) does not make it easy to go from a keysym to a key code. The existing Barrier X code is full of corner-cases with XKB already, I espect those to be necessary for the EI support as well.

Scrolling

Scroll events have four types: pixel-based scrolling, discrete scrolling, and scroll stop/cancel events. The first should be obvious, discrete scrolling is for mouse wheels. It uses the same 120-based API that Windows (and the kernel) use, so it's compatible with high-resolution wheel mice. The scroll stop event notifies an EIS implementation that the scroll interaction has stopped (e.g. lifting fingers off) which in turn may start kinetic scrolling - just like the libinput/Wayland scroll stop events. The scroll cancel event notifies the EIS implementation that scrolling really has stopped and no kinetic scrolling should be triggered. There's no equivalent in libinput/Wayland for this yet but it helps to get the hook in place.

Emulation "Transactions"

This has fairly little functional effect, but interactions with an EIS server are now sandwiched in a start/stop emulating pair. While this doesn't matter for one-shot tools like xdotool, it does matter for things like Barrier which can send the start emulating event when the pointer enters the local window. This again allows the EIS implementation to provide some visual feedback to the user. To correct the example from above, the sequence is actually:

...
Server: Here is a pointer device
Client: Start emulating
Client: Send relative motion event 10/2 through the pointer device
Client: Send relative motion event 1/4 through the pointer device
Client: Stop emulating

Properties

Finally, there is now a generic property API, something copied from PipeWire. Properties are simple key/value string pairs and cover those things that aren't in the immediate API. One example here: the portal can set things like "ei.application.appid" to the Flatpak's appid. Properties can be locked down and only libei itself can set properties before the initial connection. This makes them reliable enough for the EIS implementation to make decisions based on their values. Just like with PipeWire, the list of useful properties will grow over time. it's too early to tell what is really needed.

Repositories

Now, for the actual demo bits: I've added enough support to Barrier, XWayland, Mutter and GNOME Shell that I can control a GNOME on Wayland session through Barrier (note: the controlling host still needs to run X since we don't have the ability to capture input events under Wayland yet). The keymap handling in Barrier is nasty but it's enough to show that it can work.

GNOME Shell has a rudimentary UI, again just to show what works:

The status icon shows ... if libei clients are connected, it changes to !!! while the clients are emulating events. Clients are listed by name and can be disconnected at will. I am not a designer, this is just a PoC to test the hooks.

Note how xdotool is listed in this screenshot: that tool is unmodified, it's the XWayland libei implementation that allows it to work and show up correctly

The various repositories are in the "wip/ei" branch of:

And of course libei itself.

Where to go from here? The last weeks were driven by rapid development, so there's plenty of test cases to be written to make sure the new code actually works as intended. That's easy enough. Looking at the Flatpak integration is another big ticket item, once the portal details are sorted all the pieces are (at least theoretically) in place. That aside, improving the integrations into the various systems above is obviously what's needed to get this working OOTB on the various distributions. Right now it's all very much in alpha stage and I could use help with all of those (unless you're happy to wait another year or so...). Do ping me if you're interested to work on any of this.

It's templates all the way down - part 4

2021-07-28T13:22:00.000+10:00

Part 1, Part 2, Part 3

After getting thouroughly nerd-sniped a few weeks back, we now have FreeBSD support through qemu in the freedesktop.org ci-templates. This is possible through the qemu image generation we have had for quite a while now. So let's see how we can easily add a FreeBSD VM (or other distributions) to our gitlab CI pipeline:

.freebsd:
  variables:
     FDO_DISTRIBUTION_VERSION: '13.0'
     FDO_DISTRIBUTION_TAG: 'freebsd.0' # some value for humans to read
     
build-image:
  extends:
     - .freebsd
     - .fdo.qemu-build@freebsd
  variables:
    FDO_DISTRIBUTION_PACKAGES: "curl wget"

Now, so far this may all seem quite familiar. And indeed, this is almost exactly the same process as for normal containers (see Part 1), the only difference is the .fdo.qemu-build base template. Using this template means we build an image babushka: our desired BSD image is actual a QEMU RAW image sitting inside another generic container image. That latter image only exists to start the QEMU image and set up the environment if need be, you don't need to care what distribution it runs out (Fedora for now).

Because of the nesting, we need to handle this accordingly in our script: tag for the actual test job - we need to start the image and make sure our jobs are actually built within. The templates set up an ssh alias "vm" for this and the vmctl script helps to do things on the vm:

test-build:
  extends:
    - .freebsd
    - .fdo.distribution-image@freebsd
  script:
    # start our QEMU image
    - /app/vmctl start
    
    # copy our current working directory to the VM
    # (this is a yaml multiline command to work around the colon)
    - |
      scp -r $PWD vm:
      
    # Run the build commands on the VM and if they succeed, create a .success file
    - /app/vmctl exec "cd $CI_PROJECT_NAME; meson builddir; ninja -C builddir" && touch .success || true
    
    # Copy results back to our run container so we can include them in artifacts:
    - |
      scp -r vm:$CI_PROJECT_NAME/builddir .
      
    # kill the VM
    - /app/vmctl stop
    
    # Now that we have cleaned up: if our build job before
    # failed, exit with an error
    - [[ -e .success ]] || exit 1

Now, there's a bit to unpack but with the comments above it should be fairly obvious what is happening. We start the VM, copy our working directory over and then run a command on the VM before cleaning up. The reason we use touch .success is simple: it allows us to copy things out and clean up before actually failing the job.

Obviously, if you want to build any other distribution you just swap the freebsd out for fedora or whatever - the process is the same. libinput has been using fedora qemu images for ages now.

libinput and hold gestures

2021-07-27T15:58:00.001+10:00

Thanks to the work done by Josè Expòsito, libinput 1.19 will ship with a new type of gesture: Hold Gestures. So far libinput supported swipe (moving multiple fingers in the same direction) and pinch (moving fingers towards each other or away from each other). These gestures are well-known, commonly used, and familiar to most users. For example, GNOME 40 recently has increased its use of touchpad gestures to switch between workspaces, etc. Swipe and pinch gestures require movement, it was not possible (for callers) to detect fingers on the touchpad that don't move.

This gap is now filled by Hold gestures. These are triggered when a user puts fingers down on the touchpad, without moving the fingers. This allows for some new interactions and we had two specific ones in mind: hold-to-click, a common interaction on older touchscreen interfaces where holding a finger in place eventually triggers the context menu. On a touchpad, a three-finger hold could zoom in, or do dictionary lookups, or kill a kitten. Whatever matches your user interface most, I guess.

The second interaction was the ability to stop kinetic scrolling. libinput does not actually provide kinetic scrolling, it merely provides the information needed in the client to do it there: specifically, it tells the caller when a finger was lifted off a touchpad at the end of a scroll movement. It's up to the caller (usually: the toolkit) to implement the kinetic scrolling effects. One missing piece was that while libinput provided information about lifting the fingers, it didn't provide information about putting fingers down again later - a common way to stop scrolling on other systems.

Hold gestures are intended to address this: a hold gesture triggered after a flick with two fingers can now be used by callers (read: toolkits) to stop scrolling.

Now, one important thing about hold gestures is that they will generate a lot of false positives, so be careful how you implement them. The vast majority of interactions with the touchpad will trigger some movement - once that movement hits a certain threshold the hold gesture will be cancelled and libinput sends out the movement events. Those events may be tiny (depending on touchpad sensitivity) so getting the balance right for the aforementioned hold-to-click gesture is up to the caller.

As usual, the required bits to get hold gestures into the wayland protocol are either in the works, mid-flight or merge-ready so expect this to hit the various repositories over the medium-term future.

A pre-supplied "custom" keyboard layout for X11

2021-02-18T11:57:00.003+10:00

Last year I wrote about how to create a user-specific XKB layout, followed by a post explaining that this won't work in X. But there's a pandemic going on, which is presumably the only reason people haven't all switched to Wayland yet. So it was time to figure out a workaround for those still running X.

This Merge Request (scheduled for xkeyboard-config 2.33) adds a "custom" layout to the evdev.xml and base.xml files. These XML files are parsed by the various GUI tools to display the selection of available layouts. An entry in there will thus show up in the GUI tool.

Our rulesets, i.e. the files that convert a layout/variant configuration into the components to actually load already have wildcard matching [1]. So the custom layout will resolve to the symbols/custom file in your XKB data dir - usually /usr/share/X11/xkb/symbols/custom.

This file is not provided by xkeyboard-config. It can be created by the user though and whatever configuration is in there will be the "custom" keyboard layout. Because xkeyboard-config does not supply this file, it will not get overwritten on update.

From XKB's POV it is just another layout and it thus uses the same syntax. For example, to override the +/* key on the German keyboard layout with a key that produces a/b/c/d on the various Shift/Alt combinations, use this:

default
xkb_symbols "basic" {
    include "de(basic)"
    key <AD12>  { [      a,   b,   c,  d ]      };
};

This example includes the "basic" section from the symbols/de file (i.e. the default German layout), then overrides the 12th alphanumeric key from left in the 4th row from bottom (D) with the given symbols. I'll leave it up to the reader to come up with a less useful example.

There are a few drawbacks:

If the file is missing and the user selects the custom layout, the results are... undefined. For run-time configuration like GNOME it doesn't really matter - the layout compilation fails and you end up with the one the device already had (i.e. the default one built into X, usually the US layout).
If the file is missing and the custom layout is selected in the xorg.conf, the results are... undefined. I tested it and ended up with the US layout but that seems more by accident than design. My recommendation is to not do that.
No variants are available in the XML files, so the only accessible section is the one marked default.
If a commandline tool uses a variant of custom, the GUI will not reflect this. If the GUI goes boom, that's a bug in the GUI.

So overall, it's a hack[2]. But it's a hack that fixes real user issues and given we're talking about X, I doubt anyone notices another hack anyway.

[1] If you don't care about GUIs, setxkbmap -layout custom -variant foobar has been possible for years.
[2] Sticking with the UNIX principle, it's a hack that fixes the issue at hand, is badly integrated, and weird to configure.

Auto-updating XKB for new kernel keycodes

2021-01-22T10:58:00.000+10:00

Your XKB keymap contains two important parts. One is the mapping from the hardware scancode to some internal representation, for example:

  <AB10> = 61;

Which basically means Alphanumeric key in row B (from bottom), 10th key from the left. In other words: the /? key on a US keyboard.

The second part is mapping that internal representation to a keysym, for example:

  key <AB10> {        [     slash,    question        ]       };

This is the actual layout mapping - once in place this key really produces a slash or question mark (on level2, i.e. when Shift is down).

This two-part approach exists so either part can be swapped without affecting the other. Swap the second part to an exclamation mark and paragraph symbol and you have the French version of this key, swap it to dash/underscore and you have the German version of the key - all without having to change the keycode.

Back in the golden days of everyone-does-what-they-feel-like, keyboard manufacturers (presumably happily so) changed the key codes and we needed model-specific keycodes in XKB. The XkbModel configuration is a leftover from these trying times.

The Linux kernel's evdev API has largely done away with this. It provides a standardised set of keycodes, defined in linux/input-event-codes.h, and ensures, with the help of udev [0], that all keyboards actually conform to that. An evdev XKB keycode is a simple "kernel keycode + 8" [1] and that applies to all keyboards. On top of that, the kernel uses semantic definitions for the keys as they'd be in the US layout. KEY_Q is the key that would, behold!, produce a Q. Or an A in the French layout because they just have to be different, don't they? Either way, with evdev the Xkb Model configuration largely points to nothing and only wastes a few cycles with string parsing.

The second part, the keysym mapping, uses two approaches. One is to use a named #define like the "slash", "question" outlined above (see X11/keysymdef.h for the defines). The other is to use unicode directly like this example from the Devangari layout:

  key <AB10> { [ U092f, U095f, slash, question ] };

As you can see, mix and match is available too. Using Unicode code points of course makes the layouts less immediately readable but on the other hand we don't need to #define the whole of Unicode. So from a maintenance perspective it's a win.

However, there's a third type of key that we care about: functional keys. Those are the multimedia (historically: "internet") keys that most devices have these days. Volume up, touchpad on/off, cycle display connectors, etc. Those keys are special in that they don't have a Unicode representation and they are always mapped to the same fixed functionality. Even Dvorak users want their volume keys to do what it says on the key.

Because they have no Unicode code points, those keys are defined, historically, in XF86keysyms.h:

  #define XF86XK_MonBrightnessUp    0x1008FF02  /* Monitor/panel brightness */

And mapping a key like this looks like this [2]:

  key <I21>   {       [ XF86Calculator        ] };

The only drawback: every key needs to be added manually. This has been done for some, but not for others. And some keys were added with different names than what the kernel uses [3].

So we're in this weird situation where we have a flexible keymap system but the kernel already tells us what a key does anyway and we don't want to change that. Virtually all keys added in the last decade or so falls into that group of keys, but to actually make use of them requires a #define in xorgproto and an update to the keycodes and symbols in xkeyboard-config. That again introduces discrepancies and we end up in the situation where we're at right now: some keys don't work until someone files a bug, and then the users still need to wait for several components to be released and those releases trickle into the distributions.

10 years ago would've been a good time to make this more efficient. The situation wasn't that urgent then, most of the kernel keycodes added are >255 which means they cannot be used in X anyway. [4] The second best time to do it is now. What we need is basically a pass-through from kernel code to symbol and that's currently sitting in various MRs:

- xkeyboard-config can generate the keycodes/evdev file based on the list of kernel keycodes, so all kernel keycodes are mapped to internal representations by default

- xorgproto has reserved a range within the XF86 keysym reserved range for pass-through mappings, i.e. any KEY_FOO define from the kernel is mapped to XF86XK_Foo with a specific value [5]. The #define format is fixed so it can be parsed.

- xkeyboard-config parses theses XF86 keysyms and sets up a keysym mapping in the default keymap.

This is semi-automatic, i.e. there are helper scripts that detect changes and notify us, hooked into the CI, but the actual work must be done manually. These keysyms immediately become set-in-stone API so we don't want some unsupervised script to go wild on them.

There's a huge backlog of keys to be added (dating to kernels pre-v3.18) and I'll go through them one-by-one over the next weeks to make sure they're correct. But eventually they'll be done and we have a full keymap for all kernel keys to be immediately available in the XKB layout.

The last part of all of this is a calendar reminder for me to do this after every new kernel release. Let's hope this crucial part isn't the first to fail.

[0] 60-keyboard.hwdb has a mere ~1800 lines!
[1] Historical reasons, you don't want to know. *jedi wave*
[2] the XK_ part of the key name is dropped, implementation detail.
[3] This can also happen when a kernel define is renamed/aliased but we cannot easily do so for this header.
[4] X has an 8 bit keycode limit and that won't change until someone develops XKB2 with support for 32-bit keycodes, i.e. never.
[5] The actual value is an implementation detail and no client must care

Parsing HID Unit Items

2021-01-13T21:28:00.001+10:00

This post explains how to parse the HID Unit Global Item as explained by the HID Specification, page 37. The table there is quite confusing and it took me a while to fully understand it (Benjamin Tissoires was really the one who cracked it). I couldn't find any better explanation online which means either I'm incredibly dense and everyone's figured it out or no-one has posted a better explanation. On the off-chance it's the latter [1], here are the instructions on how to parse this item.

We know a HID Report Descriptor consists of a number of items that describe the content of each HID Report (read: an event from a device). These Items include things like Logical Minimum/Maximum for axis ranges, etc. A HID Unit item specifies the physical unit to apply. For example, a Report Descriptor may specify that X and Y axes are in mm which can be quite useful for all the obvious reasons.

Like most HID items, a HID Unit Item consists of a one-byte item tag and 1, 2 or 4 byte payload. The Unit item in the Report Descriptor itself has the binary value 0110 01nn where the nn is either 1, 2, or 3 indicating 1, 2 or 4 bytes of payload, respectively. That's standard HID.

The payload is divided into nibbles (4-bit units) and goes from LSB to MSB. The lowest-order 4 bits (first byte & 0xf) define the unit System to apply: one of SI Linear, SI Rotation, English Linear or English Rotation (well, or None/Reserved). The rest of the nibbles are in this order: "length", "mass", "time", "temperature", "current", "luminous intensity". In something resembling code this means:

 system = value & 0xf
 length_exponent = (value & 0xf0) >> 4
 mass_exponent   = (value & 0xf00) >> 8
 time_exponent   = (value & 0xf000) >> 12
 ...

The System defines which unit is used for length (e.g. SILinear means length is in cm). The actual value of each nibble is the exponent for the unit in use [2]. In something resembling code:

 switch (system)
   case SILinear: 
     print("length is in cm^{length_exponent}"); 
     break;
   case SIRotation: 
     print("length is in rad^{length_exponent}");
     break;
   case EnglishLinear: 
     print("length is in in^{length_exponent}");
     break;
   case EnglishRotation:
     print("length is in deg^{length_exponent}");
     break;
   case None:
   case Reserved"
     print("boo!");
     break;

For example, the value 0x321 means "SI Linear" (0x1) so the remaining nibbles represent, in ascending nibble order: Centimeters, Grams, Seconds, Kelvin, Ampere, Candela. The length nibble has a value of 0x2 so it's square cm, the mass nibble has a value of 0x3 so it is cubic grams (well, it's just an example, so...). This means that any report containing this item comes in cm²g³. As a more realistic example: 0xF011 would be cm/s.

If we changed the lowest nibble to English Rotation (0x4), i.e. our value is now 0x324, the units represent: Degrees, Slug, Seconds, F, Ampere, Candela [3]. The length nibble 0x2 means square degrees, the mass nibble is cubic slugs. As a more realistic example, 0xF014 would be degrees/s.

Any nibble with value 0 means the unit isn't in use, so the example from the spec with value 0x00F0D121 is SI linear, units cm² g s⁻³ A⁻¹, which is... Voltage! Of course you knew that and totally didn't have to double-check with wikipedia.

Because bits are expensive and the base units are of course either too big or too small or otherwise not quite right, HID also provides a Unit Exponent item. The Unit Exponent item (a separate item to Unit in the Report Descriptor) then describes the exponent to be applied to the actual value in the report. For example, a Unit Eponent of -3 means 10⁻³ to be applied to the value. If the report descriptor specifies an item of Unit 0x00F0D121 (i.e. V) and Unit Exponent -3, the value of this item is mV (milliVolt), Unit Exponent of 3 would be kV (kiloVolt).

Now, in hindsight all this is pretty obvious and maybe even sensible. It'd have been nice if the spec would've explained it a bit clearer but then I would have nothing to write about, so I guess overall I call it a draw.

[1] This whole adventure was started because there's a touchpad out there that measures touch pressure in radians, so at least one other person out there struggled with the docs...
[2] The nibble value is twos complement (i.e. it's a signed 4-bit integer). Values 0x1-0x7 are exponents 1 to 7, values 0x8-0xf are exponents -8 to -1.
[3] English Linear should've trolled everyone and use Centimetres instead of Centimeters in SI Linear.