Friday, January 6, 2023

X servers no longer allow byte-swapped clients (by default)

In the beginning, there was the egg. Then fictional people started eating that from different ends, and the terms of "little endians" and "Big Endians" was born.

Computer architectures (mostly) come with one of either byte order: MSB first or LSB first. The two are incompatible of course, and many a bug was introduced trying to convert between the two (or, more common: failing to do so). The two byte orders were termed Big Endian and little endian, because that hilarious naming scheme at least gives us something to laugh about while contemplating throwing it all away and considering a future as, I don't know, a strawberry plant.

Back in the mullet-infested 80s when the X11 protocol was designed both little endian and big endian were common enough. And back then running the X server on a different host than the client was common too - the X terminals back then had less processing power than a smart toilet seat today so the cpu-intensive clients were running on some mainfraime. To avoid overtaxing the poor mainframe already running dozens of clients for multiple users, the job of converting between the two byte orders was punted to the X server. So to this day whenever a client connects, the first byte it sends is a literal "l" or "B" to inform the server of the client's byte order. Where the byte order doesn't match the X server's byte order, the client is a "swapped client" in X server terminology and all 16, 32, and 64-bit values must be "byte-swapped" into the server's byte order. All of those values in all requests, and then again back to the client's byte order in all outgoing replies and events. Forever, till a crash do them part.

If you get one of those wrong, the number is no longer correct. And it's properly wrong too, the difference between 0x1 and 0x01000000 is rather significant. [0] Which has the hilarious side-effect of... well, pretty much anything. But usually it ranges from crashing the server (thus taking all other clients down in commiseration) to leaking random memory locations. The list of security issues affecting the various SProcFoo implementations (X server naming scheme for Swapped Procedure for request Foo) is so long that I'm too lazy to pull out the various security advisories and link to them. Just believe me, ok? *jedi handwave*

These days, encountering a Big Endian host is increasingly niche, letting it run an X client that connects to your local little-endian X server is even more niche [1]. I think the only regular real-world use-case for this is running X clients on an s390x, connecting to your local intel-ish (and thus little endian) workstation. Not something most users do on a regular basis. So right now, the byte-swapping code is mainly a free attack surface that 99% of users never actually use for anything real. So... let's not do that?

I just merged a PR into the X server repo that prohibits byte-swapped clients by default. A Big Endian client connecting to an X server will fail the connection with an error message of "Prohibited client endianess, see the Xserver man page". [2] Thus, a whole class of future security issues avoided - yay!

For the use-cases where you do need to let Big Endian clients connect to your little endian X server, you have two options: start your X server (Xorg, Xwayland, Xnest, ...) with the +byteswappedclients commandline option. Alternatively, and this only applies for Xorg: add Option "AllowByteSwappedClients" "on" to the xorg.conf ServerFlags section. Both of these will change the default back to the original setting. Both are documented in the Xserver(1) and xorg.conf(5) man pages, respectively.

Now, there's a drawback: in the Wayland stack, the compositor is in charge of starting Xwayland which means the compositor needs to expose a way of passing +byteswappedclients to Xwayland. This is compositor-specific, bugs are filed for mutter (merged for GNOME 44), kwin and wlroots. Until those are addressed, you cannot easily change this default (short of changing /usr/bin/Xwayland into a wrapper script that passes the option through).

There's no specific plan yet which X releases this will end up in, primarily because the release cycle for X is...undefined. Probably xserver-23.0 if and when that happens. It'll probably find its way into the xwayland-23.0 release, if and when that happens. Meanwhile, distributions interested in this particular change should consider backporting it to their X server version. This has been accepted as a Fedora 38 change.

[0] Also, it doesn't help that much of the X server's protocol handling code was written with the attitude of "surely the client wouldn't lie about that length value"
[1] little-endian client to Big Endian X server is so rare that it's barely worth talking about. But suffice to say, the exact same applies, just with little and big swapped around.
[2] That message is unceremoniously dumped to stderr, but that bit is unfortunately a libxcb issue.

11 comments:

hobbified said...

Having it be an option that's turned off by default just ensures that it will rapidly end up broken because of its "niche"ness, and then removed because it keeps causing breakage. Do it all the way or not at all.

ClassicHasClass said...

An opposing viewpoint to the comment above: as a current Fedora user on ppc64le, I actually have big-endian Power clients that *do* connect to my desktop X server, including a ppc64 AIX POWER6 which can only run big. I'm sad to see this and I don't disagree that it's likely to rot (and it makes Wayland even more unattractive than it already is), but I appreciate having the option instead of having it removed entirely and I'll be adding this for F38 if the change request is accepted.

Alex said...

How does something so horribly wrong get merged so easily?

Removing default support for byte-swapping is a tragic step towards removing a key abstraction of X without offering any suitable replacement - having to RESTART THE X SERVER to allow a big-endian client to connect is DAFT. I, for one, do NOT want to have to shut down ~100 windows and restart a server just to allow one, critical client host to connect, but that what this patch will do.

This should have been been something controllable during run, most likely from xset.

Historically, were it not for having so little CPU power in general at the time, it *should* have been the case that all traffic between X server and clients would be serialized to a standard network order, but allowing for byte-swapping made life computationally easier for the few little-endian hosts, and the goal of full interoperation was (mostly) achieved (I'm ignoring the middle-endian guys here).

With this patch, instead of following the current best practice that a defined network order is the answer (we have far more CPU power to burn now), we have what used to be the toys trying to shut out their creators.

This patch undermines interoperability by tossing up the straw man of security to cover that the patch concept is flawed from the outset. It implies that we can trust some numbers more because the didn't need to be byte-swapped, but that's an illusion, since none of them can be trusted. This patch should have been rejected.

awilfox said...

Building on both ClassicHasClass and Alex above:

I use ppc64 on my Talos II, a desktop from 2019, which means my X server is big-endian. I use Qemu-user to run WINE and Spotify on it, and while it is not particularly performant, it does work and lets me run some software without needing a physical x86 or Arm system. I can also just X forward Spotify from an Arm box, too.

I do appreciate that this is an option instead of being outright removed, but this use case is larger than you probably realise. Xorg is significantly more reliable than Wayland on big-endian systems, so these sorts of things affect a larger portion of the user base there.

That said, making it something you could set using `xset` or similar is something that perhaps we could work on as a community, so that it wouldn't require a server restart. Moving forward, we could also try to standardise the Xorg implementation of the X protocol on a single byte order. If there is already no expectation that foreign endianness clients can connect to the server, there's no reason that the client protocol implementation couldn't always use a single endianness. Then the only foreign endianness clients would be other clients running on non-Xorg implementations of X11.

I would probably recommend that the server should use network byte order, simply because the majority of non-Xorg implementations are going to be Xsun (BE only) and AIX (BE only), so then you still retain compatibility with those clients for "free".

Also, as an additional comment: please change the title of this post to "by default". The title caused a lot of panic in the Adélie Linux community, because it made it sound like cross-endian X11 was actually removed, instead of just made optional.

Peter Hutterer said...

> please change the title of this post to "by default".

Done!

Peter Hutterer said...

xset is not some magical tool, it's just a CLI for existing protocol requests. to add new functionality, you'd have to add the respective protocol extension and implement support for it in libxcb, libX11 and the server. *then* you can add it to xset. Let's say the easy case, adding it to XFixes and you're looking at 6+ months to get this into an *upstream* release. Let alone into a distro. And none of the current X developers will be willing to do this work.

So let's say you implement it all and it's ready to go. Except: now our malicious client just needs to call XFixesAllowSwappedClients() or whatever it's called and bam, the whole effort was for naught.

I don't see the issue with restarting it either. If you have frequent use-cases that require byte-swapped clients, enable the config (in xorg.conf or the compositor, once available) and you're sorted from then on.

Switching the server to a single byte order is *not* going to happen. There are zero people willing to put in the (multi-year) effort.

smurfix said...

Also I assume that being able to use xset to turn this on … also turns all the nice security holes back on which you tried to avoid by disabling byteswapping.

Thus, not being able to use xset for this is a feature. Otherwise you might as well leave the stuff enabled.

oPPen! said...

> Having it be an option that's turned off by default just ensures that it will rapidly end up broken because of its "niche"ness, and then removed because it keeps causing breakage. Do it all the way or not at all.

Not any more than enabling it by default. Consider this:
- The code is still being built => at least it will keep building.
- The code is only actually run if people use it => those people will enable it.
- The only other case where the code runs is when someone is trying to pwn you, we don't want that, do we?

Bit rot doesn't come from disabling something by default, it comes from not having any users that report breakage. And code that is not used isn't really valuable anyway.

Keep in mind that most of the time when you have a niche computer among your float this would amount to your regular maintenance of just changing the config once and keeping it the way you need, so I don't get the drama.

Re: alternatives: just describe the proper data structures like XCB and Wayland do and put a second socket that does the byte swapping in a different process and passes it on to the server. No need to duplicate the logic in X.org, no need for restarts (just start the second daemon and call it a day). So much for Unix and KISS I guess. But IMO it isn't even worth the effort.

Arne Ahrend said...
This comment has been removed by the author.
Arne Ahrend said...

Would it make sense to add a "reject all BE clients" option to the LE X server and activate this option in distros as a first step?

Alex said...

The posters mentioning that just changing xset isn't enough are, of course, perfectly correct. I suggested xset as a place to *expose* some new runtime flag due to (rather tenuous) parallel to "xset bc" (bug compatibility) which affects values in some protocol requests already. There be some other, better, during-runtime settable place to make a handle for this I'm not thinking of.

It's unrealistic to expect that X would be ported to use a network byte order, that's pretty much a baked-in architectural choice.

Having to kill and reconfigure the X server to reënable a feature it used to have, that one might want to only use for a single client (or forty, depending on the site - makes about as much sense as setting the default window limit to 16 (instead of the current 128)... Sure, many users wouldn't even notice how low it was, but power users would have to reconfigure EVERY X SERVER THEY USE - including being unable to at some places, or a time cost of a couple of hours in others, for something X currently already does transparently. This would be a user experience failure.

[A cynic would look at whether this was an attempt to *make* X less reliable, in order to advance the agenda of some other graphics server - I do NOT think that's the case here. The unfortunate patch sounds to have been from good intentions, just far away from understanding what X is even for]