Comment by Fiveplus

4 hours ago

>The goal is, that xfwl4 will offer the same functionality and behavior as xfwm4 does...

I wonder how strictly they interpret behavior here given the architectural divergence?

As an example, focus-stealing prevention. In xfwm4 (and x11 generally), this requires complex heuristics and timestamp checks because x11 clients are powerful and can aggressively grab focus. In wayland, the compositor is the sole arbiter of focus, hence clients can't steal it, they can only request it via xdg-activation. Porting the legacy x11 logic involves the challenge of actually designing a new policy that feels like the old heuristic but operates on wayland's strict authority model.

This leads to my main curiosity regarding the raw responsiveness of xfce. On potato hardware, xfwm4 often feels snappy because it can run as a distinct stacking window manager with the compositor disabled. Wayland, by definition forces compositing. While I am not concerned about rust vs C latency (since smithay compiles to machine code without a GC), I am curious about the mandatory compositing overhead. Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

At least they are honest regarding the reasons, not a wall of text to justify what bails down to "because I like it".

Naturally these kinds of having a language island create some attrition regarding build tooling, integration with existing ecosystem and who is able to contribute to what.

So lets see how it evolves, even with my C bashing, I was a much happier XFCE user than with GNOME and GJS all over the place.

  • You know that all the Wayland primitives, event handling and drawing in gnome-shell are handled in C/native code through Mutter, right ? The JavaScript in gnome-shell is the cherry on top for scripting, similar to C#/Lua (or any GCed language) in game engines, elisp in Emacs, event JS in QtQuick/QML.

    It is not the performance bottleneck people seem to believe.

    • I can dig out the old GNOME tickets and related blog posts...

      Implementation matters, including proper use of JIT/AOT toolchains.

One thing to keep in mind is that composition does not mean you have to do it with vsync, you can just refresh the screen the moment a client tells you the window has new contents.

> ...or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

I think I know what "frame perfect" means, and I'm pretty sure that you've been able to get that for ages on X11... at least with AMD/ATi hardware. Enable (or have your distro enable) the TearFree option, and there you go.

I read somewhere that TearFree is triple buffering, so -if true- it's my (perhaps mistaken) understanding that this adds a frame of latency.

  • > I read somewhere that TearFree is triple buffering, so -if true- it's my (perhaps mistaken) understanding that this adds a frame of latency.

    True triple buffering doesn't add one frame of latency, but since it enforces only whole frames be sent to the display instead of tearing, it can cause partial frames of latency. (It's hard to come up with a well-defined measure of frame latency when tearing is allowed.)

    But there have been many systems that abused the term "triple buffering" to refer to a three-frame queue, which always does add unnecessary latency, making it almost always the wrong choice for interactive systems.

  • only on the primary display. once you had more than one display there were only workarounds.

> Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

I think this is ultimately correct. The compositor will have to render a frame at some point after the VBlank signal, and it will need to render with it the buffers on-screen as of that point, which will be from whatever was last rendered to them.

This can be somewhat alleviated, though. Both KDE and GNOME have been getting progressively more aggressive about "unredirecting" surfaces into hardware accelerated DRM planes in more circumstances. In this situation, the unredirected planes will not suffer compositing latency, as their buffers will be scanned out by the GPU at scanout time with the rest of the composited result. In modern Wayland, this is accomplished via both underlays and overlays.

There is also a slight penalty to the latency of mouse cursor movement that is imparted by using atomic DRM commits. Since using atomic DRM is very common in modern Wayland, it is normal for the cursor to have at least a fraction of a frame of added latency (depending on many factors.)

I'm of two minds about this. One, obviously it's sad. The old hardware worked perfectly and never had latency issues like this. Could it be possible to implement Wayland without full compositing? Maybe, actually. But I don't expect anyone to try, because let's face it, people have simply accepted that we now live with slightly more latency on the desktop. But then again, "old" hardware is now hardware that can more often than not, handle high refresh rates pretty well on desktop. An on-average increase of half a frame of latency is pretty bad with 60 Hz: it's, what, 8.3ms? But half a frame at 144 Hz is much less at somewhere around 3.5ms of added latency, which I think is more acceptable. Combined with aggressive underlay/overlay usage and dynamic triple buffering, I think this makes the compositing experience an acceptable tradeoff.

What about computers that really can't handle something like 144 Hz or higher output? Well, tough call. I mean, I have some fairly old computers that can definitely handle at least 100 Hz very well on desktop. I'm talking Pentium 4 machines with old GeForce cards. Linux is certainly happy to go older (though the baseline has been inching up there; I think you need at least Pentium now?) but I do think there is a point where you cross a line where asking for things to work well is just too much. At that point, it's not a matter of asking developers to not waste resources for no reason, but asking them to optimize not just for reasonably recent machines but also to optimize for machines from 30 years ago. At a certain point it does feel like we have to let it go, not because the computers are necessarily completely obsolete, but because the range of machines to support is too wide.

Obviously, though, simply going for higher refresh rates can't fix everything. Plenty of laptops have screens that can't go above 60 Hz, and they are forever stuck with a few extra milliseconds of latency when using a compositor. It is unideal, but what are you going to do? Compositors offer many advantages, it seems straightforward to design for a future where they are always on.

  • Love your post. So, don’t take this as disagreement.

    I’m always a little bewildered by frame rate discussions. Yes, I understand that more is better, but for non-gaming apps (e.g. “productivity” apps), do we really need much more than 60 Hz? Yes, you can get smoother fast scrolling with higher frame rate at 120 Hz or more, but how many people were complaining about that over the last decade?

    • Essentially, the only reason to go over 60 Hz for desktop is for a better "feel" and for lower latency. Compositing latency is mainly centered around frames, so the most obvious and simplest way to lower that latency is to shorten how long a frame is, hence higher frame rates.

      However, I do think that high refresh rates feel very nice to use even if they are not strictly necessary. I consider it a nice luxury.

      1 reply →

    • If our mouse cursors are going to have half a frame of latency, I guess we will need 60Hz or 120Hz desktops, or whatever.

      I dunno. It does seem a bit odd, because who was thinking about the framerates of, like, desktops running productivity software, for the last couple decades? I guess I assumed this would never be a problem.

      2 replies →

    • I agree. Keyboard-action-to-result-on-screen latency is much more important, and we are typically way above 17 ms for that.

      1 reply →

> Can the compositor replicate the input-to-pixel latency of uncomposited x11 on low-end devices or is that a class of performance we just have to sacrifice for the frame-perfect rendering of wayland?

well, the answer is just no, wayland has been consistently slower than X11 and nothing running on top can't really go around that