Hacker News | audidude's comments

In X11 we kept things simple by offering:

* Core protocol drawing (lines, rectangles, arcs, the classics)

* XRender for compositing and alpha

* XShm for shared-memory blits

* GLX if you felt like bringing a GPU to a 2D fight

* XVideo for overlay video paths

* Pixmaps vs Windows, because why have one drawable when you can have two subtly different ones

* And of course, indirect rendering over the network if you enjoy latency as a design constraint


And it worked very well for a remarkably long time, even over dialup if you were patient. It still seems bizarre that the security flaws couldn't be addressed; I never understood the Wayland push, and I still don't.

> I never understood the Wayland push and I still don’t.

What happened is basically this:

- X11 was fairly complex, had a lot of dead-weight code, and a whole bunch of fundamental issues that were building up to the point of warranting an "X12" and breaking the core protocol to get fixed.

- Alongside this, at some point in the relatively distant past, the XFree86 implementation (the most used one at the time, which later begat Xorg) was massively fat and did a huge amount of stuff, including driver-level work - think "PCI in userspace".

- Over the years, more and more of the work moved out of the X server and into e.g. the Linux kernel. Drivers, mode setting, input event handling, higher-level input event handling (libinput). Xorg also got a bit cleaner and modularized, making some of the remaining hard bits available in library form (e.g. xkb).

- With almost everything you need for a windowing system now cleanly factored out and no longer in X, trying out a wholly new windowing system suddenly became a tractable problem. This enabled the original Wayland author to attempt it and submit it for others to appraise.

- A lot of the active X and GUI developers liked what they saw, and it managed to catch on.

- A lot of the people involved were in fact not naive, and did not ignore the classic "should you iterate or start over" conundrum. Wayland had a fairly strong backward compat story very early on in the form of Xwayland, which was created almost immediately, and convinced a lot of people.

In the end this is a very classic software engineering story. Would it have been possible to iterate X11 toward what Wayland is now in-place? Maybe. Not sure. Would the end result look a lot like Wayland today? Probably, the Wayland design is still quite good and modern.

It's a lot like Python 2.x vs. 3.x in the end.


It never worked very well - in the age of bitmaps (and that's not a recent invention) that ultra-dumb pixel pushing simply no longer scales.

We have eons-better ways to transport graphics buffers around (also, what about sound? Remote sound seems reasonable too, but should the display server also be a sound server?), so building it into such a core protocol just doesn't make sense.


Once applications moved local and GPUs became the rendering path X11's network transparency became pure overhead for 99% of users. Wayland fixes this by making shared-memory buffers the core primitive and remote access a separate concern.
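The shared-memory-buffer idea is easy to sketch. The following is purely illustrative, using Python's `multiprocessing.shared_memory` rather than the actual `wl_shm` protocol: client and compositor map the same memory, so only a small handle crosses the socket, never the pixels.

```python
from multiprocessing import shared_memory

# "Client" creates a buffer and draws into it (here: fills it with one gray value).
buf = shared_memory.SharedMemory(create=True, size=640 * 480 * 4)
buf.buf[:] = bytes([0x80]) * buf.size

# "Compositor" attaches to the same buffer by name -- only the name (or, in
# Wayland's case, a file descriptor) travels over the socket, not the pixels.
view = shared_memory.SharedMemory(name=buf.name)
print(view.buf[0])  # → 128: reads the client's pixels without any copy

view.close()
buf.close()
buf.unlink()
```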

Not to mention that the complexity of X11 shoots through the roof once shared buffers come into play.

X11 was OK for its time, but fundamentally it's a really outdated design, built to solve 80s/90s problems over the network in the way you solved them back then.


It is INCREDIBLY outdated and forces all graphics to flow through a crappy 80s era network protocol even when there is no network. It is the recurrent laryngeal nerve of graphics technology.

If Wayland doesn't need network transparency why do Wayland clients and "server" still communicate through a socket though?

Why not use a much simpler command buffer model like on 3D APIs, and only for actually asynchronous commands (but not simple synchronous getters)?

PS: is io_uring using sockets for its "shared memory ringbuffers"? If not, why not? Too much overhead? And if there's notable socket overhead, why is Wayland using sockets (since it has to solve a similar problem as io_uring or 3D APIs)?


It's only using sockets to pass a handle to a buffer though. Sockets are plenty fast, but not moving stuff in the first place can't be beaten.
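That handle-passing trick is a Unix-domain-socket feature (SCM_RIGHTS ancillary data). A minimal sketch in Python via `socket.send_fds`/`recv_fds` (3.9+, Unix only) -- an illustration of the mechanism, not Wayland's actual wire protocol:

```python
import os
import socket
import tempfile

# A Unix socketpair standing in for the client<->compositor connection.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

with tempfile.TemporaryFile() as f:
    # "Client" side: write into a buffer file, then send its descriptor --
    # the bytes themselves never travel over the socket.
    f.write(b"pixels")
    f.flush()
    socket.send_fds(client, [b"buffer"], [f.fileno()])

    # "Compositor" side: receive the fd and read the buffer directly.
    msg, fds, _, _ = socket.recv_fds(server, 1024, 1)
    os.lseek(fds[0], 0, os.SEEK_SET)
    print(os.read(fds[0], 6))  # → b'pixels'
    os.close(fds[0])

client.close()
server.close()
```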

Is it 99% of users? Of the Linux (desktop/laptop) users I know, the majority use X-forwarding over ssh at least occasionally, and the non-Linux (desktop/laptop) users use X-forwarding too (this is in an academic context). So while this may be an improvement for a subset of Linux desktop/laptop users, across the whole Linux user base (excluding both Android, which does not use Wayland, and embedded, which I understand does use Wayland), it's not.

I don't think I have used X-forwarding in the last 10 years except to check whether it's still there. Most of the time it was, but running a browser even on a nearby machine was not a pleasant experience. Running Emacs was less bad, but the only things that actually worked well were probably xlogo and xload.

X-forwarding isn't used nearly enough to justify it having to be used all the time.

...and then you have long time Linux users (like me) who cannot feel any of the benefit of removing that overhead. The only difference I can tell between X and Wayland on my machines is that Wayland doesn't work with some stuff.

I'm pretty sure it simplifies the code a lot.

That doesn't help when the code talking to Wayland becomes much more complicated.

Why would it be more complicated?

Most apps just use GTK or Qt and don't even care about their X or Wayland backends.


This is what I was thinking when I read this. Wouldn't it just be easier to use GTK (or Qt) everywhere? They are already well supported on every other platform and can look very native the last time I checked.

> Wouldn't it just be easier to use GTK (or Qt) everywhere?

Which of those? GTK apps look alien on KDE desktops, and Qt apps look alien on GNOME desktops. Also, if you only need to create a window with a GL or Vulkan canvas, pulling in an entire UI framework dependency is overkill. There's SDL, GLFW, winit etc etc - but those also don't fix the 'native window chrome' problem in all situations and they all have to work around missing Wayland features. The bare window system functionality (managing windows - including window chrome and positioning(!), clipboard, drag'n'drop, ...) should really be part of the OS APIs (like it is on *every other* desktop operating system). Why does desktop Linux have to do its own thing, and worse (in the sense of: more developer hostile) than other desktop operating systems?


Frankly, I don't get your problem or how it is different on any other OS. So your solution to GTK or Qt looking alien is to look alien to everyone? There is no universe where "GTK doesn't look good, I will go with a custom-written Vulkan canvas" is a realistic scenario. Especially when all this has been blown way out of proportion while companies happily wrap their web apps in a browser and ship that as their software.

So again, how is it different elsewhere? What about Windows, where even their own frameworks look alien because they have 3-4 of them? How is that the fault of Wayland somehow?!


> So your solution to GTK or qt looking alien is to look alien to everyone?

No? Where did I write that? I want my window to look and feel consistent with all other Linux desktop applications, and this is mainly achieved by having common window decorations (a problem that had already been solved by any other desktop operating system in the last 50 years).


> a problem that had already been solved by any other desktop operating system in the last 50 years

I just gave you the example of Windows, which by default fails this requirement (see Settings vs. Control Panel, or whatever that is called), let alone once you install applications built with all sorts of different frameworks.


Maybe other OSs solved this, but Windows didn’t - it just kept adding new UI libraries replacing older ones so that old software could still run and look old.

And then there are probably as many, if not more, who notice zero difference at all. And a sizable number of people who notice things that are BETTER, such as actual support for HDR and 10-bit color, per-screen refresh rates, etc.

Not very different from Windows:

- Keeping modern DX and old DirectDraw games running fast on Windows 8 and up was often a clusterfuck.

- XVideo and overlay video on Windows were 100% the same: green glitches while drawing, blue screens on screenshots, and all that.

- Same issue in Windows with pixmaps.

- RDP was fine there; even GNOME adopted it. But I prefer 9front's transparency: you don't need to get everything from the host to use it. With 9front you can grab just the device you want, you can decouple auth from the GUI, and a lot more, even the network and audio devices. Much better than X11's remote setups, VNC, RDP, and whatnot.


9front is nice, but I'm not sure how much you'd get out of it without implementing the rest of Plan 9 in the target OS.

So the state of 2025 then tests a VTE that is from 2023? 4 major releases behind? And through a GTK 3 app, not even a GTK 4 one which will use the GPU?


Likewise I noticed that Konsole was version 23.08. I've just submitted a PR (https://github.com/jquast/ucs-detect/pull/14) to update it to 25.08.


Which one is that about specifically? Maybe the author could fix it.

Compared the results (https://ucs-detect.readthedocs.io/results.html#general-tabul...) with what I use day-to-day (Alacritty), and it seems the results were created with the same version I have locally installed from the Arch/CachyOS repos, namely 0.16.1 (42f49eeb).


They accept PRs on the ucs-detect project for updated test results.


Red Hat announced RISC-V support yesterday with RHEL 10, so this seems rather expected.

https://www.redhat.com/en/blog/red-hat-partners-with-sifive-...


Debian Trixie, now in hard freeze, also has official support for riscv64 [1].

[1] What's new in Debian 13:

https://www.debian.org/releases/trixie/release-notes/whats-n...


As someone who went down this path many years ago, I think the GTK numbers in the article are a bit misleading. You wouldn't create 1000 buttons to do a flamegraph properly in GTK.

Sysprof uses a single widget for the flamegraph, which means that in less than 150 MB resident I can browse recordings in the GB size range. It really comes down to how much data gets symbolized at load time, as the captures themselves are mmap'able. In nominal cases, Sysprof even calculates the symbols and appends them after the capture phase stops so they can be mmap'd too.

That just leaves the augmented n-ary tree key'd by instruction pointer converted to string key, which naturally deduplicates/compresses.
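As a toy illustration of why that deduplicates so well (my own sketch, keyed by function name, whereas Sysprof keys by a string derived from the instruction pointer): stack traces share long common prefixes, so a trie stores each shared prefix exactly once.

```python
def add_trace(root, frames):
    # frames: outermost call first, e.g. ["main", "render", "draw_frame"]
    node = root
    for name in frames:
        node = node.setdefault(name, {})  # shared prefixes reuse existing nodes
    return root

def count_nodes(node):
    # Total nodes in the trie (excluding the root itself).
    return sum(1 + count_nodes(child) for child in node.values())

root = {}
traces = [
    ["main", "render", "draw_frame"],
    ["main", "render", "upload_texture"],
    ["main", "poll_events"],
]
for t in traces:
    add_trace(root, t)

# 8 frames across the 3 traces collapse into 5 trie nodes.
print(count_nodes(root), sum(len(t) for t in traces))  # → 5 8
```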

The biggest chunk of memory consumed is GPU shaders.


This is a bit of a mischaracterization of the Python side of things.

They only opted out for 3.11 which did not yet have the perf-integration fixes anyway. 3.12 uses frame-pointers just fine.


Any link to the fix or documentation about it? I could find added perf support but did not see anything about improved performance related to frame pointer use.


https://pagure.io/fesco/issue/2817#comment-826636 will probably get you started down the relevant paths. Python 3.12 was going to include frame-pointers anyway for perf support, to boot. So they needed to fix this regardless.


I think your viewpoint is valid.

My experience is in performance tuning the other side you mention: cross-application, cross-library, whole-system, daemons, etc. Basically, "the whole OS as it's shipped to users".

For my case, I need the whole system set up correctly before it even starts to be useful. For your case, you only need the specific library or application compiled correctly; the rest of the system is negligible and probably not even used. Who would optimize SIMD routines next to function calls anyway?


It's a disaster no doubt.

But, at least from the GNOME side of things, we've been complaining about it for roughly 15 years and kept getting push-back in the form of "we'll make something better".

Now that we have frame-pointers enabled in Fedora, Ubuntu, Arch, etc we're starting to see movement on realistic alternatives. So in many ways, I think the moral hazard was waiting until 2023 to enable them.


I regularly have users run Sysprof and upload the capture to issues. It's immensely powerful to be able to see what is going on on systems which are having issues. I'd argue it's one of the major reasons GNOME performance has gotten so much better in the recent past.

You can't do that when step one is reinstall another distro and reproduce your problem.

Additionally, the performance-sensitive things that could fall into the 1% overhead range (hint: there aren't many) rarely use the system libraries in a way that would cause this anyway. They can compile that app with frame-pointers disabled. And where they do use system libraries (qsort, bsearch, strlen, etc.), the frame pointer cost is negligible next to the work being performed. Your margin of error is way larger than the theoretical overhead.


1% is a ton. 1% is crazy. Visa owns the world off just a 3% tax on everything else. Brokers make billions off of just 1% or even far less.

1% of all activity is only rational if you get more than 1% of all activity back out from those times and places where it was used.

1%, when it's of everything, is an absolutely stupendous, colossal number that is absolutely crazy to try to treat as trivial.


Better analogy: you're paying 30% to Apple, and over 50% on bad payday loans, and you're worried about the 3% Visa/Stripe overhead... that's kinda crazy. But that's where we are in computer performance: there are 10x, 100x, and even greater inefficiencies everywhere; 1% for better backtraces is nothing.


Absolutely. We've gotten numerous double digit performance improvements across applications, libraries, and system daemons because of frame-pointers in Fedora (and that's just from me).


> Shadow stacks are cool but aren't they limited to a fixed number of entries?

On currently available hardware, yes. But I think some of the future Intel stuff was going to allow for much larger depth.

> Is the memory overhead of lookup tables for very large programs acceptable?

I don't think SFrame is as "dense" as DWARF as a format so you trade a bit of memory size for a much faster unwind experience. But you are definitely right that this adds memory pressure that could otherwise be ignored.

Especially if the anomalies are what they sound like, just account for them statistically. You get a PID for cost accounting in the perf_event frame anyway.


It does cause more memory pressure because the kernel will have to look at the user-space memory for decoding registers.

So yes, it will be faster than the alternatives to frame-pointers, but it still won't be as fast as frame pointers.
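The speed gap is intuitive: with frame pointers, unwinding is just chasing a linked list of saved frame pointers, with no table lookups at all. Purely as an analogy (Python frames are interpreter objects, not machine frames), Python's `f_back` chain walks the same way:

```python
import sys

def walk_stack():
    # Each frame holds a pointer to its caller, just like a saved frame
    # pointer on the machine stack: unwinding is a plain pointer chase.
    frame = sys._getframe()
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def draw():
    return walk_stack()

def render():
    return draw()

print(render())  # e.g. ['walk_stack', 'draw', 'render', '<module>']
```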

