Why ACPI? (mjg59.dreamwidth.org)
266 points by ingve on Nov 1, 2023 | 96 comments


I find it kind of amusing that the dynamic configuration problem of hardware is so tough and think about the old mainframe and minicomputer OS of the 1970s which avoided all that by starting out with some configuration that supported one terminal and limited storage devices and would recompile the OS for the exact hardware configuration of the machine and print it to a paper tape or magnetic tape and they'd boot off that. Thus you had a "systems programmer" at every mainframe installation.

That part of the industry got into dynamic configuration to support hot plugging and generally being able to change the hardware configuration without causing downtime.


> which avoided all that by starting out with some configuration that supported one terminal and limited storage devices and would recompile the OS for the exact hardware configuration of the machine and print it to a paper tape or magnetic tape and they'd boot off that.

Not even. The OEM-shipped machine-specific bootstrap tape (i.e. the one that "supported one terminal and limited storage devices") was still used for initial system bringup, even after you recompiled your userland software distribution of choice for your computer's weird ISA and wrote it out to a tape. The OEM-shipped machine-specific bootstrap tape got mounted on /; brought up the system just enough to mount other devices; and then the userland software distribution got mounted to /usr.

(Back then, you wouldn't want to keep running things from the machine-specific bootstrap tape — the binaries were mostly very cut-down versions, of the kind that you could punch in from panel toggle switches in a pinch. You couldn't unspool the tape, because the kernel and some early daemons were still executing from the tape; but you wouldn't want anything new to execute from there. Thus $PATH. In /usr/bin you'd find a better shell; a better ls(1); and even what you'd think of today as rather "low-level" subsystems — things like init(8). $PATH was /usr/bin:/bin because "once you have /usr/bin, you don't want to be invoking any of those hyperminimal /bin versions of anything any more; you want the nice convenient /usr/bin ones.")


Ah back when the whole supply chain had a single manufacturer and no one worried about whether someone might want to put in - say - two video cards or the like.

Apple still kind of exists in this space.


Ironically, Apple implemented dynamic hardware configuration long before it was a standard feature in PC platforms.

I was tempted to jump on the "two video cards" example, but the original IBM PC could support both a CGA (for color) and MDA (monochrome, sharper text) in the same host. I never did that myself, but every card I did use required you to flip switches or jumpers on each ISA board to configure its interrupts and memory address of its I/O ports.

Apple adopted NuBus for its Macintosh expansion platform. Boards were plug and play, automatically configured. Of course, the hardware required on the NuBus card to support this functionality was the better part of a whole separate Mac in its own right; the hardware dev kit cost $40,000.

Two video cards in a Mac just worked.

(Of course, I took your comment to refer to hardware less than 20 years old. But even now, there's dynamic hardware. Apple loved Thunderbolt because they wanted external expansion cards over a wire.)


Heh. Having a nostalgic moment remembering all the hours spent finding an equilibrium where all the devices in the machine could operate with the one true combination of IRQ and DMA jumpers set.

https://www.philscomputerlab.com/uploads/3/7/2/3/37231621/ju...


Wasn't like that at all with DEC, and I don't think so with IBM mainframes either.

It was common for DEC systems to have custom Unibus cards

https://en.wikipedia.org/wiki/Unibus

as these were really easy to make. They dealt with them by building custom drivers right into the OS when they built an OS image.

Circa 2002 a friend of mine developed custom printer interfaces and drivers for IBM z because IBM's printer interface couldn't support the high rate of printing that New York state needed to satisfy the mandate that any paperwork could be turned around in 2 days or less.

Whatever you say about NY it is impressive how fast you get back your tax returns, driver license, permit to stock triploid grass carp or anything routine like that.


But it also meant that the release of a new computer often required a new OS release; with DEC, often patch releases that added just enough code to run the devices included in the new computer, because older versions would at best boot into something unusable.

As for IBM mainframes, the list of devices directly attachable at OS level is quite small, and even then an application with appropriate privileges could directly send control words to a channel. That said, things like printers would probably be intermediated by communication controllers translating from the channel interface to serial lines.


The PC is really incredibly unique as a computing platform in how open to third-party extension and customization it ended up becoming (even though it was definitely not IBM's intention!) This has mostly been very good for the consumer, but the combinatorial explosion of different hardware combinations was for a long time a compatibility nightmare, and to some extent still is.


I would like to offer a prophecy: For the next evolution of ACPI, Linux kernel devs (employed at hardware companies) will figure out a way to replace the bespoke bytecode with eBPF.

Windows will, of course, spawn a WSL instance when it needs to interact with ACPI. macOS is its own hardware platform, and will naturally come up with their own separate replacement.


There already is an eBPF for Windows; it's even Microsoft's own project: https://github.com/microsoft/ebpf-for-windows


Ahh. Just as the prophecy foretold.


Unlikely. ACPI is made by Wintel vendors, so Windows will get support for the fancy new things and Linux will lag behind until the new thing is documented or reverse engineered.


ACPI is standardized via a specification. It's quite easy for non-Windows operating systems to support ACPI. I can't say the same for device tree, as that requires reading Linux source.


Lots of things are available in a specification. HTML, for instance. Just being an open specification is insufficient when there's a superdominant implementation. At that point, that implementation is the specification.

In HTML, it was Internet Explorer for a long time, now it's Chrome/Chromium.

In ACPI, it's Windows. In fact, Linux pretends to be Windows because anything else is a shipwreck graveyard of disappointment and untested code.

https://askubuntu.com/questions/28848/what-does-the-kernel-b...

Also, see the whole necessity of patching your DSDT (Windows users: "The what?" Linux users: nodding sadly) https://wiki.archlinux.org/title/DSDT

Modern computers are sufficiently complicated that they really only support one OS. And of course, those are almost entirely Windows computers. Buy a computer with Linux preinstalled, with support, if you want to avoid having to care about things like this (or having a computer that never works quite right (e.g. doesn't reliably suspend or the fans are running wrong)).


The situation with HTML was worse in 2000 than it is today.

Early on Netscape introduced its own non-standard behavior for broken HTML (tags not properly closed.) Somewhere between 30-60% of HTML was broken so any competitive browser had to (i) render broken HTML and (ii) render broken HTML in the same undocumented way as Netscape!

Microsoft figured this out with IE but it was one barrier in the way to alternative browsers. This undocumented behavior was finally documented in the HTML 5 spec.

Now you might say the "whole stack" has the Chrome problem in that Chrome has some features that (some) other browsers don't have such as

https://caniuse.com/css-cascade-scope

https://caniuse.com/view-transitions

https://caniuse.com/css-text-wrap-balance

but a lot of those features are in the "cherry on top" category and there is a fairly healthy process of documenting how they work and the features proliferating to other browsers except when Apple doesn't want them to. (Gotta keep developers in that app store.)


Only because Web has effectively turned into ChromeOS with one vendor left standing.


It's not a perfect parallel. For instance, the tooling used to create HTML isn't almost universally provided by one vendor (Microsoft) and then run by the same vendor (still Microsoft). It's also not like we ever caught the CEO of that company speculating on how to lock out competitors using that same technology. (OK, that's true of both HTML and ACPI.)


That specification is a monster. "quite easy to support" is not a good description of the situation.


By comparison it is easier. The alternative is that you need to read perhaps an order of magnitude more lines of GPL source code in the Linux kernel to write your OS, which may not be an option.


My understanding from ~10 years ago is:

There is a specification

Windows ACPI implementation is buggy

Hardware manufacturers implement Microsoft's implementation bug for bug

Everyone has to reverse engineer Microsoft's implementation because the standard isn't enough


From my 20-year-old memory it was the other way around:

There is a specification.

Taiwanese hardware OEMs suck at programming, make mistakes.

Windows' ACPI implementation is built to detect and work around those bugs. Think of the Win 3.x version of SimCity's read-after-free bug and Microsoft hardcoding a workaround in Windows 95: https://www.joelonsoftware.com/2000/05/24/strategy-letter-ii...

>Windows 95? No problem. Nice new 32 bit API, but it still ran old 16 bit software perfectly. Microsoft obsessed about this, spending a big chunk of change testing every old program they could find with Windows 95. Jon Ross, who wrote the original version of SimCity for Windows 3.x, told me that he accidentally left a bug in SimCity where he read memory that he had just freed. Yep. It worked fine on Windows 3.x, because the memory never went anywhere. Here’s the amazing part: On beta versions of Windows 95, SimCity wasn’t working in testing. Microsoft tracked down the bug and added specific code to Windows 95 that looks for SimCity. If it finds SimCity running, it runs the memory allocator in a special mode that doesn’t free memory right away. That’s the kind of obsession with backward compatibility that made people willing to upgrade to Windows 95.

Everyone else stumbles on those badly implemented ACPI Tables which seemingly work just fine in Windows land.


FreeBSD in 2023 is still masquerading as "Microsoft Windows NT" in order for things to work correctly. It has been this way since 2004. It works fine in "Windows Land" because the hardware is literally special-casing behavior for Windows!

  /*
   * OS name, used for the _OS object. The _OS object is essentially obsolete,
   * but there is a large base of ASL/AML code in existing machines that check
   * for the string below. The use of this string usually guarantees that
   * the ASL will execute down the most tested code path. Also, there is some
   * code that will not execute the _OSI method unless _OS matches the string
   * below. Therefore, change this string at your own risk.
   */
  #define ACPI_OS_NAME                    "Microsoft Windows NT"
https://cgit.freebsd.org/src/tree/sys/contrib/dev/acpica/inc...


_OS is basically irrelevant, _OSI has been used for over 20 years now. The right way to think about the values the OS presents to the firmware is as a contract between the OS and the firmware about mutual expectations. Windows effectively embodies a contract - the behaviour of any given Windows release will remain constant. There's no way to define an equivalent definition for Linux (because Linux's behaviour keeps changing to match hardware expectations), so it makes more sense for us to attempt to mimic the stable contract Windows provides (and it helps that that's the contract that the vendor has tested against in the first place). I went into this oh good lord over 15 years ago: https://mjg59.livejournal.com/96129.html


Your blog post is very insightful, thanks!


I feel like that outcome is inevitable. All implementations have bugs, and developers implement for reality rather than for a spec. Inevitably it leads to drift and the need to retain things that weren't right to begin with.

See also, the referer header. :)


> I feel like that outcome is inevitable. All implementations have bugs, and developers implement for reality rather than for a spec.

It depends. If the spec is clear then developers will generally implement the spec. If the spec is a mess then it becomes easier to just do what works on the popular implementations. Things like spec conformance suites, or even just writing up the spec well, can move the needle.


The lack of support usually comes from the other side: the h/w vendors aren't testing with anything but Windows.

I have yet to see a Linux laptop where ACPI wasn't broken for at least one device (the most likely suspects are the components that aren't typically used in servers, such as webcams or wifi modems).


> macOS is its own hardware platform, and will naturally come up with their own separate replacement.

Actually, no. The M-series SoCs use device trees [1], and in fact their Apple SoC predecessors did just as well - the earliest I could find is the iPhone 3GS [2].

[1] https://lore.kernel.org/lkml/20230202-asahi-t8112-dt-v1-0-cb...

[2] https://www.theiphonewiki.com/wiki/DeviceTree


They're very device tree oriented. They've been using them since "new world PowerPC" Macs in the 90s. Even on x86, their boot loader constructs a device tree to describe the hardware to the kernel.


They have no incentive to use or benefit from ACPI. They don't have the problem of trying to scale to an innumerable number of hardware permutations. They have a limited set which they control the entire stack of. I would certainly be very confused if they went with such an overkill solution as well.


This appears logical, but the reality is that the only reason you can't immediately run macOS on a generic x64 computer is that it doesn't contain the licensing chip.

If you patch out that requirement (using a Hackintosh installation) and convert the ACPI tables into the format used by Apple, it runs just fine, as far as drivers are available for your hardware.


Sure, but they don't care about supporting that use pattern. If anything having Hackintoshes break is a bonus to them.


BPF doesn't really make sense here. It can't fully specify the kinds of computation an AML replacement would need since BPF is guaranteed to terminate (it's not Turing complete).


For this use case (hardware configuration), that might actually be desirable?


I don't think you'll get uptake removing constructs like

    while(*STATUS_REG & STATUS_BSY);
since AML is less a hardware description format and more a driver binary format.
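
(A hedged sketch of what a termination-friendly version of that construct would have to become; STATUS_BSY and the status register pointer are hypothetical placeholders for some device's memory-mapped status register, not anything from the AML spec.)

    #include <stdbool.h>
    #include <stdint.h>

    #define STATUS_BSY 0x80u   /* hypothetical "busy" bit */

    /* Bounded polling loop: a verifier that demands provable termination
     * (as eBPF's does) can accept this, unlike the open-ended
     * while(*STATUS_REG & STATUS_BSY); form above. */
    static bool wait_not_busy(volatile uint32_t *status_reg, unsigned max_polls)
    {
        for (unsigned i = 0; i < max_polls; i++) {
            if (!(*status_reg & STATUS_BSY))
                return true;    /* device reports ready */
        }
        return false;           /* timed out; caller must handle the failure */
    }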


macOS already uses (at least on ARM chips) device trees. I don’t see why they would go back to ACPI as long as they keep their SoC model.


Why bother with bespoke bytecode when we have a high quality, standard ISA?

RISC-V's base RV64I has 47 instructions. Legacy ISAs can simply emulate these 47 instructions.


Bytecode is presumably chosen to minimize program length, while RISC-V is at the opposite end of verbosity for representing a program.

You may be one of those who believe that RISC-V is a high-quality ISA, but this is not a universal opinion, and it is a belief that is typically shared only by those who have been exposed to few or no other ISAs.

In the context of something like ACPI, I would be worried by the use of something like the RISC-V ISA, because this is an ISA very inappropriate for writing safe programs. Even something as trivial as checking for integer overflow is extremely complicated in comparison with any truly high-quality ISA.
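
(To make that point concrete with a hedged C sketch: on an ISA with condition flags, the checked form below can compile down to an add followed by a branch on the overflow/carry flag, while on base RV64I the compiler has to synthesize the check with an extra comparison. __builtin_add_overflow is a GCC/Clang builtin; everything else here is purely illustrative.)

    /* Checked unsigned 64-bit addition. On flag-based ISAs this can become
     * "add; branch-if-carry"; on base RISC-V it needs a separate comparison. */
    #include <stdbool.h>
    #include <stdint.h>

    static bool add_u64_checked(uint64_t a, uint64_t b, uint64_t *out)
    {
        return !__builtin_add_overflow(a, b, out);   /* true if no overflow */
    }

    /* Flag-free equivalent, roughly what a RISC-V backend emits: unsigned
     * overflow happened iff the truncated sum is smaller than an operand. */
    static bool add_u64_checked_portable(uint64_t a, uint64_t b, uint64_t *out)
    {
        *out = a + b;
        return *out >= a;
    }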


For example, the Open Firmware specification used a flavour of Forth for its bytecode.


How patronising. Can you give an example of an ISA that is higher quality than RISC-V?


There is no doubt that Aarch64 is a much higher quality ISA than RISC-V. The good or bad licensing model used for an ISA has nothing to do with the technical quality of an ISA.

Even the Intel/AMD ISA, despite its horrible encoding, after excluding many hundreds of obsolete instructions that are not needed any more and after excluding from the instruction encoding the prefix bytes that are needed only for backward compatibility, would be a higher quality ISA than RISC-V, especially for expressing any task that is computationally intensive. RISC-V is particularly bad for expressing computations with big integers.

The modern subset of the Intel/AMD ISA is better than RISC-V even from the point of view of some of the design criteria on which RISC-V is based.

For instance, the designers of RISC-V have omitted many useful features under the pretext that for good performance those features would require additional read and write ports to the register file.

The Intel/AMD ISA, by allowing one of the three operands of an instruction to be a memory operand, allows identical performance with an ISA with 3 register operands, while having one less read port and one less write port in the register file.

Having instructions with one memory operand works well only in CPUs with out-of-order execution. Nevertheless, the same performance as for an ISA with a memory operand can be achieved in a low-cost in-order CPU if there are a set of explicitly addressable load buffer registers, and one or more operands of an instruction can be a load buffer, instead of a general-purpose register.

So there would have been many better ways of accomplishing the design criteria of RISC-V. RISC-V is a simple ISA that is great for teaching, but its only advantage over any ISA that a competent designer could conceive in a few days is the large amount of already existing software tools, i.e. compilers, linkers, debuggers and so on, which would take years to duplicate for any brand-new ISA.


I’ve heard really good things about SuperH (sh-2, sh-4 etc.), designed by Hitachi for their processors including those used in the Sega Saturn and Sega Dreamcast. Really high code density was one big thing, but it was covered by patents until recently. There was a group trying to make a modern open source processor in VHDL for FPGAs and later on ASICs based on it, but I think it may have mostly fizzled out.


I do Dreamcast homebrew, and my opinion is the SH ISA is way over-optimized for 90's embedded applications. It wastes the tiny instruction space on useless things like read-modify-write bit operations, intended for setting hardware registers, and has a laughably small displacement for function calls (+/-2KB, with pretty much all calls requiring manually loading a function pointer from RAM). There are parts that are still nice compared to something like RISC-V, like post-increment loads and pre-decrement stores (but hardware designers seem to hate things like that, since they require an extra write port on the register file), and the code density can be pretty good (although GCC's output is awful), but there are so many ways the ISA could be easily improved.


(sorry for being off-topic) regarding https://news.ycombinator.com/item?id=16308439, I wonder how is this project going?


I haven't done much on it since then. While it's something I would like to finish, I realized that there are other, higher priority things that I should do first.


WDC 65C02, or even WDC 65C816.

Or how about MMIX?


Well, first you need to define your criteria for "high quality".


The parent (adrian_b) has to, if anyone.

But as far as I can see, they simply have a (misguided) preference for CISC.

They're very vocal against load/store architectures, and they don't seem to understand the tradeoffs RISC-V does make.

They don't even seem to get that RISC-V has had the highest code density among 64-bit ISAs from the start (first ratified user spec, 2019), and now has the highest among 32-bit ISAs too (as of the recent Zc ratification).


What makes you think RISC-V is a good fit for device configuration?


Simplicity, lack of flags or arithmetic exceptions, as well as a clear ABI and environment call mechanism.

Hardware access could be clearly gated through ecalls, and the mechanisms for this could exist as a standard extension in the SBI interface.


I worked on the Windows kernel team and my first real projects were the ACPI 1 and 2 implementations. It's been a while, and ACPI was well on its way when I got there; the story at the time was that huge gaps in the BIOS were a problem and we needed to move it into the kernel. There was also a big push from the industry at the same time to use EFI to allow devices to have a pre-OS experience (e.g. play DVDs) and not be dependent upon Windows for those.

Another memory I have from that time was that power management was a big priority. So I suspect the ability for the OS to do that via ACPI was strategic; I wasn't involved in the decision making.


This is all absolutely true, but it's not really an argument for or against ACPI or DTS or OF or any of that stuff. They're all sort of messy, but frankly all aimed at solving the wrong problem.

The root cause to every single problem in this space is simple: the OS and the firmware need to coordinate, period. That's a software problem. It's a complicated software problem. And it has to be solved with software engineering, with stuff like interface specifications and division of responsibilities and test suite and conformance tests and all the other boring stuff.

And ACPI is, sure, a reasonable language for expressing most of that work. But... no one does the work![1] Vendors make windows run and then ship it, and will occasionally fix bugs reported by critical customers. And that's all you get. Linux comes along later (even new versions of Windows often have crazy workarounds as I understand it) and needs to piece together whatever the firmware is doing ad hoc, because none of those vendors are going to bother with anyone that isn't Microsoft.

[1] In the PC world. In the world of integrated hardware things are much better. You can rely on your Mac or your Chromebook or your Galaxy S412 not to have these kinds of bugs, not because they use better firmware interface languages (arguably they don't -- I've seen DTS trees for ARM chromebooks!), but because they treat this as a software problem and have the firmware and OS teams work with each other.


The reality is that for highly integrated devices, you just ship a bunch of hacks, sometimes because you forgot to follow the spec and it was faster to patch a line in the kernel than to patch the firmware (hello, Intel Mac broken ACPI tables!). A kernel driver for a phone component might have a hardcoded set of quirks selected by a string from the device tree.

In the world of PCs, the reason Linux emulates Windows in terms of ACPI is that Microsoft is not only a big vendor: all those "Designed for Windows" labels on computers actually required passing test suites, etc. Microsoft also publishes add-on specifications for things that are underspecced in ACPI. For example, the ACPI spec does specify how to make an interface for backlight control, but it does not tell you the ranges that the OS and said interface have to support. Microsoft provides such a description: for example, if the OS responds to _OSI(Windows2003) then the supported range will be 0-5 (purely imagined example), but if it also responds to _OSI(Windows2007) then the supported values can be 0-15, etc.

This is also why the firmware situation on ARM is so shitty: vendors aren't forced to do the work, so they don't. With Windows, the vendor is external and it's pretty hard to get away with not implementing things right (one example is Qualcomm fucking up Windows-on-ARM interfaces somewhat impressively and fixing it with injected drivers).


> The reality is that for highly integrated devices, you just ship a bunch of hacks,

That's true, but only in the specious sense that all integrated software is "a bunch of hacks". Fixing glitches due to misunderstandings between an API consumer and an API provider is something we all do, every day. And the essence of the problem is no different whether the technology is a Javascript app framework or a NVMe firmware driver.

I mean, sure, it's better to make sure that the NVMe firmware driver (picking on that because it was an example in the linked article) and the OS have a clean specification for interaction. And sometimes they do! But it's likewise important that everyone write appropriately idiomatic React code too, and we don't demand[1] a whole new technology idiom to prevent front end developers from doing dumb stuff.

The solution is about engineering practice and not technology, basically. ACPI isn't going to solve that, for the reason that ACPI didn't create the problem. It's not ACPI's problem to solve.

[1] Well, people kinda do, actually. Abstraction disease is another problem.


The problem with the PC world is that the firmware and OS teams are not only working for different companies, they're working on different timescales and release cadences. Android devices and Macs are in an entirely different situation here, so the only really comparable example is the Chromebook - and that's a market where if Google doesn't want to support your hardware you don't get to ship it.


The point isn't that they're comparable, obviously they aren't. It's that the techniques used to solve the problem (synchronize "timescales and release cadences" in your example) are engineering management things and not technologies.

It's true that "Linux vs. Lenovo" doesn't have the same engineering practice tools available. But... evangelizing technological tools like "ACPI" isn't a path to a solution, because... those tools just don't work.

Or rather they don't work in the absence of someone addressing the engineering practice problem. They work fine to do what they're designed to do. But what they're designed to do emphatically isn't "Make Lenovo Hardware Work With Linux".


My point is that focusing on tight integration between OS teams and the underlying platform is a great way to develop a device that only runs one operating system and a terrible way to develop a device that's supposed to be part of an open ecosystem. ACPI is the least bad way we currently have to solve the latter problem. It doesn't guarantee that a Lenovo will work with Linux, but in the absence of an explicit development program it gives it a fighting chance.


ACPI is only the "least bad way" because vendors don't release sufficient information to do anything else. Microsoft and Intel just specify progressively more convoluted nonsense, require OEMs to pass spec or they can't buy chips or Windows licenses, and nothing ever gets any better until some dedicated hobbyist reverse-engineers whatever ridiculous software most recently won someone a promotion. You don't need tight integration between hardware and OS teams, you just need any transparency from the hardware side, and we don't get that any more.

The least bad way would be for the people building the computers to document them sufficiently well for software engineers to support them, but part of the PC market mass hallucination is that every OEM's particular way of doing exactly the same shit is some magic secret recipe, and letting the competition see the herbs and spices would be the end of the world. Meanwhile the entire market segment derives from the original IBM PC being designed by adults with sufficient discipline that the resulting product constituted a platform. All we've seen since is various scavengers fighting ridiculous turf wars over the ruins of interoperability.

It's an industry-wide case of learned helplessness. It's a morass that is holding back innovation, and "at least the septic tank is only waist deep" isn't really much of a contribution toward solving anything. I wish there were for personal computers what Oxide is for datacenter engineering, but wishing isn't much of a contribution either.


Yes, the world would be better with better documentation. But that doesn't solve the "How do I ship a new platform that works with existing operating systems" problem. ACPI provides an abstraction that allows some amount of divergent platform behaviour while retaining compatibility with existing operating systems. Even if all vendors released docs (which isn't terribly realistic), we'd still end up having to write new drivers for them, getting those merged upstream, and then deal with the lag between that and Linux distributions shipping them to end users.


> And ACPI is, sure, a reasonable language for expressing most of that work. But... no one does the work![1]

Maybe. Or maybe the "work doesn't get done" in part because that interface language is simultaneously overengineered and underspecified, and people who start out with the best of intentions end up drowning under a pile of incomprehensible ACPI docs and copy and pasting whatever junk seems to make Windows handle it ok because that's the only way out of the nightmare.


I remember when I built a new PC, with some socket 370 pentium or another, around 2002.

I ran an I2C probe to find the addresses to read the fan tachometer. The scan wrote some bits that irrecoverably messed up the firmware; the board wouldn't boot and I replaced it.


Asus must have anticipated trouble, because their boards from that time period hide the I2C bus away by default :) You have to do a special port-knocking incantation to expose it: https://www.vogons.org/viewtopic.php?p=1173247#p1173247

>They share i2c bus between clock generator/monitoring chip and ram SPD using a mux. If you switch this mux and forget to switch it back computer will have trouble rebooting because bios blindly assumes its always set to clock gen.


There are three ACPI stacks: the reference Intel one, which is what Linux, macOS and FreeBSD use; the Microsoft one; and finally, those madlads over at the OpenBSD project have their own. Good for them.



And yet somehow APM-based systems broke a lot less often. If the only codepath that's ever going to be tested is "what windows does", maybe having a cruder API that doesn't expose so many details of what's happening (and instead just lets the BIOS opaquely do its thing) is the way to go?


Pretty much only laptops had APM at first. Hardware didn't change much on laptops. I still had to unplug the ethernet cable from my PCMCIA card every now and then to get it to sense the link, but it wasn't that bad once I got all the right linux patches.

Then power management moved into desktops and servers that had expansion slots and everything became horrible.


The APM was also tested in practice only with DOS and/or Windows (and not all Windows, which could also be an issue).

And yes, it really didn't work well with dynamically attachable devices that might have important state.


> The APM was also tested in practice only with DOS and/or Windows

Right. I'm speculating that since it had only a small number of entry points, Linux tended to end up following the tested codepath. Certainly it broke a lot less on Linux than ACPI does.


Because, as a much smaller specification that cared about less stuff, it only broke within that smaller scope. The things that were cleaned up with ACPI were instead broken in other components or individual drivers, or simply didn't work for lack of special drivers with hardcoded data that ACPI now presents in a common way.

There's quite a lot of functionality these days that is handled with generic drivers thanks to ACPI but previously required a custom driver for each device (something that OpenFirmware and device tree don't fix, as they tend to be slightly less expressive: just the information "this device is here", maybe with a few attributes, especially if you have only a DT and no actual OpenFirmware runtime).


How much does it actually improve things? E.g. if it's about having the driver understand the state of the device more closely when coming back from suspend, then I can see how theoretically that might be better; and I can also see how a cruder model with a less detailed interface might end up with higher reliability in practice (at the cost of being notionally slower to wake up, doing more reinitialisation, etc., but that might still end up being a good tradeoff).


I still cannot understand your problem with Device Trees after reading your article. I used to write an ARMv8 and an x86 kernel and found that ACPI and Device Trees had the same capabilities, but with fewer headaches with DT.


I run NetBSD on several ARMv8 boards. One is ACPI only; all the rest use DeviceTree. It is basically impossible to add any extra functionality to the ACPI-only one, and no problem doing this on the others.


Where do the device trees come from?


You may find https://en.wikipedia.org/wiki/Devicetree useful.

It discusses this, alongside an interesting history and the current state.


It just doesn't mention the actually interesting stuff. Like, take PCI, for example: there is a way [0] to enumerate all the devices, and it also supports PCI-to-PCI bridges. Nice! And I also understand where the information comes from: introspection. With the device tree, the info apparently (judging from the sibling comments) comes from the vendor, who baked it into whatever storage, and apparently it's static.

[0] https://en.wikipedia.org/wiki/PCI_configuration_space#Bus_en...
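
(For context, a hedged C sketch of the enumeration mechanism [0] describes: the classic x86 "mechanism #1", where you write a bus/device/function/register address to port 0xCF8 and read the result from port 0xCFC. outl/inl here stand for the usual platform-provided port I/O primitives, and running this requires ring-0 or iopl-style privileges.)

    #include <stdint.h>

    #define PCI_CONFIG_ADDRESS 0xCF8
    #define PCI_CONFIG_DATA    0xCFC

    /* Platform-provided port I/O primitives (e.g. <sys/io.h> style on x86). */
    extern void outl(uint32_t value, uint16_t port);
    extern uint32_t inl(uint16_t port);

    static uint32_t pci_config_read32(uint8_t bus, uint8_t dev,
                                      uint8_t func, uint8_t offset)
    {
        uint32_t address = (1u << 31)             /* enable bit */
                         | ((uint32_t)bus  << 16)
                         | ((uint32_t)dev  << 11)
                         | ((uint32_t)func << 8)
                         | (offset & 0xFCu);      /* dword-aligned register */
        outl(address, PCI_CONFIG_ADDRESS);
        return inl(PCI_CONFIG_DATA);
    }

    /* A function exists at bus/dev/func if its vendor ID isn't 0xFFFF. */
    static int pci_device_present(uint8_t bus, uint8_t dev, uint8_t func)
    {
        return (pci_config_read32(bus, dev, func, 0x00) & 0xFFFF) != 0xFFFF;
    }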


The same place ACPI tables come from. A flash chip on the motherboard.


That was the original dream for DT that so far hasn't been very successful.

In practice the kernel only wants to add support for DTs that conform to a reviewed and approved schema, but most (if not all) manufacturers don't want to put in the effort of having every single thing in their DT schema bikeshedded to death on the device tree mailing list before the device is manufactured. So the vendor just makes up whatever unapproved DT schema only their own vendor kernel fork supports, and if mainline support is ever added it will have its own DT.


So I can't change the system's composition? Or do the ACPI/DT only show me the controller chips soldered to the motherboard, but not the devices that connected to them?


Yes, DT is mainly meant to describe non-discoverable hw blocks inside the SoC and the chips soldered to the motherboard. Discoverable devices like PCIe cards or USB devices aren't added to the DT (except under very rare circumstances).

Nowadays there is also an overlay mechanism that is used for example on the Raspberry Pi expansion hats, so DT is not always limited to soldered-on things.
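
(A hedged sketch, assuming libfdt, of how a kernel or bootloader reads one of those non-discoverable devices out of a flattened device tree blob. The node path "/soc/i2c@12340000" is a made-up example; fdt_path_offset/fdt_getprop are the standard libfdt accessors.)

    #include <libfdt.h>
    #include <stdio.h>

    static void describe_i2c_node(const void *fdt_blob)
    {
        /* Look up the (hypothetical) soldered-on I2C controller node. */
        int node = fdt_path_offset(fdt_blob, "/soc/i2c@12340000");
        if (node < 0) {
            printf("node not found: %s\n", fdt_strerror(node));
            return;
        }

        int len = 0;
        const char *compat = fdt_getprop(fdt_blob, node, "compatible", &len);
        if (compat)
            /* "compatible" is the string the kernel matches drivers against */
            printf("compatible = %s\n", compat);
    }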


On PC platforms, ACPI tables are some mix of static and dynamic data. There's a table for CPU information, and you don't need to flash your firmware every time you change your CPU, but you do sometimes. PC firmware generally does a PCI scan before booting an OS, and I think some of that ends up in ACPI tables too.


Usually they are shipped with the Linux kernel rather than mobo flash.


How does the kernel know which of the many possible device trees describes my hardware? Or does the system come with a custom-compiled, pre-installed Linux kernel, with the one correct DT baked into it by the manufacturer?


ARM devices almost always need custom kernels. Even if you had a device tree provided to you, there isn't a lot of confidence that the kernel will be able to perfectly parse it or have all the necessary drivers. This isn't an ARM-specific issue but rather an ecosystem issue, as no one drives standardization.


U-Boot or something else tells the kernel which DT to use, and gives its own DT to the kernel. If Linux has its own DT for the hardware, it will use that. Otherwise it falls back to what the bootloader provided.


I was expecting to read “… a stork” lol


The only reason I know ACPI exists is that every Linux laptop I ever had spat out a roll of error messages related to ACPI (and usually no support).

My understanding is that on top of the inherent problems outlined in the article, there's a more trivial problem of vendors not caring enough to do this right. So, typically for Linux laptops, hibernation and many other forms of power-saving either don't work at all or are broken (e.g. a laptop never wakes up from hibernation, or just the screen never wakes up, etc.)


These days most of these failures are bugs in the Linux drivers, not bugs in the firmware. The Lenovo case I mention in the article is actually unusual in that respect.


I'm only slightly familiar with the specific features ACPI provides, but isn't the solution the following?

For every "feature" provided by the SMM or BIOS: export a UUID (e.g. "NVMe resume implementation 1"), have that feature expose an enable and a disable function, and have each feature declare a dependency on each I/O range / firmware device it needs access to.

If the kernel knows how to implement the feature, it can just disable it, and then, as long as it follows the dependency tree and can see that nothing else accesses those ranges, it knows it has exclusive use. If it doesn't have exclusive use, it must use the firmware to access those ranges if possible, or fall back to no support.

If the firmware has a feature without a disable function, the kernel knows it can never access that hardware directly/safely.

You could even have a "lock device" such that if you take it, you know that SMM won't access those I/O ranges whilst you hold the lock.

Obviously this all requires vendor support.
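
(A purely hypothetical sketch of that scheme in C; none of these names exist in ACPI or any kernel, they're only here to make the idea concrete.)

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct io_range {
        uint64_t base;
        uint64_t len;
    };

    struct fw_feature {
        const char            *uuid;       /* e.g. "nvme-resume-impl-1" (made up) */
        const struct io_range *deps;       /* registers the firmware touches */
        size_t                 ndeps;
        bool                 (*disable)(void);  /* NULL: OS must never touch deps */
        bool                 (*enable)(void);
    };

    /* OS side: take over a feature only if the firmware can be told to back off
     * and nothing else in the dependency tree still needs those I/O ranges. */
    static bool os_take_over(struct fw_feature *f)
    {
        if (!f->disable)
            return false;     /* firmware keeps ownership of the I/O ranges */
        return f->disable();  /* on success, the OS may drive the hardware itself */
    }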


This is actually how things are meant to work! Many ACPI features are gated behind a _DSM call that allows the OS to indicate that it has native support for functionality and the firmware should stop touching it itself. It, uh, basically works to the extent that Windows makes use of it.


As someone who had to test various ACPI configs and work my way through the docs, I will never do that again. I will literally get out of my seat and tender my resignation on the spot if they try to force me. Never again. It's probably the most overengineered thing I've ever had to work with.


Does anyone have a link to the 12k page long discussion?


It's a reference to a famous Twitter post: https://twitter.com/dril/status/107911000199671808


I really wish web denizens would use memes with their own local color which, when applicable, are also true.

I mean, I'm sure Matthew knows some really long, legendary Linux Kernel Mailing List thread that is at least somewhat related to this post. It would be fun because then we could click the hyperlink, and Linux Kernel Mailing List threads are nothing if not dramatic. :)

Instead, here every web tribe localizes a meme that traces back to a dead end Tweet. So instead of getting to travel back to the actual tribe's colorful flame history, I get the tribe's low-effort localization of a dead Twitter stump. Boo!


It's a meme, there isn't such a thread (probably). I mean, there is no lack of flame wars about ACPI out there, but that megathread mention is a meme used about controversial subjects.


> We called this Advanced Power Management (Advanced because before this power management involved custom drivers for every machine and everyone agreed that this was a bad idea)

Not sure this statement is really true.

An OS has to have drivers for diverse hardware already - an OS will be expected to support devices as varied as keyboards, mice, floppy drives, hard drives, VGA, PCI bus, etc.

I guess it sucks to have to develop 10 drivers for 10 different power management controllers, but:

- the industry could have done what they did for storage - make the controller standard on the hardware level.

and

- if companies could have come together to create ACPI, they could have come together to define standard power management hardware interfaces instead.

> and it involved the firmware having to save and restore the state of every piece of hardware in the system.

APM was a crappy idea too, except if you had to support DOS and things built on it like Windows 3.x and 95.

Ideally the power management controller would just shut the system off, provide something that an OS loader can read to know if the system was powered on cold or resumed, and let the OS be responsible for saving and loading state.

> ACPI decided to include an interpreted language to allow vendors to expose functionality to the OS without the OS needing to know about the underlying hardware.

> How is this better than just calling into the firmware to do it? Because the fact that ACPI declares that it's going to access these registers means the OS can figure out that it shouldn't, because it might otherwise collide with what the firmware is doing. With APM we had no visibility into that

So ACPI provides code your OS must execute with top privileges so it doesn't have to know about the hardware, but it still has to know about the hardware so it doesn't accidentally step on it. Definitely better than the manufacturer of any power management hardware just publishing a datasheet and letting the OS handle it like any other device. /s

> There's an alternative universe where we decided to teach the kernel about every piece of hardware it should run on. Fortunately (or, well, unfortunately) we've seen that in the ARM world. Most device-specific [code] simply never reaches mainline, and most users are stuck running ancient kernels as a result

If datasheets were available for the hardware, then open source drivers could be created instead of only relying on closed binary blobs; then those could be mainlined and included with the kernel, and this problem would not exist. The problem is really vendors not releasing information on programming their hardware, not Linux. This goes back to the whole argument: if you pay for and own your hardware, why is the manufacturer able to hide these details from you by not releasing this information?


Your argument fails at its first line:

> An OS has to have drivers for diverse hardware already - an OS will be expected to support devices as varied as keyboards, mice, floppy drives, hard drives, VGA, PCI bus, etc.

No, it doesn't, because the OS we are talking about here is DOS.

APM was released in 1992: https://en.wikipedia.org/wiki/Advanced_Power_Management

This was before even Windows 3.1 shipped.

MS-DOS 5.0 was new and not that widely used but was catching on: https://en.wikipedia.org/wiki/Timeline_of_DOS_operating_syst...

DOS didn't support half the hardware you cite. It had no direct support for mice, CD-ROMs, PCI, VGA, or any of that. PCI 1.0 was released the same year.

In those days, most PC software used the BIOS to access standard hardware, and anything much past that was up to the vendor to ship a driver.

All APM really had to do was throttle the CPU and maybe, as a vendor extension, put the hard disk to sleep. That's about it.

Things like putting the display to sleep came along with the US Energy Star standard, released -- you guessed it -- in 1992.



