Hacker News | Tringi's comments

I wasn't happy with existing solutions for clean shutdown of Azure Spot VMs, so I put together this tiny service application that does basically what all those scripts do, but with way fewer resources and without continuously launching scripting engines.

It's nothing groundbreaking really. It detects that eviction is pending, usually 30s in advance, so it gives 10s notice and then initiates shutdown, properly noting the reason in the Event Log.

It's open source and just a handful of lines of code, but way lighter than executing a heavy script every second.


An observation and a rant. Many will not even notice that icons got blurrier with Windows 10 and then got worse, but I needed to get this off my chest, and perhaps I'll find like-minded people in this regard through this.


It's a little more complicated if you are to be using themes, GDI and common controls. Some time ago I put together this example: https://github.com/tringi/win32-dpi

The high DPI support in Windows has gone through quite an evolution since XP, but mostly to fix what app programmers messed up. You can have nice and crisp XP at 250% DPI if you do things right, e.g.: https://x.com/TheBobPony/status/1733196004881482191/photo/1


So how should a launcher application determine which AArch64 ISA level is available on a Windows on ARM device?

Full story: https://www.reddit.com/r/programming/comments/1gxnr85/determ...


There's this issue with the Win32 synchronization API, discussed and documented in many blogs, videos and tutorials over the years, that comes up again and again: one thread can only wait for 64 (MAXIMUM_WAIT_OBJECTS) kernel objects at the same time.

To work around this limit, programs have resorted to various unnecessarily complex solutions, like starting extra threads for the sole purpose of waiting, refactoring the logic, or replacing events with posting I/O completion packets.

In fact, if the application is waiting in a Vista+ Thread Pool, the pool itself uses the first approach: it starts as many threads as needed to wait for all the events. Or rather it used to. Since Windows 8, all Windows thread pool waits can be handled by a single thread. It does this through a new capability of associating the Event with an I/O Completion Port, to which the signalled state is enqueued. But this capability was not exposed through the Win32 API to regular programmers.

It was exposed though, by a barely documented NT API, NtAssociateWaitCompletionPacket, which, it seems, nobody is using, except a few rare high performance libraries, the Rust runtime, and, um, security researchers.

So I took the liberty to abstract out the details and implement what a simple Win32 call could look like. In the following example I wait for 2000 events in a single thread, through a single IOCP.

> https://github.com/tringi/win32-iocp-events

Of course, for larger systems, the Thread Pool API is the right way. But if your program is already using IOCPs, is single-threaded and you don't have resources to solve locking and concurrency, or are just thread-pooling your own way, this may be the ideal solution to reduce thread count, complexity and resource requirements.


As awesome as these algorithms are, I can never imagine a use case.

Any time I need to turn something into uppercase or lowercase, it's user input or something Unicode. Which can be in any random language, so all the crazy Unicode casing rules apply.

I mean, Czech could work by canonical Unicode decomposition and then re-composition after the case change ...but the input can be in Russian or Greek. I can't imagine accelerating that, or rather taking into account all the ranges.


There are lots of protocols that are ASCII case-insensitive, such as SMTP, IMAP, HTTP, DNS, … so this kind of code is common in the hot path of lots of networking software.
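To make the point concrete, here is a minimal sketch of an ASCII-only case-insensitive comparison, the kind of check a parser might run on header names or DNS labels. The names ascii_lower and ascii_iequals are made up for this illustration and are not taken from any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Lowercase a single byte if it is an ASCII uppercase letter; every other
// byte (digits, punctuation, non-ASCII UTF-8 bytes) is returned unchanged.
inline unsigned char ascii_lower(unsigned char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c + ('a' - 'A')) : c;
}

// Hypothetical helper: equality that ignores case for ASCII letters only.
// No locale and no Unicode casing rules are involved.
inline bool ascii_iequals(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (ascii_lower(static_cast<unsigned char>(a[i])) !=
            ascii_lower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}
```

With this, ascii_iequals("Content-Type", "content-type") is true, while non-ASCII bytes only ever match themselves. A loop this simple is what the vectorized tricks speed up.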


Ah, yeah, that's true.


If your input is Russian mixed with English, it might actually still be faster to do a first pass that processes the whole ASCII (English) range with the aforementioned techniques, then handle the Russian in a second pass: you'll have way fewer if-branches to take.
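A rough sketch of what that first pass could look like on UTF-8 input. The function name lower_ascii_pass is hypothetical, and the second, Unicode-aware pass is deliberately left out; the return value only tells the caller whether that slower pass is needed at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pass 1: lowercase ASCII letters in place in a UTF-8 string.
// Bytes >= 0x80 are part of multi-byte UTF-8 sequences and are left alone.
// Returns the number of non-ASCII bytes seen, so the caller can skip the
// Unicode-aware second pass when the result is zero.
inline std::size_t lower_ascii_pass(std::string& utf8) {
    std::size_t non_ascii = 0;
    for (char& ch : utf8) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c >= 0x80) {
            ++non_ascii;                       // defer to the second pass
        } else if (c >= 'A' && c <= 'Z') {
            ch = static_cast<char>(c | 0x20);  // 'A'..'Z' -> 'a'..'z'
        }
    }
    return non_ascii;
}
```

For pure ASCII input the function returns 0 and the work is done. For mixed input only the non-ASCII runs remain for the full casing rules.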


The Hub Event (working title) is similar to the standard Win32 Event Object, but multiple consumers (clients) of the event can reliably wait on it while it is being set/reset by the producer.

It is meant for cases where the software needs to signal/release all waiting threads, each exactly once, and those threads are not necessarily in a single process. That is, where the naïve approach would be to use the flawed PulseEvent API, and the system does not already have IPC established by other means (IOCP).

The implementation is somewhat heavy for what it does, but I haven't figured out a simpler one. Perhaps I'll get some ideas for v2 here. In fact I believe this thing should be provided as another kernel object by the OS alongside events, semaphores, mutexes, etc.


Some time ago I ran a test, 256 threads competing on a small number of cache lines, and found that all three, CreateMutex, CRITICAL_SECTION and SRWLOCK, were quite fair.

The most successful thread was only 25%, 15% and 9% ahead of the least successful one, respectively. In contrast, with my simple usermode spinlock the unfairness would be 1000% or even 2000%.
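For anyone who wants to reproduce this kind of measurement, here is a portable sketch. It is not the original test: it swaps the Win32 primitives for std::mutex, and the names spread_percent and run_contention are invented for this example. The percentage is computed as (max - min) / min over the per-thread acquisition counts, which is one reasonable reading of "ahead of the least successful one".

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <mutex>
#include <thread>
#include <vector>

// How far ahead (in percent) the most successful thread is of the least
// successful one; 25.0 means "25% ahead". Returns 0 for empty input and
// infinity when some thread never got the lock at all.
inline double spread_percent(const std::vector<std::uint64_t>& counts) {
    if (counts.empty()) return 0.0;
    const auto mm = std::minmax_element(counts.begin(), counts.end());
    const double lo = static_cast<double>(*mm.first);
    const double hi = static_cast<double>(*mm.second);
    if (lo == 0.0) return hi == 0.0 ? 0.0 : std::numeric_limits<double>::infinity();
    return 100.0 * (hi - lo) / lo;
}

// Hypothetical harness: `threads` workers fight over one std::mutex for
// `duration`; each counts how many times it acquired the lock.
inline std::vector<std::uint64_t> run_contention(unsigned threads,
                                                 std::chrono::milliseconds duration) {
    assert(threads > 0);
    std::mutex lock;
    std::uint64_t shared = 0;            // the contended data
    std::atomic<bool> stop{false};
    std::vector<std::uint64_t> counts(threads, 0);
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < threads; ++i) {
        pool.emplace_back([&, i] {
            std::uint64_t mine = 0;      // local counter, written back once
            while (!stop.load(std::memory_order_relaxed)) {
                std::lock_guard<std::mutex> guard(lock);
                ++shared;
                ++mine;
            }
            counts[i] = mine;
        });
    }
    std::this_thread::sleep_for(duration);
    stop.store(true);
    for (auto& t : pool) t.join();
    return counts;
}
```

Calling spread_percent(run_contention(256, std::chrono::milliseconds(1000))) gives a number comparable in spirit to the ones above, though results depend heavily on the OS, the core count and the mutex implementation.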


Blocking new readers when a writer arrives is perfectly good and desirable. Blocking readers when the writer finishes, and there isn't any new queued, is definitely not.
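A minimal sketch of the desirable policy, built on std::mutex and std::condition_variable. The class WriterPriorityLock is made up for illustration and says nothing about how SRWLOCK or any other real lock is implemented. Readers are held back only while a writer is active or queued.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class WriterPriorityLock {
    std::mutex m_;
    std::condition_variable cv_;
    int active_readers_ = 0;
    int waiting_writers_ = 0;
    bool writer_active_ = false;

public:
    void lock_shared() {
        std::unique_lock<std::mutex> lk(m_);
        // New readers wait only while a writer holds the lock or is queued.
        cv_.wait(lk, [&] { return !writer_active_ && waiting_writers_ == 0; });
        ++active_readers_;
    }
    bool try_lock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        if (writer_active_ || waiting_writers_ > 0) return false;
        ++active_readers_;
        return true;
    }
    void unlock_shared() {
        std::lock_guard<std::mutex> lk(m_);
        assert(active_readers_ > 0);
        if (--active_readers_ == 0) cv_.notify_all();  // let a queued writer in
    }
    void lock() {
        std::unique_lock<std::mutex> lk(m_);
        ++waiting_writers_;  // from now on, new readers are blocked
        cv_.wait(lk, [&] { return !writer_active_ && active_readers_ == 0; });
        --waiting_writers_;
        writer_active_ = true;
    }
    void unlock() {
        std::lock_guard<std::mutex> lk(m_);
        writer_active_ = false;
        // Wake everyone: with no writer queued, readers proceed immediately.
        cv_.notify_all();
    }
};
```

The important part is unlock(): once the writer is gone and waiting_writers_ is zero, nothing keeps readers out, which is exactly the case the comment above says must not block.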


And Twitter/X accounts.

I have had a lot more things fixed in Windows or MSVC from nagging devs on there than from reporting through any official channel.


+1. If you want to reach a real engineer, they are going to be spending their free time on sites like X, not on some community feedback and bug reporting form.

