Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Atomic RMW is all you need. Imagine 3 state futex lock of taken, sleeping, and unlocked. Waking thread just needs to see old value but writes unlocked unconditionally.

I don't think this is more expensive than release certainly not because of the line where the write goes in most cache coherency protocols.

Granted this lock does wake up too many people but you did say usually uncontended.



And what, unconditionally futex wakes everyone?

That'll be hella slow


You did say uncontended!


So you want uncontended unlocks to make a syscall to wake the threads that aren’t there?

That syscall will be worse than a CAS just because it’s a syscall. Not to mention it’ll have to lock some kernel data structure just to figure out that there aren’t any threads to wake.


I think I'm not doing a great job of explaining it.

There are three states of the mutex: locked=1, unlocked=0, and sleeping=2. A thread comes along to a locked mutex, and atomically moves it to sleeping and sleeps via futex. The freeing thread unconditionally writes 0 and atomically sees the old value: if 2 wake all, if 1 no need to wake anybody.

Maybe that is equally expensive to a CAS, but I don't think it ultimately is.


“Atomically sees the old value” means you’re doing an atomic RMW, which is more expensive than just a store with a release fence.

Atomic RMW is almost as expensive as CAS. Exactly as expensive as CAS on some cpus.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: