There was a giant super-long GitHub issue about improving Rust std mutexes a few years back. Prior to that issue Rust was using something much worse, pthread_mutex_t. It explained the main reason why the standard library could not just adopt parking_lot mutexes:
From https://github.com/rust-lang/rust/issues/93740
> One of the problems with replacing std's lock implementations by parking_lot is that parking_lot allocates memory for its global hash table. A Rust program can define its own custom allocator, and such a custom allocator will likely use the standard library's locks, creating a cyclic dependency problem where you can't allocate memory without locking, but you can't lock without first allocating the hash table.
> After some discussion, the consensus was to provide the locks as the 'thinnest possible wrapper' around the native lock APIs as long as they are still small, efficient, and const constructible. This means SRW locks on Windows, and futex-based locks on Linux, some BSDs, and Wasm.
> This means that on platforms like Linux and Windows, the operating system will be responsible for managing the waiting queues of the locks, such that any kernel improvements and features like debugging facilities in this area are directly available for Rust programs.
> This means SRW locks on Windows, and futex-based locks on Linux, some BSDs, and Wasm.
Note that the SRW locks are gone, except if you're on a very old Windows. So today the Rust built-in std mutex for your platform is almost certainly futex-based. On Windows it isn't called a futex, and from some angles it's even better, but the same core ideas of the futex apply: we only ask the OS to do any work when we're contended, there is no limited OS resource involved (other than memory), and our uncontended operations are as fast as they could ever be.
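To make "only ask the OS to do work when contended" concrete, here's a minimal sketch of the classic three-state futex mutex on Linux -- roughly the shape of what std ships there, per Mara Bos's "Rust Atomics and Locks" and Drepper's "Futexes Are Tricky". It assumes the `libc` crate, and a real implementation adds spinning and more careful wake handling:

    use std::sync::atomic::{AtomicU32, Ordering};

    // 0 = unlocked, 1 = locked, 2 = locked with possible waiters.
    pub struct FutexMutex {
        state: AtomicU32,
    }

    impl FutexMutex {
        pub const fn new() -> Self {
            Self { state: AtomicU32::new(0) }
        }

        pub fn lock(&self) {
            // Fast path: one CAS and no syscall when uncontended.
            if self.state.compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed).is_err() {
                // Slow path: advertise possible waiters, sleep in the kernel.
                while self.state.swap(2, Ordering::Acquire) != 0 {
                    futex_wait(&self.state, 2);
                }
            }
        }

        pub fn unlock(&self) {
            // Only wake through the kernel if someone may be waiting.
            if self.state.swap(0, Ordering::Release) == 2 {
                futex_wake_one(&self.state);
            }
        }
    }

    fn futex_wait(a: &AtomicU32, expected: u32) {
        unsafe {
            libc::syscall(
                libc::SYS_futex,
                a as *const AtomicU32,
                libc::FUTEX_WAIT | libc::FUTEX_PRIVATE_FLAG,
                expected,
                std::ptr::null::<libc::timespec>(),
            );
        }
    }

    fn futex_wake_one(a: &AtomicU32) {
        unsafe {
            libc::syscall(
                libc::SYS_futex,
                a as *const AtomicU32,
                libc::FUTEX_WAKE | libc::FUTEX_PRIVATE_FLAG,
                1,
            );
        }
    }

Note the entire lock is one u32 in the mutex itself, with no handle to any kernel object -- which is why there's no limited OS resource to run out of.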
SRW locks were problematic because they're bulkier than a futex (though mostly when contended), and they had a subtle bug that, for a long time, it was unclear when Microsoft would get around to fixing -- which isn't a huge plus sign for an important primitive used in all the high-performance software on a $$$ commercial OS...
Mara's work (which you linked) is probably more work, and more important, but it's not actually the most recent large reworking of Rust's Mutex implementation.
> Prior to that issue Rust was using something much worse, pthread_mutex_t
Presumably you're referring to this description, from the GitHub issue:
> > On most platforms, these structures are currently wrappers around their pthread equivalent, such as pthread_mutex_t. These types are not movable, however, forcing us to wrap them in a Box, resulting in an allocation and indirection for our lock types. This also gets in the way of a const constructor for these types, which makes static locks more complicated than necessary.
pthread mutexes are const-constructible in a literal sense, just not in the sense Rust requires. In C you can initialize a pthread_mutex_t with the PTHREAD_MUTEX_INITIALIZER static initializer instead of calling pthread_mutex_init, and at least with glibc there's no subsequent allocation when using the lock. But Rust can't do in-place construction[1] (i.e. placement new in C++ parlance), which is why Rust needs to be able to "move" the mutex. Moving a mutex is otherwise nonsensical once the mutex is visible--it's the address of the mutex that the locking is built around.
The only thing you gain by not using pthread_mutex_t is a possibly smaller lock--pthread_mutex_t has to contain additional members to support robust, recursive, and error-checking mutexes, though altogether that's only 2 or 3 additional words because some are union'd. I guess you also gain the ability to implement locking, including condition variables, barriers, etc., however you want, though then you can't share those through FFI.
[1] At least not without unsafe and some extra work, which presumably is a non-starter for a library type where you want to keep it all transparent.
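For contrast, here's the property std was after and now has: since Mutex::new became a `const fn` (Rust 1.63), a lock can live directly in a static with no Box, no allocation, and no lazy initialization:

    use std::sync::Mutex;

    // No lazy_static / OnceLock dance; the lock is built at compile time.
    static COUNTER: Mutex<u64> = Mutex::new(0);

    fn bump() -> u64 {
        let mut n = COUNTER.lock().unwrap();
        *n += 1;
        *n
    }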
> The effect of referring to a copy of the object when locking, unlocking, or destroying it is undefined.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V...
I.e., if I pthread_mutex_init(&some_addr, ...), I cannot then copy the bits from some_addr to some_other_addr and then pthread_mutex_lock(&some_other_addr). Hence not movable.
> Moving a mutex is otherwise non-sensical once the mutex is visible
What does "visible" mean here? In Rust, if you hold a mutable reference to an object, there are no other references to that object, hence it is safe to move.
I’m actually thinking of the sheer size of pthread mutexes. They are giant. The issue says that they wanted something small, efficient, and const constructible. Pthread mutexes are too large for most applications doing fine-grained locking.
Seems like the simple solution to this problem would be to have both, no?
A simple native lock in the standard library along with a nicer implementation (also in the standard library) that depends on the simple lock?
My takeaway is that the documentation should make more explicit recommendations depending on the situation -- i.e., people writing custom allocators should use std mutexes; most libraries and applications that are OK with allocation should use parking_lot mutexes; embedded code or libraries that don't want to depend on allocation should use std mutexes. Something like that.
Author of the original WTF::ParkingLot here (what rust’s parking_lot is based on).
I’m surprised that this only compared to std on one platform (Linux).
The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine-grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object, we can do that without increasing the size of the object.
> The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine-grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object, we can do that without increasing the size of the object.
IMHO that's a very cool feature which is essentially wasted when using it as a `Mutex<InnerBlah>` because the mutex's size will get rounded up to the alignment of `InnerBlah`. And even when not doing that, afaict `parking_lot` doesn't expose a way to use the remaining six bits in `parking_lot::RawMutex`. I think the new std mutexes made the right choice to use a different design.
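You can see the rounding-up in a couple of lines; the numbers below are what I'd expect on 64-bit Linux with parking_lot 0.12, not a guarantee:

    fn main() {
        // The raw lock itself is a single byte...
        println!("{}", std::mem::size_of::<parking_lot::RawMutex>()); // 1
        // ...but wrap a u64 and alignment pads 8 + 1 up to 16.
        println!("{}", std::mem::size_of::<parking_lot::Mutex<u64>>()); // 16
    }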
> I’m surprised that this only compared to std on one platform (Linux).
Can't speak for the author, but I suspect a lot of people really only care about performance under Linux. I write software that I often develop from a Mac but almost entirely deploy on Linux. (But speaking of Macs: std::mutex doesn't yet use futexes on macOS. Might happen soon. https://github.com/rust-lang/rust/pull/122408)
> I suspect a lot of people really only care about performance under Linux
Yeah this is true
How can a parking_lot lock be less than 1 byte? Does this use unsafe?
Rust in general doesn't support bit-level objects unless you cast things to [u8] and do some shifts and masking manually (that is, like C), which of course is wildly unsafe for data structures with safety invariants
Original post: https://webkit.org/blog/6161/locking-in-webkit/
Post that mentions the two bit lock: https://webkit.org/blog/7122/introducing-riptide-webkits-ret...
I don’t know the details of the Rust port but I don’t imagine the part that involves the two bits to require unsafe, other than in the ways that any locking algorithm dances with unsafety in Rust (ownership relies on locking algorithms being correct)
This is very similar to how Java's object monitors are implemented. In OpenJDK, the markWord uses two bits to describe the state of an Object's monitor (see markWord.hpp:55). On contention, the monitor is said to become inflated, which basically means revving up a heavier lock and knowing how to find it.
I'm a bit disappointed though, I assumed that you had a way of only using 2 bits of an object's memory somehow, but it seems like the lock takes a full byte?
The idea is that six bits in the byte are free to use as you wish. Of course you'll need to implement operations on those six bits as CAS loops (which nonetheless allow for any arbitrary RMW operation) to avoid interfering with the mutex state.
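A sketch of that, with a hypothetical layout where the low two bits belong to the lock and the high six are yours:

    use std::sync::atomic::{AtomicU8, Ordering};

    const LOCK_MASK: u8 = 0b0000_0011; // bits 0-1: lock state, hands off

    // Replace the six flag bits without ever clobbering a concurrent
    // lock-state transition: the classic CAS loop.
    fn set_flags(word: &AtomicU8, flags: u8) {
        debug_assert_eq!(flags & LOCK_MASK, 0);
        let mut cur = word.load(Ordering::Relaxed);
        loop {
            let next = (cur & LOCK_MASK) | flags;
            match word.compare_exchange_weak(cur, next, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return,
                Err(actual) => cur = actual, // lost a race; retry
            }
        }
    }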
This article elaborates on how it works.
The original WebKit blog post about the ParkingLot mutex implementation is a great read: https://webkit.org/blog/6161/locking-in-webkit/
> Poisoning: Panic Safety in Mutexes
This is one of the biggest design flaws in Rust's std, in my opinion.
Mutex poisoning can have its uses, but they're very rare in practice. Usually it's a huge misfeature that only introduces problems. More often than not panicking in a critical section is fine[1], but on the other hand, poisoning a Mutex is a very convenient avenue for a denial-of-service attack, since a poisoned Mutex will just completely brick a given critical section.
I'm not saying such a project doesn't exist, but I don't think I've ever seen a project which does anything sensible with Mutex's `Poisoned` error besides ignoring it. It's always either an `unwrap` (and we know how well that can go [2]), or doing the sensible thing via this ridiculous song-and-dance:

    let guard = match mutex.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    };

Suffice to say, it's a pain. So in a lot of projects when I need a mutex I just add `parking_lot`, because its performance is stellar, and it doesn't have the poisoning insanity to deal with.
[1] -- obviously it depends on a case-by-case basis, but if you're using such a low level primitive you should know what you're doing
[2] -- https://blog.cloudflare.com/18-november-2025-outage/#memory-...
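If you're stuck with std and have decided ignoring poison is right for your project, a small extension trait (hypothetical, not a std API) hides the song-and-dance at every call site:

    use std::sync::{Mutex, MutexGuard};

    trait LockUnpoisoned<T> {
        fn lock_unpoisoned(&self) -> MutexGuard<'_, T>;
    }

    impl<T> LockUnpoisoned<T> for Mutex<T> {
        fn lock_unpoisoned(&self) -> MutexGuard<'_, T> {
            // Take the guard whether or not a previous holder panicked.
            self.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
        }
    }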
> It's always either an `unwrap` (and we know how well that can go [2])
If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point. It's fine to panic in a critical section if something's horribly wrong, the problem comes with blindly continuing after a panic in other threads that operate on the same data. In general, you're unlikely to know what that panic was, so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
In general, unless I were being careful to maintain fault boundaries between threads or tasks (the archetypical example being an HTTP server handling independent requests), I'd want a panic in one thread to cascade into stopping the program as soon as possible. I wouldn't want to swallow it up and keep using the same data like nothing's wrong.
> If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point.
I find that in the majority of cases you're essentially dealing with one of two cases:
1) Your critical sections are tiny and you know you can't panic, in which case dealing with poisoning is just useless busywork.
2) You use a Mutex to get around Rust's "shared xor mutable" requirement. That is, you just want to temporarily grab a mutable reference and modify an object, but you don't have any particular atomicity requirements. In this case panicking is no different from panicking on a single thread while modifying an object through a plain old `&mut`. Here too dealing with poisoning is just useless busywork.
> I'd want a panic in one thread to cascade into stopping the program as soon as possible.
Sure, but you don't need mutex poisoning for this.
> so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
One can make a panic wrapper type if they care; it's what the stdlib Mutex currently does:
MutexGuard checks whether it's panicking during drop using `std::thread::panicking()`, and if so, sets a bool on the Mutex. The next acquirer checks that bool and knows the state may be corrupted. No need to bake this into the Mutex itself.
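A rough sketch of that layering on top of a non-poisoning lock (names made up; a real version would expose a lock method and a way to clear the flag after recovery):

    use std::sync::atomic::{AtomicBool, Ordering};

    struct PoisonFlag(AtomicBool);

    struct Guard<'a, T> {
        inner: parking_lot::MutexGuard<'a, T>,
        flag: &'a PoisonFlag,
    }

    impl<T> Drop for Guard<'_, T> {
        fn drop(&mut self) {
            // Dropped during unwinding => the data may be half-modified;
            // record that for the next acquirer to inspect.
            if std::thread::panicking() {
                self.flag.0.store(true, Ordering::Relaxed);
            }
        }
    }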
To the contrary, the projects I've been part of have had no end of issues related to being cancelled in the middle of a critical section [1]. I consider poisoning to be table stakes for a mutex.
[1] https://sunshowers.io/posts/cancelling-async-rust/#the-pain-...
Well, I mean, if you've made the unfortunate decision to hold a Mutex across await points...?
This is completely banned in all of my projects. I have a 100k+ LOC project running in production, that is heavily async and with pervasive usage of threads and mutexes, and I never had a problem, precisely because I never hold a mutex across an await point. Hell, I don't even use async mutexes - I just use normal synchronous parking lot mutexes (since I find the async ones somewhat pointless). I just never hold them across await points.
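The discipline is easy to audit because it's all about guard scope. A sketch, assuming parking_lot and any async runtime, with `do_io` as a hypothetical stand-in:

    async fn record(shared: &parking_lot::Mutex<Vec<u64>>, item: u64) {
        {
            // The critical section contains no .await: lock, mutate, drop.
            let mut data = shared.lock();
            data.push(item);
        } // guard dropped here, before we suspend
        do_io().await;
    }

    async fn do_io() {}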
We're currently working on separating poison from mutexes, such that the default mutexes won't have poisoning (no more `.lock().unwrap()`), and if you want poisoning you can use something like `Mutex<Poison<T>>`.
Yeah, I'm looking forward to it!
While we're at it, another thing that'd be nice to get rid of is `AssertUnwindSafe`, which I find even more pointless.
There are cases where it is useful.
I had a case where, if the mutex was poisoned, it was possible to reset the lock to a safe state (by writing a new value to the locked content).
Or you may want to drop some resource or restart some operation instead of panicking if it is poisoned.
But I agree that the default behavior should be that the user doesn't have to worry about it.
I will personally recommend that unless you are writing performance-sensitive code*, you don't use mutexes at all, because they are too low-level an abstraction. Use MPSC queues, for example, or something like RCU. I find these abstractions much more developer-friendly.
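For example, the std mpsc version of "share by communicating": one owner thread applies all mutations, so nothing needs a lock. A sketch:

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        // One thread owns the data; everyone else just sends messages.
        let owner = thread::spawn(move || {
            let mut log = Vec::new();
            for msg in rx {
                log.push(msg);
            }
            log.len()
        });
        for i in 0..4 {
            let tx = tx.clone();
            thread::spawn(move || tx.send(i).unwrap());
        }
        drop(tx); // close the channel so the owner's loop can end
        println!("{} items logged", owner.join().unwrap());
    }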