Using pattern match directly on LockResult causes deadlock
Came across this as I was playing around with implementing memoisation via double-checked-locking with sync::RWLock:
The following, which puts the .read() result into a temporary binding works fine:
// ...
{
let lock_result = self.cache.read(); // <-- temporary binding
match lock_result {
Ok(ref data) if data.len() > to => Some(data[to].clone()),
_ => None,
}
}
.unwrap_or_else(|| {
// We need to write now, so get a write lock
let mut data = self.cache.write().unwrap();
// ...
The following, which directly matches on the result of .read(), dies in a deadlock:
// ...
{
match self.cache.read() { // <-- direct pattern matching
Ok(ref data) if data.len() > to => Some(data[to].clone()),
_ => None,
}
}
.unwrap_or_else(|| {
// We need to write now, so get a write lock
let mut data = self.cache.write().unwrap();
// ...
I'm guessing the direct pattern match is compiled/desugared in a way that the RwLockReadGuard never goes out of scope and thus the read lock is never released, but I'm not sure if this is the expected behaviour. It certainly surprised me, especially since the entire "read" block is scoped by its own set of braces.
For extra context, here is the relevant part in my scratchpad project. I have a concurrency test for it that fails to finish if the pattern matching section is changed to match the latter case.
Temporaries live for the entire statement, never shorter. There are cases where they are made to live longer (e.g. to make stuff like
let x = &5;work) and obviously the surrounding expression can move the temporary into a function where it'll be dropped earlier, but "left to its own devices" a temporary will live until the end of the statement.The second snippet shows part of one huge multi-line statement, and the extra set of braces around the
matchis as irrelevant as it is in, say,let x = {5};. In contrast, in the first snippet theself.cache.read()temporary is moved into a short-lived variable and thus dropped at the brace between thematchand theunwrap_or_elsecall.So everything is working as intended IMHO. Still, what can we do to help people who run into it in the future? Maybe a lint could look for multiple locking API calls within the same expression. However, it would need to be carefully written to avoid a false positive for code like the second. Or maybe it's better to work on MIR more generally and try to detect overlapping code regions holding the same lock, even if it's spread across statements. Sounds like a nice idea for clippy (cc @Manishearth @llogiq).
Hanna Kruppe at 2016-11-06 12:00:34
Interesting, I think for me, the main point of confusion is this:
The second snippet shows part of one huge multi-line statement, and the extra set of braces around the match is as irrelevant as it is in, say,
let x = {5};From my reading of the Rust book, I was under the impression that braces
{}define blocks and blocks define binding scope, so that any bindings defined inside them will be out of scope outside them. For example this section on bindings alludes to that.In this case, it was primarily non-obvious to me that the braces would be irrelevant for the scoping of pattern match bindings (causing the lock to not be dropped), and I'm now wondering what the reasoning is behind that. Is it reasonable and/or possible to make the braces significant in this case?
Otherwise, yes, a warning would be nice :)
PS for what it's worth, turning it into an
if letto make the binding more explicit also deadlocksif let Ok(ref data) = self.cache.read() { if data.len() > to { Some(data[to].clone()) } else { None } } else { None }EDIT: a few more findings
Adding a
();(or anything reallyprintln!,let x = 1) to make the block multiline doesn't help, the following still deadlocks:{ (); // block consists of an expression statement and an expression but still fails to release lock match self.cache.read() { Ok(ref data) if data.len() > to => Some(data[to].clone()), _ => None, } }This however, works fine:
{ let t = match self.cache.read() { Ok(ref data) if data.len() > to => Some(data[to].clone()), _ => None, }; t }Lloyd at 2016-11-06 12:38:47
The destructors for temporaries in tail expressions run when the containing block ends. The intent is to make braces "transparent". We need good examples for that.
Ariel Ben-Yehuda at 2016-11-06 22:25:10
Writing sound multithreaded code is hard enough as it is, and so is testing it correctly to find bugs. In this case I got almost everything right, but at the end, got snagged by what looks like a hard-to-understand syntactical mistake.
I still don't really understand the full breadth of why some examples work and others don't. But I'd like to pitch in to make it harder to for others to make the same mistake.
Lloyd at 2016-11-07 02:18:29
This, as expected, reproduces today. I'm uncertain whether we don't want it to... it feels like this isn't actually a bug (cc @rust-lang/lang, can you confirm?)
Mark Rousskov at 2017-05-16 00:15:01
Reviewing this after 1/2 year, I have to say I'm still uneasy about the current behaviour. I get that there may be reasons for it from a language-implementations point of view, but from a end-user/language-user standpoint, it doesn't seem desirable.
In this particular case, my expectations of referential transparency as well as brace-based scoping were broken, leading to a deadly silent runtime failure.
Having said that, I'd really like to see a reason why the current behaviour is desirable for the end user. Short of one, IMHO the implementation should be fixed, or at the very least, docs updated and a warning thrown at compile time
EDIT: grammar fix.
Lloyd at 2017-05-16 01:46:49
Oh, no, I forgot to include my minimal example. Well, I can retype it up if anyone wants me to, otherwise I'll leave it as-is since composing it was rather.. difficult, IIRC.
Mark Rousskov at 2017-05-16 01:51:15
I agree that these rules can be confusing, particularly around
matchappearing in tail position. That said, I'm not sure what's the best fix -- and obviously any change here would most likely be backwards incompatible (i.e., some people may be relying on the dtor to run when it does).Niko Matsakis at 2017-05-16 01:52:45
I believe this is a duplicate of #21114.
Ryan Scheel at 2017-05-29 04:11:06
@Havvy Definitely looks related given this comment and the following response.
Feels good to see that it is a bug that needs to be fixed after all.
Lloyd at 2017-05-29 04:23:39
The reason this works is because you are supposed to be able to write this:
{ match self.cache.read() { // <-- direct pattern matching Ok(ref data) => Some(data) _ => None, } }.map(|data| { // use `data` - the lock better be held })That pattern is quite common in some places and should better work.
Ariel Ben-Yehuda at 2017-05-29 10:14:55
Of course, because destructor locations aren't sensitive to lifetime inference, the lock must also be held in your example.
Also, #21114 is about the borrow checker being overly stupid, not about destructor ordering, so @pnkfelix was wrong in his diagnosis.
Ariel Ben-Yehuda at 2017-05-29 10:19:34
@arielb1 Thanks for that example. I must say, it's one of the most convincing examples I've seen thus far. That said, I think:
- The braces surrounding the
matchand beforemapare superfluous; you can do without them (Playground link):
match mutex.lock() { Ok(ref data) => Some(data.0), _ => None, }.map(|data| { data + 1 })- That pattern seems a bit unnecessarily verbose; why not just work with
ref datainside the pattern match arm?
I have no doubt though that there is code like this out there; it does seem very convenient to do something like this and perhaps sometimes it's can happen due to personal preference for style.
Lloyd at 2017-05-29 11:24:08
- The braces surrounding the
I've not programmed in rust ever, but just from the little I've read about it,
{ //start scoping. match self.cache.read() { //gain a lock and read. Ok(ref data) => Some(data) _ => None, } //no semi colon so data leaks out of the next scope e.g. it's not unit() //end scope. }.map(|data| { //data is available, but the lock is out of scope. // use `data` - the lock better be held, but it's not, because you ended the scope, error? })vs some pseudo code ish looking thing.
{ let x = match self.cache.read() { Ok(ref data) => Some(data) _ => None, } x.maintainLock? x.lend? //how do you punch the lock through this? guessing you don't or can't. so why does it make sense to have brackets in the first place? }.map(|data| { // use `data` - the lock better be held })let x = { match self.cache.read() { Ok(ref data) => Some(data) _ => None, } } //this deadzone is worrying though. x.lock().map(|data| { // use `data` - the lock better be held })I can't think of a way that makes syntactic sense, given the simple rules I've briefly heard about.
Ryan Leach at 2017-06-03 05:30:49
@ryantheleach - Temporaries created in the final expression of a block live as long as temporaries in the statement containing the block.
Ryan Scheel at 2017-06-03 10:54:36
That makes a little more sense now.
Ryan Leach at 2017-06-03 15:02:47
Not knowing the precise rules, my intuition was that temporaries would be dropped "as soon as possible". This is clearly wrong, as people explained in their comments and as explained in the reference.
Adjusting the language so that it matches my intuition is impossible in general, I think, because it would change the behavior of existing code. So instead it's the intuition -- or at least the knowledge -- that should be changed. So I opened two issues: one for The Book and one for clippy.
I think however that this issue is precisely what linear data types (rust-lang/rfcs#814, or at least part thereof) intend to solve. In particular, the compiler could give a warning when a value of a data type that is marked as "must be dropped explicitly", goes out of scope without being dropped explicitly. (Edit: please note that RFC 814 is different from what I propose here.) I'm not sure however what would be the best way to treat compound data types (e.g. when a field of a struct is marked as "must be dropped explicitly"), so I'm leaving this as a suggestion.
PieterPenninckx at 2017-07-26 09:23:58
Just doing some triage.
I do not believe this to be a duplicate of #21114.
Some people connected the two issues due to this comment https://github.com/rust-lang/rust/issues/21114#issuecomment-74438455 and my response https://github.com/rust-lang/rust/issues/21114#issuecomment-74508522, but its important to note a couple things.
- First, the specific comment from that example (https://github.com/rust-lang/rust/issues/21114#issuecomment-74438455) is an instance of #21114, in the sense that the compiler had been rejecting that code. (Under NLL, that example is no longer rejected.)
- (Or at least, the reduced test that was provided is an instance of #21114. Its possible that the original non-reduced code was closer to an instance of #37612.)
- Second, #21114 was deliberately titled to note that spurious semi-colons/unit exprs had become necessary.
The examples given here, however, are about the consequences of the dynamic semantics of the language. The consequences listed here are, by definition, not cases where adding of a semi-colon or a
let x = ...; xis spurious; rather, it has an observable effect on the drop order of rvalue temporaries.Anyway, this is a long winded build up to me saying that while I do not believe this to be a duplicate of #21114, I do think it is a duplicate or perhaps sub-bug of #46413.
Felix S Klock II at 2018-11-09 14:07:35
- First, the specific comment from that example (https://github.com/rust-lang/rust/issues/21114#issuecomment-74438455) is an instance of #21114, in the sense that the compiler had been rejecting that code. (Under NLL, that example is no longer rejected.)
the following case looks much more confusing (and error-prone) than matching directly on lock result, and it seems to be of the same reason, we happen to find it in our production code which no one spot it may cause problem when the code is merged.
in the following POC, using wait_for_update1 would not deadlock, and wait_for_update2 will
use std::sync::Arc; use tokio::sync::{Notify, RwLock}; #[derive(Default)] struct ProxyInner { nr_updates: u64, } impl ProxyInner { fn update(&mut self) { self.nr_updates += 1; } fn get_update(&self) -> Option<u64> { if self.nr_updates % 2 == 0 { Some(self.nr_updates) } else { None } } } #[derive(Clone)] struct ProxyClient { inner: Arc<RwLock<ProxyInner>>, notify: Arc<Notify>, } impl ProxyClient { fn new() -> ProxyClient { ProxyClient { inner: Arc::new(RwLock::new(ProxyInner::default())), notify: Arc::new(Notify::new()), } } async fn update(&self) { self.inner.write().await.update(); // do something self.notify.notify_waiters(); } // works fine async fn wait_for_update1(&self) -> Option<u64> { loop { let res = self.inner.read().await.get_update(); match res { Some(nr) => return Some(nr), None => { self.notify.notified().await; } } } } // may deadlock async fn wait_for_update2(&self) -> Option<u64> { loop { match self.inner.read().await.get_update() { Some(nr) => return Some(nr), None => { self.notify.notified().await; } } } } } #[tokio::main(worker_threads = 2)] async fn main() { let pc = ProxyClient::new(); let task1 = { let pc = pc.clone(); tokio::spawn(async move { loop { pc.update().await; } }) }; let task2 = { let pc = pc.clone(); tokio::spawn(async move { loop { let nr = pc.wait_for_update1().await; println!("task2: nr = {:?}", nr); } }) }; task1.await.unwrap(); task2.await.unwrap(); }虎 at 2023-09-15 07:39:43
I think I encountered a similar bug, in the following code:
use std::sync::Mutex; #[derive(Debug)] struct Test { a: i32, b: i32, } impl Test { pub fn new(value: &Mutex<i32>) -> Self { Self { a: { *value.lock().expect("cannot lock") }, b: { *value.lock().expect("cannot lock") }, } } } fn main() { let value = Mutex::new(0); let t = Test::new(&value); println!("{:?}", t); }the
newfunction deadlocks. I could not find a convincing explanation why theLockGuardis not dropped between the initializations ofaandb.tzachar at 2024-10-10 07:31:50