Better temporary lifetimes (tracking issue for RFC 66)

b8465d2
Opened by Niko Matsakis at 2023-06-26 16:27:53

  1. Tracking issue for rust-lang/rfcs#66: better temporary lifetimes.

    Some unresolved questions to be settled when implementing:

    1. This implies that the lifetimes of temporaries are not known until after typeck. I think this is ok but it is a phase change which can sometimes be tricky (currently temporary lifetimes are known before typeck).
    2. We have to specify the precise rules over when a temporary is extended. There are various subtle cases to be considered:
    • Clearly we must consider whether the parameter type is a reference with a lifetime that also appears in the return type. Does the variance in the return type matter? (I think: no, too subtle and not worth it.)
    • When do we decide what the type of the parameter is? Do we consider the declared type, the type after inference, or a hybrid?

    Some examples where this matters:

    fn identity<T>(x: T) -> T { x }
    
    // Are these the same or different?
    
    identity(&3);
    identity::<&i32>(&3);
    

    My take: Probably we should just consider the fully inferred type.

    Unresolved questions

    • [ ] Resolve the concern about semantic changes described in comment.

    Niko Matsakis at 2022-03-17 13:37:07
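
To make the question above concrete, here is a minimal sketch of how `identity` behaves on today's stable compiler, whether its type parameter is inferred or spelled out. The helper names `inferred` and `explicit` are made up for illustration and are not from the issue. In both forms the argument has to be bound with `let` first; passing `&String::from("abc")` directly and reading the result in a later statement is rejected with E0716, which is the case RFC 66 aims to accept.

```rust
fn identity<T>(x: T) -> T {
    x
}

// Hypothetical helper: `T` is inferred as `&String`.
fn inferred() -> usize {
    let owned = String::from("abc");
    let r = identity(&owned);
    r.len()
}

// Hypothetical helper: `T` is written out explicitly.
fn explicit() -> usize {
    let owned = String::from("abc");
    let r = identity::<&String>(&owned);
    r.len()
}

// Rejected today (E0716) because the temporary `String` is dropped at the
// end of the `let` statement while `r` still borrows it:
//
//     let r = identity(&String::from("abc"));
//     r.len()
```

Both helpers return 3. Whether the two call forms should be treated the same once the signature drives extension is the open question in this comment.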

  2. cc me

    Josh Matthews at 2014-07-22 17:30:57

  3. I've got an example which seems related. Apparently as of 470dbef29 a let binding is necessary before a call to .as_slice() to keep it alive long enough to use in a function?

    Kevin Cantú at 2014-07-26 15:50:44

  4. :+1:

    Johannes Schickling at 2014-12-06 21:54:47

  5. Is there any reason we can't just rewrite:

    foo().bar().baz(&bip().bop().boop())
    

    as:

    let mut tmp = foo();
    let mut tmp = tmp.bar();
    {
        let mut tmp2 = bip();
        let mut tmp2 = tmp2.bop();
        let mut tmp2 = tmp2.boop();
        let mut tmp2 = &tmp2;
        tmp.baz(tmp2)
    }
    

    And let the optimizer do its thing? Given Rust's move semantics, I can't see how this could cause any problems. This is just a sanity check; I'd be happy with a "no, it's complicated".

    This would fix the write!(io::stdout().lock(), "{}", thing); case.

    Steven Allen at 2016-03-02 17:21:27

  6. The thing is that destructors sometimes have side-effects (for example, RefCell destructors). If we did the rewrite you suggested, at least according to the current rules, it would mean that the destructors always execute at the end of the current block, more or less, which is usually not what people want. The current rules are mostly that temporary destructors run at the end of the current statement, unless the temporary is being assigned into a let-bound variable, in which case they live as long as the variable. (This is roughly what C++ does as well.) The goal of this RFC was to be smarter about knowing when the temporary will be assigned into a let-bound variable.

    On Wed, Mar 02, 2016 at 09:22:19AM -0800, Steven Allen wrote:

    Is there any reason we can't just rewrite:

    foo().bar().baz(&bip().bop().boop())
    

    as:

    let mut tmp = foo();
    let mut tmp = tmp.bar();
    let mut tmp2 = bip();
    let mut tmp2 = tmp2.bop();
    let mut tmp2 = tmp2.boop();
    let mut tmp2 = &tmp2;
    tmp.baz(tmp2)
    

    And let the optimizer do its thing? Given Rust's move semantics, I can't see how this could cause any problems. This is just a sanity check; I'd be happy with a "no, it's complicated".


    Reply to this email directly or view it on GitHub: https://github.com/rust-lang/rust/issues/15023#issuecomment-191333403

    Niko Matsakis at 2016-03-02 21:24:52
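
The two rules described in this comment can be observed with a type whose destructor records when it runs. This is a small sketch, not code from the thread; `Noisy` and `drop_order` are made-up names. The first temporary is only used inside its statement, so it is dropped when the statement ends. The second is borrowed directly by a `let`, so it lives until the end of the enclosing block.

```rust
use std::cell::RefCell;

// Hypothetical type whose destructor appends a line to a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        // Plain temporary: its destructor runs at the end of this statement.
        let _len = Noisy { name: "statement", log: &log }.name.len();
        log.borrow_mut().push(String::from("statement finished"));

        // Temporary borrowed directly by a `let`: extended to the end of the block.
        let _kept = &Noisy { name: "extended", log: &log };
        log.borrow_mut().push(String::from("block finishing"));
    }
    log.into_inner()
}
```

Calling `drop_order()` yields `drop statement`, `statement finished`, `block finishing`, `drop extended`, in that order. Rewriting every temporary into a `let` would move the first destructor to the end of the block as well, which is the side-effect change this comment warns about.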

  7. The goal of this RFC was to be smarter about knowing when the temporary will be assigned into a let-bound variable.

    Assuming you mean "when a temporary would be referenced by a let-bound variable", I see why this is more complicated.

    Steven Allen at 2016-03-03 01:10:41

  8. I just want to leave my 2c that this is a very frustrating issue for newbies. I didn't think of myself as a newbie, having written two fairly stable libraries in rust, but today I was trying to write a memory manager for micro-controllers, learning about unsafe and PhantomData and all sorts of fun stuff -- all of which went great. Then, when I was ready to test my library I hit this compiler error which I had never seen before.

    Thinking it was something I did with the internals, I began to question my entire way of doing things. Long story short, it took me the better part of 4 hours to finally realize that nothing was wrong with my library. No -- what was wrong is that I hadn't put a let statement on every unwrap. I really feel like Rust has let me down -- the behavior as-is is super unintuitive.

    I wish I had listened to the compiler better... maybe that is the lesson I really should learn here.

    Rett Berg at 2016-08-28 05:43:51
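
For readers who have not hit this yet, a minimal sketch of the "let on every unwrap" situation might look like the following. The function `lookup` and both helpers are invented for illustration. A chain works as long as the borrowed result is consumed within the same statement; it stops working once a reference into the temporary has to survive into a later statement.

```rust
// Hypothetical function returning an owned value wrapped in an Option.
fn lookup() -> Option<String> {
    Some(String::from("temporary"))
}

// Fine today: the temporary `String` lives until the whole expression
// has been evaluated, and only a `usize` escapes.
fn single_statement() -> usize {
    lookup().unwrap().as_str().len()
}

// Needed today when the borrow must outlive the statement.
fn with_binding() -> usize {
    let owned = lookup().unwrap(); // keeps the String alive
    let word = owned.as_str();
    word.len()
}

// Rejected today (E0716): the unwrapped `String` is dropped at the end of
// the `let` statement while `word` still points into it.
//
//     let word = lookup().unwrap().as_str();
//     word.len()
```

Both helpers return 9 (the length of "temporary").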

  9. @nikomatsakis What's the status of this issue?

    Brian Anderson at 2017-03-01 18:38:17

  10. @brson no change at all. But I've had it on my list of things to write up instructions for. In fact, I was just coming to take a stab at that.

    Niko Matsakis at 2017-03-02 00:55:07

  11. OK, so, it's taking me a lot longer to prepare those instructions than I expected. In part this is because the original RFC is not very well specified. I've started an "amendment" (basically a rewrite) that both specifies the current behavior and tries to more precisely specify what RFC 66 should do. This will hopefully get done soon; you can see a draft here.

    My basic plan (at a very high level) is to:

    • step 1: convert the construction of the "extended temporary" tables to be on-demand
      • I hope that typeck doesn't need these results; I think they are only needed by borrowck, but I have to verify
    • step 2: have that construction also demand the signatures of functions that are called and the typeck tables, so that we can find out what methods are being called
      • if that doesn't work, a bit more refactoring will be needed
      • if it does work, we should be able to use that to compute the temporary tables easily enough...

    Niko Matsakis at 2017-03-02 16:07:25

  12. I'm wondering about how this feature will interact with non-lexical lifetimes. Consider the following:

    fn main() {
        let x;
        {
            x = &String::from("foo");
        }
    }
    

    Currently, this results in an error: temporary value dropped here while still borrowed. Under RFC 66, the lifetime of the temporary value (the result of String::from("foo")) would be extended to encompass the lifetime of the references to it. The original RFC is pretty unspecific about the exact behavior of this extension.

    My question: with NLLs, the lifetime of the reference in x is shortened:

    fn main() {
        let x;
        {
            x = &String::from("foo");
            // `x` is dead here
            // Is the `String` dropped here?
        }
        // Or is it dropped here?
    }
    

    Should the lifetime of the temporary created by String::from be extended to match the lexical (scope-based) lifetime of x, or would the compiler see that the reference stored in x is dead and drop the String immediately?

    Taylor Cramer at 2017-08-07 17:33:54
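
As a point of reference for the question above, this sketch shows what the current rules accept. The function names are made up. Extension applies when the borrow is the initializer of a `let`, but not when the reference is assigned to an already declared variable, so the deferred form needs an explicit owner.

```rust
// Accepted today: the temporary `String` is extended to the end of the
// block because it is borrowed directly in the `let` initializer.
fn extended_in_let() -> usize {
    let x = &String::from("foo");
    x.len()
}

// Deferred initialization: the value needs its own binding.
fn assigned_later() -> usize {
    let x;
    let owner = String::from("foo");
    x = &owner;
    x.len()
}

// Rejected today (E0716) once `x` is read after the inner block:
//
//     let x;
//     {
//         x = &String::from("foo");
//     }
//     println!("{}", x);
```

Both functions return 3. The sketch says nothing about where the `String` would be dropped under RFC 66; that is exactly what the comment asks.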

  13. Just throwing my opinion at this one. It's been one of the more frustrating issues with getting myself acquainted with Rust. A lot of popular languages now actually encourage method chaining. C#'s LINQ being the prime example.

    var peopleOldEnough = people.Where(p => p.Age >= 18).Select(p => p.FirstName);
    

    Not being able to call methods in a way that, at this point, feels natural when there's not a compelling reason why it needs to be that way only serves to act as a barrier between programmers and the language. I would love if this RFC got more attention. I bet this issue would be generating a lot of buzz if people knew

    • It's probably 100% possible to implement today
    • There is an existing backwards-compatible RFC for this feature

    I understand why this isn't a high-priority issue. But, the beginner experience should be considered one of the most crucial things to nail down to drive Rust adoption. And, as a beginner, this is a bit of a roadblock.

    Nate Pisarski at 2017-12-11 02:39:17

  14. Yes, I hit it a few times per week and get upset when I need to write a sequence of let statements instead of a chain. A chain emphasizes that the final result is the most important thing in a statement. It reduces the mental load on the programmer or reader. A sequence of let identifiers makes every let-defined variable equally important for a reader to follow, only for them to realise in the end that all but the last variable do not really matter. A clear productivity loss even for experienced programmers, in my opinion. The longer this stays open, the more loss accumulates.

    Andrey at 2017-12-13 10:19:08

  15. Does thinking about this as sets of constraints help?

    http://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/

    It certainly would make Rust a lot more approachable if a variable lived for as long as it was needed. We could use tooling to make it clear where an expression's lifetime ends. While we lose a little on explicitness, we gain a lot on readability and coding ergonomics.

    Squirrel at 2018-05-15 10:36:24

  16. @gilescope

    It certainly would make Rust a lot more approachable if a variable lived for as long as it was needed.

    Note that we are not proposing that — or at least, we are not proposing that we use lifetime inference to decide when a destructor runs. That would prevent us from making improvements like NLL without changing the runtime semantics of code, which is not good. This RFC is actually much more limited — it runs before any inference etc has been done.

    I've not carved out any time for my expanded version of the RFC -- I should at least push it somewhere I guess -- but I still think this is a good idea =)

    Niko Matsakis at 2018-05-15 16:46:13
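
The distinction drawn in this reply, that destructor timing follows scopes rather than borrow liveness, can be seen in a small sketch. `Guard` and `drop_follows_scope` are hypothetical names, not anything from the compiler or the RFC. The guard is last used on the second line of the block, yet its destructor still runs only when the block ends.

```rust
use std::cell::RefCell;

// Hypothetical guard that records its own use and destruction.
struct Guard<'a>(&'a RefCell<Vec<&'static str>>);

impl Guard<'_> {
    fn touch(&self) {
        self.0.borrow_mut().push("guard used");
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.borrow_mut().push("guard dropped");
    }
}

fn drop_follows_scope() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let guard = Guard(&log);
        guard.touch(); // last use of `guard`
        log.borrow_mut().push("after last use");
        // `guard` is dropped here, at the end of the block, not after its last use.
    }
    log.into_inner()
}
```

The returned log is `guard used`, `after last use`, `guard dropped`. Non-lexical lifetimes changed which borrows are accepted, not the point at which this destructor runs.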

  17. Forum discussion of cases where this is a semantic breaking change:

    https://internals.rust-lang.org/t/request-for-a-smarter-lifetime-format-as-str-should-be-valid/12423/13

    Matt Brubeck at 2020-05-28 16:47:40

  18. We discussed this in today's @rust-lang/lang meeting, and this needs some further design work in order to be actionable. We still think this is a good idea, but it needs an owner and some specific design work.

    Josh Triplett at 2021-11-10 18:26:03

  19. Some implementation notes and other thoughts here:

    https://hackmd.io/TT5mmiwtTZKESnlPT_2Fnw

    Niko Matsakis at 2022-02-17 19:47:59

  20. @dingxiangfei2009 and I just had a call discussing how to implement this RFC.

    We settled on an implementation plan like this:

    • Refactor the code that computes the temporary scope information:
      • Instead of storing that information in the ScopeTree, we would store it in the TypeckResults.

    We can compute the information in typeck_with_fallback, right around here:

    https://github.com/rust-lang/rust/blob/461e8078010433ff7de2db2aaae8a3cfb0847215/compiler/rustc_typeck/src/check/mod.rs#L465-L469

    It needs to be accessed by the "compute generator interiors" code, so it must happen sometime before that. It should happen right around the point where we analyze closure upvars, so that type info is available.

    This initial PR would not affect semantics. Once it is done, we can modify the code to examine type signatures and extend the lifetime of call arguments.

    Niko Matsakis at 2022-03-17 13:33:39

  21. If you just want to avoid having a bunch of let bindings everywhere, I realized you can get around that by doing it in a macro.

    macro_rules! baz {
        ( $x:ident ) => {
            // `k` owns the intermediate value so that `$x` can borrow from it.
            let mut k = foo();
            let mut $x = k.bar();
        };
    }
    

    Ken Reed at 2023-06-26 16:27:53
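
A possible way to use the macro above, assuming `foo()` returns an owned value and `bar()` borrows from it. `Builder`, its field, and `use_macro` are invented here so the sketch is self-contained. The macro expands to two `let` statements in the caller's block, so the hidden owner `k` stays alive for as long as the caller's binding does.

```rust
// Hypothetical owner type standing in for whatever `foo()` returns.
struct Builder {
    parts: Vec<&'static str>,
}

fn foo() -> Builder {
    Builder { parts: vec!["a", "b"] }
}

impl Builder {
    // Returns a borrow into the owner, like the `bar()` in the comment above.
    fn bar(&mut self) -> &mut Vec<&'static str> {
        &mut self.parts
    }
}

macro_rules! baz {
    ( $x:ident ) => {
        // Hygiene keeps `k` out of the caller's namespace, but it still
        // lives in the caller's block.
        let mut k = foo();
        let $x = k.bar();
    };
}

fn use_macro() -> usize {
    baz!(parts);
    parts.push("c");
    parts.len()
}
```

`use_macro()` returns 3. This only hides the extra binding; the owner's destructor still runs at the end of the enclosing block, as with a hand-written `let`.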