Tracking issue for RFC #1909: Unsized Rvalues (unsized_locals, unsized_fn_params)
This is a tracking issue for the RFC "Unsized Rvalues " (rust-lang/rfcs#1909).
Steps:
- [ ] Implement the RFC (cc @rust-lang/compiler -- can anyone write up mentoring instructions?)
- [ ] Adjust documentation (see instructions on forge)
- [ ] Stabilization PR (see instructions on forge)
Blocking bugs for `unsized_fn_params`:
- https://github.com/rust-lang/rust/issues/111175
- https://github.com/rust-lang/rust/issues/115709 (bad interaction with extern_type: we either need to be okay with post-mono checks or need a trait for "dynamically sized" types)
- Reject unsized arguments for functions with non-Rust ABI
Related bugs:
- [x] https://github.com/rust-lang/rust/issues/61335 -- ICE when combined with async-await
- [x] https://github.com/rust-lang/rust/issues/68304 -- `Box<dyn FnOnce>` doesn't respect self alignment
Unresolved questions:
- [ ] What are the MIR semantics for unsized locals? We currently do not have operational semantics for them, and the way they currently work, there are no good operational semantics. This needs a complete from-scratch re-design.
- [ ] Can we carve out a path of "guaranteed no alloca" optimization? (See #68304 for some related discussion)
- [ ] Given that LLVM doesn't seem to support alloca with alignment, how do we expect to respect alignment limitations? (See #68304 for one specific instance)
- [ ] How can we mitigate the risk of unintended unsized or large allocas? Note that the problem already exists today with large structs/arrays. A MIR lint against large/variable stack sizes would probably help users avoid these stack overflows. Do we want it in Clippy? rustc?
- [ ] How do we handle truly-unsized DSTs when we get them? They can theoretically be passed to functions, but they can never be put in temporaries.
- [ ] Decide on a concrete syntax for VLAs.
- [ ] What about the interactions between async-await/generators and unsized locals?
- [ ] We currently allow `extern type` arguments with `unsized_fn_params`, but that does not make much sense and leads to ICEs: https://github.com/rust-lang/rust/issues/115709
Niko Matsakis at 2019-06-04 17:18:01
> How do we handle truly-unsized DSTs when we get them?

@aturon: Are you referring to `extern type`?
Aaron Hill at 2018-02-20 21:52:05
@Aaron1011 that was copied straight from the RFC. But yes, I presume that's what it's referring to.
Aaron Turon at 2018-02-20 21:59:47
Why would unsized temporaries ever be necessary? The only way it would make sense to pass them as arguments would be by fat pointer, and I cannot think of a situation that would require the memory to be copied/moved. They cannot be assigned or returned from functions under the RFC. Unsized local variables could also be treated as pointers.
In other words, is there any reason why unsized temporary elision shouldn't be always guaranteed?
Lance Roy at 2018-02-28 03:36:15
Is there any progress on this issue? I'm trying to implement VLAs in the compiler. For the AST and HIR part, I added a new enum member to `syntax::ast::ExprKind::Repeat` and `hir::Expr_::ExprRepeat` to save the count expression, as below:

    enum RepeatSyntax { Dyn, None }
    syntax::ast::ExprKind::Repeat(P<Expr>, P<Expr>, RepeatSyntax)

    enum RepeatExprCount {
        Const(BodyId),
        Dyn(P<Expr>),
    }
    hir::Expr_::ExprRepeat(P<Expr>, RepeatExprCount)

But for the MIR part, I have no idea how to construct correct MIR. Should I update the structure of `mir::RValue::Repeat` and the corresponding `trans_rvalue` function? What should they look like? What is the expected LLVM IR?
Thanks in advance if someone would like to write up simple mentoring instructions.
F001 at 2018-05-11 09:55:26
I'm trying to remove the `Sized` bounds and translate MIR accordingly.
Masaki Hara at 2018-05-26 14:23:00
An alternative that would solve both of the unresolved questions would be explicit `&move` references. We could have an explicit `alloca!` expression that returns `&move T`, and truly unsized types work with `&move T` because it is just a pointer.
If I remember correctly, the main reason for this RFC was to get `dyn FnOnce()` to be callable. Since `FnOnce()` is not implementable in stable Rust, would it be a backward-compatible change to make `FnOnce::call_once` take `&move Self` instead? If that was the case, then we could make `&move FnOnce()` be callable, as well as `Box<FnOnce()>` (via `DerefMove`).
cc @arielb1 (RFC author) @qnighy (currently implementing this RFC in #51131) @eddyb (knows a lot about this stuff)
Michael Hewson at 2018-07-14 19:02:42
@mikeyhew There's not really much of a problem with making by-value `self` work and IMO it's more ergonomic anyway. We might eventually even have `DerefMove` without `&move` at all.
Eduard-Mihai Burtescu at 2018-07-14 20:15:21
@eddyb
I guess I can see why people think it's more ergonomic: in order to opt into it, you just have to add `?Sized` to your function signature, or in the case of trait methods, do nothing. And maybe it will help new users of the language, since `&move` wouldn't show up in documentation everywhere.
If we're going to go ahead with this implicit syntax, then there are a few details that would be good to nail down:
- If this is syntactic sugar for `&move` references, what does it desugar to? For function arguments, this could be pretty straightforward: the lifetime of the reference would be limited to the function call, and if you want to extend it past that, you'd have to use explicit `&move` references. So

      fn call_once(f: FnOnce(i32)) -> i32

  desugars to

      fn call_once(f: &move FnOnce(i32)) -> i32

  and you can call the function directly on its argument, so `foo(|x| x + 1)` desugars to `foo(&move (|x| x + 1))`.
  And to do something fancier, you'd have to resort to the explicit version:

      fn make_owned_pin<'a, T: 'a + ?Sized>(value: &'a move T) -> PinMove<'a, T> { ... }
      struct Thunk<'a> { f: &'a move FnOnce() }

  Given the above semantics, `DerefMove` could be expressed using unsized rvalues, as you said:
  EDIT: This is kind of sketchy though. What happens if the implementation is wrong, and doesn't call `f`?

      // this is the "closure" version of DerefMove. The alternative would be to have an associated type
      // `Cleanup` and return `(Self::Target, Self::Cleanup)`, but that wouldn't work with unsized
      // rvalues because you can't return a DST by value
      fn deref_move<F: FnOnce(Self::Target) -> O, O>(self, f: F) -> O;

      // explicit form
      fn deref_move<F: for<'a> FnOnce(&'a move Self::Target) -> O, O>(&'a move self, f: F) -> O;

  I should probably write an RFC for this.
- When do there need to be implicit allocas? I can't actually think of a case where an implicit alloca would be needed. Any function arguments would last as long as the function does, and wouldn't need to be alloca'd. Maybe something involving stack-allocated dynamic arrays, if they are returned from a block, but I'm pretty sure that's explicitly disallowed by the RFC.

Michael Hewson at 2018-07-20 03:49:03
@eddyb have you seen @alercah's RFC for DerefMove? https://github.com/rust-lang/rfcs/pull/2439
Michael Hewson at 2018-07-20 05:02:04
As a next step, I'll be working on trait object safety.
Masaki Hara at 2018-08-20 00:25:36
@mikeyhew Sadly @alercah just postponed their `DerefMove` RFC, but I think a separate RFC for `&move` that complements that (when it does get revived) would be very much desirable. I would be glad to assist with that even, if you're interested.
Alexander Regueiro at 2018-08-22 01:59:08
@alexreg I would definitely appreciate your help, if I end up writing an RFC for `&move`.
The idea I have so far is to treat unsized rvalues as a sort of sugar for `&move` references with an implicit lifetime. So if a function argument has type `T`, it will either be passed by value (if `T` is `Sized`) or as a `&'a move T`, and the lifetime `'a` of the reference will outlive the function call, but we can't assume any more than that. For an unsized local variable, the lifetime would be the variable's scope. If you want something that lives longer than that, e.g. you want to take an unsized value and return it, you'd have to use an explicit `&move` reference so that the borrow checker can make sure it lives long enough.
Michael Hewson at 2018-08-23 03:53:40
@mikeyhew That sounds like a reasonable approach to me. Has anyone specified the supposed semantics of `&move` yet, even informally? (Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it `&own`.)
Alexander Regueiro at 2018-08-24 03:19:41
Not sure if this is the right place to document this, but I found a way to make a subset of unsized returns (technically, all of them, given a `T -> Box<T>` lang item) work without ABI (LLVM) support:
- only `Rust` ABI functions can return unsized types
- instead of passing a return pointer in the call ABI, we pass a return continuation
  - we can already pass unsized values to functions, so if we could CPS-convert Rust functions (or wanted to), we'd be done (at the cost of a stack that keeps growing)
  - @nikomatsakis came up with something similar (but only for `Box`) a few years ago
- however, only the callee (potentially a virtual method) needs to be CPS-like, and only in the ABI, the callers can be restricted and/or rely on dynamic allocation, not get CPS-transformed
  - while `Clone` becoming object-safe is harder, this is an alright starting point:

        // Rust definitions
        trait CloneAs<T: ?Sized> {
            fn clone_as(&self) -> T;
        }
        impl<T: Trait + Clone> CloneAs<dyn Trait> for T {
            fn clone_as(&self) -> dyn Trait { self.clone() }
        }
        trait Trait: CloneAs<dyn Trait> {}

        // Call ABI signature for `<dyn Trait as CloneAs<dyn Trait>>::clone_as`
        fn(
            // opaque pointer passed to `ret` as the first argument
            ret_opaque: *(),
            // called to return the unsized value
            ret: fn(
                // `ret_opaque` from above
                opaque: *(),
                // the `dyn Trait` return value's components
                ptr: *(),
                vtable: *(),
            ) -> (),
            // `self: &dyn Trait`'s components
            self_ptr: *(),
            self_vtable: *(),
        ) -> ()

  - the caller would use the `ret_opaque` pointer to pass one or more sized values to its stack frame
    - could allow `ret` to return one or two pointer-sized values, but that's an optional optimization
- we can start by allowing composed calls, of this MIR shape:

      y = call f(x); // returns an unsized value
      z = call g(y); // takes the unsized value and returns a sized one

      // by compiling it into:
      f(&mut z, |z, y| { *z = call g(y); }, x)

  - this should work out of the box for `{Box,Rc,...}::new(obj.clone_as())`
- while we could extract entire portions of the MIR into these "return continuations", that's not necessary for being able to express most things: worst case, you write a separate function
- since `Box::new` works, anything with a global allocator around could fall back to that
  - `let y = f(x);` would work as well as `let y = *Box::new(f(x));`
  - its cost might be a bit high, but so would that of a proper "unsized return" ABI
- we can, at any point, switch to an ABI where e.g. the value is copied onto the caller's stack, effectively "extending it on return", and there shouldn't be any observable differences

cc @rust-lang/compiler
Eduard-Mihai Burtescu at 2018-08-25 16:27:22
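The return-continuation idea in the comment above can be approximated in library code on stable Rust. The following is only a sketch under stated assumptions: `Shape`, `Square`, `clone_with` and `cloned_area` are hypothetical names, and the continuation receives a borrowed `&dyn Shape` rather than an owned unsized value, so it imitates the calling shape but not the proposed ABI.

```rust
/// Hypothetical trait: instead of returning an unsized `dyn Shape` by value,
/// `clone_with` hands the clone to a continuation supplied by the caller.
pub trait Shape {
    fn area(&self) -> f64;
    fn clone_with(&self, ret: &mut dyn FnMut(&dyn Shape));
}

#[derive(Clone)]
pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }

    fn clone_with(&self, ret: &mut dyn FnMut(&dyn Shape)) {
        // The clone lives in this (callee) stack frame while `ret` runs.
        let copy = self.clone();
        ret(&copy);
    }
}

/// Mirrors the composed-call shape `z = g(f(x))`: the caller provides storage
/// for the sized result `z`, and the continuation fills it in.
pub fn cloned_area(shape: &dyn Shape) -> f64 {
    let mut z = 0.0;
    shape.clone_with(&mut |y: &dyn Shape| {
        z = y.area();
    });
    z
}
```

As in the comment's scheme, the callee's frame stays alive for as long as the continuation runs, which is where the extra peak stack usage discussed later in the thread comes from.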
@alexreg

> Has anyone specified the supposed semantics of `&move` yet, even informally?

I don't think it's been formally specified. Informally, `&'a move T` is a reference that owns its `T`. It's like
- an `&'a mut T` that owns the `T` instead of mutably borrowing it, and therefore drops the `T` when dropped, or
- a `Box<T>` that is only valid for the lifetime `'a`, and doesn't free heap allocated memory when dropped (but still drops the `T`).

> (Also, I'm not sure if bikeshedding on this has already been done, but we should probably consider calling it `&own`.)

Don't think that bikeshed has been painted yet. I guess `&own` is better. It requires a new keyword, but afaik it can be a contextual keyword, and it more accurately describes what is going on. Oftentimes you would use it to avoid moving something in memory, so calling it `&move T` would be confusing, and plus there's the problem of `&move ||{}`, which looks like `&move (||{})` but would have to mean `& (move ||{})` for backward compatibility.
Michael Hewson at 2018-08-25 20:40:59
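The informal description above (owns the `T`, drops it, but does not free its storage) can be sketched as a library type on stable Rust. This is a hypothetical stand-in, not the proposed language feature: `OwnRef`, `Noisy` and `drops_exactly_once` are illustrative names, and the constructor has to be `unsafe` because a library cannot stop the caller from touching the slot again afterwards.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for `&'a move T` / `&'a own T`: it owns the `T`
/// (and drops it) but not the memory the `T` is stored in.
pub struct OwnRef<'a, T: ?Sized> {
    ptr: *mut T,
    _borrow: PhantomData<&'a mut T>,
}

impl<'a, T> OwnRef<'a, T> {
    /// # Safety
    /// The caller must not read or drop the contents of `slot` again once
    /// the returned `OwnRef` exists.
    pub unsafe fn new(slot: &'a mut ManuallyDrop<T>) -> Self {
        let ptr: *mut T = &mut **slot;
        OwnRef { ptr, _borrow: PhantomData }
    }
}

impl<'a, T: ?Sized> Deref for OwnRef<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.ptr }
    }
}

impl<'a, T: ?Sized> DerefMut for OwnRef<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.ptr }
    }
}

impl<'a, T: ?Sized> Drop for OwnRef<'a, T> {
    fn drop(&mut self) {
        // Runs `T`'s destructor in place; the storage is released by
        // whoever owns the slot (here: the caller's stack frame).
        unsafe { std::ptr::drop_in_place(self.ptr) }
    }
}

/// Counts how often it is dropped.
struct Noisy<'c>(&'c Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

pub fn drops_exactly_once() -> u32 {
    let drops = Cell::new(0);
    let mut slot = ManuallyDrop::new(Noisy(&drops));
    {
        // SAFETY: `slot` is never read or dropped again below.
        let owned = unsafe { OwnRef::new(&mut slot) };
        let _borrowed: &Noisy = &owned; // usable like an ordinary reference
    } // `owned` goes out of scope here and runs `Noisy::drop` in place
    drops.get()
}
```

The `ManuallyDrop` slot plays the role of the caller-provided memory: it outlives the `OwnRef` but never runs the destructor itself.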
@mikeyhew Oh, I'm sorry I haven't replied to the `&move` thread, only noticed just now that some of it was addressed at me. I don't think `&move` is a good desugaring for unsized values.
At most, `&move T` (without an explicit lifetime), is an ABI detail of passing down a `T` argument "by reference" - we already do this for types larger than registers, and it naturally extends to unsized `T`.
And even with `&move T` in the language, you don't get a mechanism for returning "detached" ownership of an unsized value, based solely on it, as returning `&'a move T` means the `T` value must live "higher-up" (i.e. `'a` is for how long the caller provided the memory space for the `T` value).
My previous comment, https://github.com/rust-lang/rust/issues/48055#issuecomment-415980600 provides one such mechanism, that we can implement today (others exist but require e.g. adding a new calling convention to LLVM). Only one such mechanism, if general enough to support all callees (of `Rust` ABI), is needed, in order to support the surface-level feature, and its details can/should remain implementation-defined.
So IMO `&move T` is orthogonal and the only intersection here is that it "happens to" match what some ABIs do when they have to pass large (or in this case, of unknown size) values in calls. I do want something like `&move T` for opting into `Box`'s behavior from any library etc. but not for this.
Eduard-Mihai Burtescu at 2018-08-26 06:27:23
@eddyb
Automatic boxing by the compiler feels weird to me (it occurs only in this special case), and people will do `MyRef::new(x.clone(), do_xyz()?)`, which feels like it would require autoboxing to implement sanely.
However, we can force unsized returns to always happen the "right way" (i.e., to immediately be passed to a function with an unsized parameter) by a check in MIR. Maybe we should do that?
Ariel Ben-Yehuda at 2018-08-26 07:59:24
@arielb1 Yes, I'm suggesting that we can start with that MIR check and maybe add autoboxing if we want to (and only if an allocator is even available).
Eduard-Mihai Burtescu at 2018-08-26 09:44:40
A check that the unsized return value is immediately passed to another function, which can then be CPS'd without all the usual problems of CPS in Rust, sounds like it would work well enough technically. However, the user experience might be non-obvious:
- The restriction on what to do with the return value can be learned, but is still weird
- Having to CPS-transform manually in more complex scenarios is quite annoying
- It affects peak stack usage in a way that is not visible from the source code and completely surprising unless one knows how it's implemented
I think it's worthwhile thinking long and hard whether there might be more tailored features that address the usage patterns that we want to enable with unsized return values. For example, cloning trait objects might also be achieved by extending `box`.
Hanna Kruppe at 2018-08-28 14:48:06
@rkruppe If we can manage to "pass" unsized values down the stack without actually copying them, I don't think the stack usage should actually increase from what you would need to e.g. call `Box::new`.

> For example, cloning trait objects might also be achieved by extending `box`.

Sure, but, AFAIK, all ideas, until now, for "emplace" / boxing, seemed to involve generics, whereas my scheme operates at the ABI level and requires making no assumptions about the unsized type. (e.g. @nikomatsakis' previous ideas relied on the fact that `Clone::clone` returns `Self`, therefore, you can get the size/alignment from its `self` value, and allocate the destination before calling it)
Eduard-Mihai Burtescu at 2018-08-29 01:26:07
> @rkruppe If we can manage to "pass" unsized values down the stack without actually copying them, I don't think the stack usage should actually increase from what you would need to e.g. call `Box::new`.

The stack size increase is not from the unsized value but from the rest of the stack frame of the unsized-value-returning function. For example,

    fn new() -> dyn Debug {
        let _buf1 = [0u8; 1 << 10]; // this remains allocated "after returning" ...
        "hello world" // ... because here we actually call take()
    }
    fn take(arg: dyn Debug) {
        let _buf2 = [0u8; 1 << 10];
        println!("{:?}", arg);
    }
    fn foo() {
        take(new()); // 2 KB peak stack usage, not 1 KB
    }

(And we can't do tail call optimization because `new` is passing a pointer to its stack frame to the continuation, so its stack frame needs to be preserved until after the call.)
Hanna Kruppe at 2018-08-29 11:13:55
@rkruppe I assume you meant `take` instead of `arg`? ~~However, I think you need to take into account that `_buf1` is not live when the function returns and the continuation gets called. At that point, only a copy of `"hello world"` should be left on the stack.~~
~~In other words, your `new` function is entirely equivalent to this:~~

    fn new() -> dyn Debug {
        let return_value: dyn Debug = {
            let _buf1 = [0u8; 1 << 10];
            "hello world"
        };
        return_value
    }

EDIT: @rkruppe clarified below.
Eduard-Mihai Burtescu at 2018-09-08 12:08:49
We discussed this on IRC; for the record: LLVM does not shrink the stack frame (by adjusting the stack pointer), never for static allocas and only on explicit request (`stacksave`/`stackrestore` intrinsics) for dynamic allocas.
Hanna Kruppe at 2018-09-08 21:58:42
@eddyb would you help me with a design decision? For by-value trait object safety, I need to insert shims to force the receiver to be passed by reference in the ABI. For example, for

    trait Foo { fn foo(self); }

to be object safe, `<Foo as Foo>::foo` should be able to call `<T as Foo>::foo` without knowing `T`. However, concrete `<T as Foo>::foo`s may receive `self` directly, making it difficult to call the method without knowledge of `T`. Therefore we need a shim of `<T as Foo>::foo` that always receives `self` indirectly.
The problem is: how do I introduce two instances of `<i32 as Foo>::foo`, for example? AFAICT there are two ways:
- Introduce a new DefPath `Foo::foo::{{vtable-shim}}`. This is rather straightforward but we'll need to change all the related IDs: `NodeId`, `HirId`, `DefId`, `Node`, and `DefPath`. This seems too much for mere shims.
- Use the same DefPath `Foo::foo`, but include another salt for the symbol hash to distinguish it from the original `Foo::foo`. It still affects `rustc::ty` but will not change `syntax`.

I've been mainly working with the first approach, but am being attracted by the second one. I'd like to know if there are any circumstances in which we should prefer either of these.
Masaki Hara at 2018-09-09 06:07:28
I'd think you'd use a new variant in `InstanceDef`, and manually craft it, when creating the vtable. And in the MIR shim code, you'd need to handle that new variant. I would expect that would be enough for the symbols to be distinct (if not, that's a separate bug).
Eduard-Mihai Burtescu at 2018-09-09 06:19:39
With respect to the surface syntax for VLAs, I'm highly (to put it mildly) skeptical of permitting `[T; n]` because:
- the risk of introducing accidental VLAs is too high, particularly for folks who are used to `new int[n]` (Java).
- `n` may be a `const n: usize` parameter and therefore "captures no variables" is not enough.

I'd suggest that we should use `[T; dyn expr]` or `[T; let expr]` to distinguish; this also has the advantage that you don't have to store anything in a temporary.
I'll add an unresolved question for this.
Mazdak Farrokhzad at 2018-10-07 10:20:40
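To make the second bullet above concrete, here is a small sketch of what `[T; N]` already means on stable Rust when `N` is a const generic parameter, next to the usual heap-based substitute for a runtime length. The function names `zeros` and `zeros_dyn` are illustrative only.

```rust
/// `N` is a const generic parameter, so `[0u8; N]` is an ordinary fixed-size
/// array: its length is known once the function is monomorphized.
pub fn zeros<const N: usize>() -> [u8; N] {
    [0u8; N]
}

/// A runtime length has no array syntax on stable Rust today; writing
/// `[0u8; n]` with a runtime `n` is rejected, and a heap-allocated `Vec`
/// is the usual substitute.
pub fn zeros_dyn(n: usize) -> Vec<u8> {
    vec![0u8; n]
}
```

Because both forms would read as `[0u8; something]` if plain `[T; n]` were allowed for VLAs, a distinct marker such as the suggested `[T; dyn expr]` would keep the stack-allocating case visible at the use site.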
`[T; dyn expr]` makes sense to me, for the reason you suggest.
Alexander Regueiro at 2018-10-07 14:17:26
As #54183 is merged, `<dyn FnOnce>::call_once` is callable now! I'm going to work on these two things:
Next step I: unsized temporary elision
As a next step, I'm thinking of implementing unsized temporary elision. I think we can do this as follows: firstly, we can restrict allocas to "unsized rvalue generation sites". The only unsized rvalue generation sites we already have are dereferences of `Box`. We can codegen other unsized rvalue assignments as mere copies of references.
In addition to that, we can probably infer lifetimes of each unsized rvalue using the borrow checker. Then, if the origin of an unsized rvalue generation site lives longer than required, we can elide the alloca there.
Next step II: `Box<FnOnce>`
Although it is not strictly part of #48055, it would be useful to implement `FnOnce` for `Box<impl FnOnce>`. I think I can do this without making it insta-stable or immediately breaking `fnbox` (#28796), thanks to specialization.
Masaki Hara at 2018-10-27 23:33:01
@qnighy Great work. Thanks for your ongoing efforts with this... and I agree, those seem like good next steps to take.
Alexander Regueiro at 2018-10-27 23:58:27
Ah, annotating a trait impl with `#[unstable]` doesn't make sense? `impl CoerceUnsized<NonNull<U>> for NonNull<T>` has one but doesn't show any message on the doc. It can be used without features. Does anyone have an idea to prevent `impl<F: FnOnce> FnOnce for Box<F>` from being insta-stable?
Masaki Hara at 2018-10-28 02:21:22
Calling `Box<FnOnce()>` seems to work now:

    #![feature(unsized_locals)]
    fn a(f: Box<FnOnce()>) { f() }

Simon Sapin at 2018-10-28 13:56:31
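For comparison, a boxed one-shot closure can be called on current stable Rust without any feature gate (as far as I know, since Rust 1.35, when `FnOnce` was implemented for `Box<F>` with `F: ?Sized`). The sketch below uses today's `dyn` spelling; `call_boxed` and `demo` are illustrative names.

```rust
/// Calls a boxed one-shot closure by value.
pub fn call_boxed(f: Box<dyn FnOnce() -> u32>) -> u32 {
    f()
}

pub fn demo() -> u32 {
    let owned = String::from("abc");
    // The closure takes ownership of `owned`, so it can only be `FnOnce`
    // if it consumes it; here it just reads the length once.
    call_boxed(Box::new(move || owned.len() as u32))
}
```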
Indeed, stability attributes do not work on `impl` (unfortunately). As far as I know there is no way to add unstable impls if both the trait and the type are stable.
However `FnBox` itself is unstable, so breaking it is acceptable if a migration path is immediately available. (Even though we should prefer not to if there’s a reasonable alternative.)
Simon Sapin at 2018-10-28 14:06:51
> In addition to that, we can probably infer lifetimes of each unsized rvalue using the borrow checker. Then, if the origin of an unsized rvalue generation site lives longer than required, we can elide the alloca there.

This sounds like a full-on MIR local reuse optimization, which may seem deceptively simple at first, but requires quite the complex analysis of all possible interactions in time, including across loop iterations. I had something at some point that could even apply here, but I'm not sure it could be landed any time soon.
Eduard-Mihai Burtescu at 2018-11-05 06:05:25
Since unsized values can be used in more circumstances, should we add a `T: ?Sized` bound for more types in std? For example:

    pub enum Option<T: ?Sized> {}
    pub enum Result<T: ?Sized, E> {}
    pub struct AssertUnwindSafe<T: ?Sized>(T);

F001 at 2018-11-08 03:28:19
@F001 Unsized enums are disallowed today and part of it has to do with the fact that all enum layout optimizations would have to either be disabled or reified into the vtable.
Eduard-Mihai Burtescu at 2018-11-08 08:27:25
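For contrast with enums, stable Rust already lets a struct carry an unsized type in its last field and reach it through an unsizing coercion behind a reference. A minimal sketch (the `Tagged` type and the helper functions are hypothetical names):

```rust
/// A struct may be generic over an unsized type in its last field.
pub struct Tagged<T: ?Sized> {
    pub tag: u32,
    pub value: T,
}

/// Works for any length, because the slice length travels in the fat pointer.
pub fn sum(t: &Tagged<[i32]>) -> i32 {
    t.value.iter().sum()
}

pub fn demo() -> (u32, i32) {
    let sized = Tagged { tag: 7, value: [1, 2, 3] };
    // Unsizing coercion: `&Tagged<[i32; 3]>` becomes `&Tagged<[i32]>`.
    let erased: &Tagged<[i32]> = &sized;
    (erased.tag, sum(erased))
}
```

An enum has no equivalent today: a variant holding a bare `T: ?Sized` is rejected, for the layout reasons given in the reply above.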
While playing with the `unsized_locals` feature, I ran into a stumbling block: `mem::forget()` and `mem::drop()` don't accept `T: ?Sized`.
What's the plan for allowing `?Sized` types in these and related functions?
Deleted user at 2018-11-08 12:35:12
@stjepang ~~sounds like an oversight~~, feel free to open a PR to relax those bounds. EDIT: we probably need to go around and see what APIs can benefit from this now.
Eduard-Mihai Burtescu at 2018-11-08 13:10:13
@eddyb shouldn't we wait until `unsized_locals` is stabilized before doing that?
Mazdak Farrokhzad at 2018-11-08 19:19:41
@Centril It seems unlikely that we'd see anyone relying on them accepting `?Sized` on stable without using the `unsized_locals` feature, but I suppose it is technically possible to do so.
Taylor Cramer at 2018-11-08 19:58:48
@cramertj I don't see how, moves ~~aren't~~ shouldn't be allowed out of `!Sized` places, on stable. (looks like my intuition was wrong, and might have some bugs, see https://github.com/rust-lang/rust/pull/55785#issuecomment-437162862)
Eduard-Mihai Burtescu at 2018-11-08 22:15:28
@eddyb You can observe that the function can be instantiated with those type params, just not called with them:

    #![feature(unsized_locals)] // for the fn declaration
    fn my_drop<T>(_: T) {}
    fn my_unsized_drop<T: ?Sized>(_: T) {}
    trait Trait {}
    fn main() {
        let _ = my_drop::<()>;
        // let _ = my_drop::<dyn Trait>; // This one won't compile
        let _ = my_unsized_drop::<()>;
        let _ = my_unsized_drop::<dyn Trait>; // This will compile
    }

Taylor Cramer at 2018-11-08 22:27:26
As I said in https://github.com/rust-lang/rust/pull/51131#issuecomment-394076603, I once attempted to add a `Sized`ness check for function arguments (it would solve https://github.com/rust-lang/rust/issues/50940 too); that, however, turned out to be difficult. The problem case was something like this:

    pub trait Pattern<'a> {
        type Searcher;
    }
    fn clone<'a, P: Pattern<'a>>(this: &MatchesInternal<'a, P>) -> MatchesInternal<'a, P>
    where
        P::Searcher: Clone,
    {
        MatchesInternal(this.0.clone())
    }
    pub struct MatchesInternal<'a, P: Pattern<'a>>(P::Searcher);
    fn main() {}

Here `P::Searcher: Clone` implies `P::Searcher: Sized`. As bounds take precedence in trait selection, during type inference, `P::Searcher` arbitrarily appears in places where `Sized` is demanded. That (in combination with the additional `Sized` bounds) broke compilation of this simple code.
Therefore, I'm thinking of adding an independent pass (or adding a check to an existing pass) after typeck to ensure `Sized`ness (although I have other things to do regarding unsized rvalues).
Masaki Hara at 2018-11-09 14:20:51
(moved from internals)
I recently made a lot of mistakes by missing `impl` in function signatures. Almost every time, I was protected by the fact that unsized values cannot appear in function signatures, in either argument or return position.
For myself, I would turn on `#![deny(bare_trait_objects)]` to gain more protection.
I think this could happen to others as well, so I guess that unless we prohibit a bare `Trait` from appearing in function signatures, stabilizing `unsized_locals` would make things worse, as we would lose this protection. Maybe when it lands it would be a good time to also make `#![deny(bare_trait_objects)]` the default?
earthengine at 2019-01-15 07:21:24
@earthengine I agree. I wish we had implemented this and made `#![deny(bare_trait_objects)]` the default for the 2018 edition, in fact! It's too late now, sadly (until the next edition, at least).
Alexander Regueiro at 2019-01-16 01:58:24
Could it be made deny-by-default only in those places where it was previously disallowed, so that no currently-invalid code becomes accepted?
Alexis Hunt at 2019-01-17 13:51:59
@alercah That sounds complicated due to the way the linting system currently works... maybe possible though. @arielb1, any thoughts?
Alexander Regueiro at 2019-01-17 17:34:06
Maybe make it two separate lints?
Alexis Hunt at 2019-01-17 18:53:41
Yes, that's the obvious solution... seems kind of ugly though?
Alexander Regueiro at 2019-01-17 19:14:13
Could they be pseudonamespaced like clippy lints? `#[deny(bare_trait_objects::all)]` or `#[allow(bare_trait_objects::unsize)]`
Alexis Hunt at 2019-01-17 19:16:26
That's an interesting idea. I don't know, but @arielb1 probably does.
Alexander Regueiro at 2019-01-17 19:25:47
It's possible to have lint groups such that `bare_trait_objects` implies `bare_trait_objects_as_unsized_rvalues`, for instance. That might be a cleaner solution.
varkor at 2019-01-17 19:29:14
Just found that the `Into` trait has an implicit `Sized` bound. `#![feature(unsized_locals)]` would surely make it possible to call `into()` on unsized objects. So shall we relax this by adding a `T: ?Sized` bound? Would there be any compatibility issues?
earthengine at 2019-02-14 03:51:21
Quick question: has anything been written about the interaction between unsized locals and generator functions?
Eg, since generators (including coroutines and async functions) rely on storing their locals in a compiler-generated state machine, how would that state machine be generated when some of these locals are unsized and alloca-stored?
Some possible answers:
- Unsized locals aren't allowed in async functions and other generators.
- Unsized locals aren't allowed to be accessed across yield/.await points.
- Unsized locals are allowed, but the resulting generator/future is unsized as well.
Olivier FAURE at 2020-02-08 14:49:36
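To illustrate why locals that stay alive across a yield point are the hard case, here is a hand-written stand-in for the kind of state machine a generator is lowered to. It is only a sketch: `Steps` and `resume` are hypothetical names, and the real layout is chosen by the compiler. The point is that the surviving local `buf` becomes a field of the state type, so its size has to be known statically.

```rust
/// Hand-written stand-in for a compiler-generated generator state machine.
pub enum Steps {
    /// Not started yet.
    Start,
    /// Suspended at the yield point. `buf` is the local that stays alive
    /// across it, so it must be stored here with a statically known size.
    Suspended { buf: [u8; 4], next: usize },
    /// Finished; nothing left to store.
    Done,
}

impl Steps {
    /// Resumes the computation: `Some(byte)` per yield, `None` when finished.
    pub fn resume(&mut self) -> Option<u8> {
        match std::mem::replace(self, Steps::Done) {
            Steps::Start => {
                let buf = [10, 20, 30, 40];
                *self = Steps::Suspended { buf, next: 1 };
                Some(buf[0])
            }
            Steps::Suspended { buf, next } if next < buf.len() => {
                *self = Steps::Suspended { buf, next: next + 1 };
                Some(buf[next])
            }
            _ => None,
        }
    }
}
```

If `buf` were an unsized local such as `[u8]`, there would be no fixed-size field to put it in, which corresponds to the options listed above: forbid it, forbid only holding it across the yield, or make the whole state machine unsized.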
Good question! I don't think we can permit unsized locals in that case. In some of the original versions of this feature, the intent was to limit unsized locals to cases where they could be codegen'd without alloca -- but I seem to remember we landed on a more expansive version. This seems like an important question to resolve before we go much farther. I'm going to add it to the list of unresolved questions.
Niko Matsakis at 2020-02-14 18:20:27
It's definitely an interesting question: it relates somewhat to the "no recursion" check on async functions too. In both cases, completely disallowing it is actually overly restrictive: the only hard constraint is that those locals/allocas do not live across a yield point.
Another option would be to automatically convert allocas to heap allocations in that case, although Rust doesn't really have any precedent for that sort of implicitness.
Diggory Blake at 2020-05-18 23:19:28
Will there be any way to do an unsized coercion on an unsized local without using dynamic memory allocation? The RFC didn't seem clear on this point. At the moment, as far as I can tell the only way is to go through `Box` and try to get the compiler to optimize out the memory allocation. For example, if you have

    fn run_fn_dyn<'a>(f: dyn FnOnce() -> u32 + 'a) -> u32 {
        f() + 1
    }

and want to run it on a known size `FnOnce() -> u32`, you have to convert it like this:

    fn run_fn<'a, F: FnOnce() -> u32 + 'a>(f: F) -> u32 {
        // With optimizations enabled the dynamic allocation seems to be removed.
        let f = {
            // Declare b in local scope so that it gets dropped before run_fn_dyn is called. Otherwise
            // the compiler isn't smart enough to figure out that the memory allocation is unnecessary
            // and remove it.
            let b = Box::new(f) as Box<dyn FnOnce() -> u32 + 'a>;
            *b
        };
        run_fn_dyn(f)
    }

Lance Roy at 2020-12-20 22:11:57
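On stable Rust, a different and commonly used workaround avoids both the feature gate and the allocation: park the one-shot closure in an `Option` and pass a `&mut dyn FnMut` wrapper that takes it out exactly once. This is not the unsized-parameter approach from the comment above, only a sketch of an alternative; the names mirror the comment's for comparison, but the signature of `run_fn_dyn` is changed.

```rust
/// Dynamically dispatched callee: receives the closure by reference
/// instead of as an unsized by-value argument.
pub fn run_fn_dyn(f: &mut dyn FnMut() -> u32) -> u32 {
    f() + 1
}

pub fn run_fn<F: FnOnce() -> u32>(f: F) -> u32 {
    // Park the one-shot closure in an `Option` so that an `FnMut` wrapper
    // can move it out exactly once, without any heap allocation.
    let mut slot = Some(f);
    let mut call_once = || {
        let f = slot.take().expect("closure called more than once");
        f()
    };
    run_fn_dyn(&mut call_once)
}
```

The trade-off is that the "called at most once" guarantee moves from the type system to a runtime check inside the wrapper.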
I'm not sure exactly how such a data structure (https://stackoverflow.com/questions/70463366/the-data-structure-that-is-the-result-of-stack-based-flattening-of-nested-homoge), for some known dimensionality (= level of nesting), should interact with allocation on the heap. It seems that my data structure must keep track of its lengths in terms of gcd(sizeof(T), sizeof(usize)) and allow conversion to a length in bytes.
EDIT: even better than that, it can track the count of lengths len_count and count of elements elem_count. Then the byte-length of the data structure will be the integer linear combination len_count * sizeof(usize) + elem_count * sizeof(T).
Dmitrii - Demenev at 2021-12-24 01:53:39
@rustbot label F-unsized_fn_params
Jules Bertholet at 2023-05-05 20:09:57
We should probably reject unsized arguments for non-Rust ABIs... it makes little sense to do this with an `extern "C"` function since the C ABI does not support unsized arguments.
Ralf Jung at 2023-11-01 06:53:36
With https://github.com/rust-lang/rust/pull/111374, unsized locals are no longer blatantly unsound. However, they still lack an actual operational semantics in MIR -- and the way they are represented in MIR doesn't lend itself to a sensible semantics; they need a from-scratch re-design I think. We are getting more and more MIR optimizations and without a semantics, the interactions of unsized locals with those optimizations are basically unpredictable.
The issue with their MIR form is that an assignment `let x = y;` gets compiled to MIR like

    StorageLive(x); // allocates the memory for x
    x = Move(y); // copies the data from y to x

However, when `x` is unsized, we cannot allocate the memory for x in the first step, since we don't know how big x is. The IR just fundamentally doesn't make any sense, with the way we now understand `StorageLive` to work.
If they were suggested for addition to rustc today, we'd not accept a PR adding them to MIR without giving them semantics. Unsized locals are the only part of MIR that doesn't even have a proposed semantics that could be implemented in Miri. (We used to have a hack, but I removed it because it was hideous and affected the entire interpreter.) I'm not comfortable having even an unstable feature be in such a bad state, with no sign of improvement for many years. So I still feel that unsized locals should be either re-implemented in a well-designed way, or removed -- the current status is very unsatisfying and prone to bugs. Unstable features are what we use to experiment, and sometimes the result of an experiment is that whatever we do doesn't work and we need to try something else.
(Unsized arguments do not have that issue: function arguments get marked as "live" when the stack frame is pushed, and at that moment we know the values for all the arguments. Allocating the local and initializing it are done together as part of argument passing. That means we can use the size information we get from the value that the caller chose to allocate the right amount of memory in the callee.)
Ralf Jung at 2023-12-03 09:38:16
Sorry I didn't follow the discussion before. I would like to know if `unsized-rvalue` can also solve the "`str` on stack" problem?
For now, if we want to write the equivalent (not exactly, if the terminator `'\0'` is considered) of the C++ code:

    char on_stack_str[] = "hello, world!";

We have to write it in this way, right?

    let on_stack_str: [u8; 13] = *b"hello, world!";
    let on_stack_str: &str = std::str::from_utf8(&on_stack_str).unwrap();

Here, `[u8; _]` and `from_utf8` are obviously useless noise for a literal.
After implementing `unsized-rvalue`, will such a thing become possible?

    let on_stack_str: str = "hello, world!";

A VLA is not even needed here, since the length of the literal is known at compile time.
Asuna at 2023-12-03 10:11:00
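For reference, the stable workaround described above also covers the mutable case, using `std::str::from_utf8_mut`. A minimal sketch (the function name `shout` is illustrative, and the final `to_owned()` only exists so the result can be returned for inspection):

```rust
pub fn shout() -> String {
    // Copy the literal's 13 bytes into a stack array.
    let mut buf: [u8; 13] = *b"hello, world!";
    // View the stack bytes as a mutable `str`; the input is ASCII, so the
    // UTF-8 check cannot fail here.
    let s: &mut str = std::str::from_utf8_mut(&mut buf).expect("literal is valid UTF-8");
    s.make_ascii_uppercase();
    // Heap copy for returning the result; the editing above happened on the stack.
    s.to_owned()
}
```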
@SpriteOvO I'm confused, why can't you write `let s: &str = "hello, world!";`?
What benefits does using an `unsized` variable here give you? It's immutable anyway so why not just use `&str`?
Jiahao XU at 2023-12-03 11:50:54
@SpriteOvO please open a new issue; tracking issues are for tracking the general progress of a feature, not any specific questions.
Ralf Jung at 2023-12-03 12:03:38
> What benefits does using an `unsized` variable here give you? It's immutable anyway so why not just use `&str`?

@NobodyXu The benefit is that it can be mutable, and it's not `'static`, which allows the literal to exist only in the binary's code sections and not in its constant sections.

> please open a new issue; tracking issues are for tracking the general progress of a feature, not any specific questions.

@RalfJung Sorry for the noise, I might consider opening an issue after thinking through more details about this.
Asuna at 2023-12-03 12:06:42