RawVec stores a capacity field even if T is zero-sized
When std::mem::size_of::<T>() == 0, RawVec<T> always has a capacity of std::usize::MAX elements. It currently stores a cap: usize field that always contains this value, but storing it is unnecessary in this case.
@withoutboats this would be a not-so-far-fetched use case for allowing struct specialization, as @arielb1 proposed.
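For reference, the behaviour described above can be checked with a small sketch. The helper name zst_capacity is made up for illustration. The usize::MAX capacity is documented behaviour of Vec::capacity; the printed sizes are only what current compilers produce, not a guarantee.

```rust
use std::mem::size_of;

/// Hypothetical helper: capacity reported by an empty `Vec` of a zero-sized type.
fn zst_capacity() -> usize {
    Vec::<()>::new().capacity()
}

fn main() {
    // Zero-sized elements never need an allocation, so the reported capacity
    // is the maximum value even for an empty vector.
    assert_eq!(zst_capacity(), usize::MAX);

    // Sizes as observed on the compiler in use; the capacity field is still
    // stored even though its value is a constant for zero-sized `T`.
    println!("size_of::<Vec<()>>()  = {}", size_of::<Vec<()>>());
    println!("size_of::<Vec<u64>>() = {}", size_of::<Vec<u64>>());
    println!("size_of::<usize>()    = {}", size_of::<usize>());
}
```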
Time to interpret what https://doc.rust-lang.org/nightly/std/vec/struct.Vec.html#guarantees actually means for this
Most fundamentally, Vec is and always will be a (pointer, capacity, length) triplet. No more, no less. The order of these fields is completely unspecified, and you should use the appropriate methods to modify these. The pointer will never be null, so this type is null-pointer-optimized.
Plain reading says that the capacity field must be there. It's a bit silly, since the only interface with the representation should be with Vec::from_raw_parts, where this "triplet" guarantee isn't needed.
bluss at 2017-10-21 15:16:44
@bluss I am finishing overhauling the growth-strategy for vectors (#29931, #27627, I'd like to find a mentor to discuss it before submitting a PR) but I'd like to improve on this afterwards.
My proposal would be to amend those guarantees as follows:
Most fundamentally, Vec<T> where T is not zero-sized is and always will be a (pointer, capacity, length) triplet. No more, no less. The order of these fields is completely unspecified, and you should use the appropriate methods to modify these. The pointer will never be null, so this type is null-pointer-optimized. When T is zero-sized the layout of Vec<T> is unspecified.
And then to replace the cap field of RawVec with a zero-sized type when T is zero-sized.
gnzlbg at 2017-10-21 15:23:16
FWIW I think that the standard library should make more guarantees than it currently does, in particular with respect to the space and time complexity of operations.
Yet I think this is an example of something that the standard library should never guarantee. The wording kills me:
Vec is and always will be a (pointer, capacity, length) triplet. No more, no less. The order of these fields is completely unspecified
What is this guarantee for? It sounds like I can rely on this for serialization purposes, but then the Vec type has different sizes on 32-bit and 64-bit architectures. This wording also prevents other equally-sized implementations, e.g., a triple of pointers, that might optimize better on a different backend, and the current topic, optimizations for zero-sized types.
Changing this wording would be a breaking change, but if we can't change this, it should become an example of the kind of guarantee that standard library types should not be making.
gnzlbg at 2017-10-21 15:38:18
Cool. Does it include using usable_size as well? (Don't worry if not, I think that's fine to address as a separate little project too.)
I think the guarantees are definitely overreaching there, but since they are not well defined enough to actually be used for much of anything, maybe the whole triplet thing can be reworded — the Vec has a representation equivalent to pointer, capacity and length, it's just that the capacity doesn't need to be stored for some element types.
By the way, do RawVec and Vec need to store a pointer field for ZST?
For questions, I'll certainly answer anything I can if you find me here or on IRC. Then I suppose @Gankro can as well, if he has time.
I haven't heard about struct field specialization, and would love to read more about that.
bluss at 2017-10-21 15:44:06
I agree it could also forgo the pointer. All a ZST Vec should need to store is a length. I guess it only depends on whether from_raw_parts followed by as_ptr needs to preserve arbitrary pointer values.
Josh Stone at 2017-10-21 18:21:02
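To make the "only a length" idea concrete, here is a minimal sketch of a vector restricted to zero-sized element types. ZstVec and all of its methods are hypothetical names invented for this illustration; this is not how Vec or RawVec are implemented, and it sidesteps the from_raw_parts / as_ptr question entirely.

```rust
use std::marker::PhantomData;
use std::mem;
use std::ptr::{self, NonNull};

/// Hypothetical vector that only supports zero-sized element types,
/// so the length is the only state it has to store.
struct ZstVec<T> {
    len: usize,
    _marker: PhantomData<T>,
}

impl<T> ZstVec<T> {
    fn new() -> Self {
        assert_eq!(mem::size_of::<T>(), 0, "ZstVec only supports zero-sized types");
        ZstVec { len: 0, _marker: PhantomData }
    }

    fn len(&self) -> usize {
        self.len
    }

    fn push(&mut self, value: T) {
        self.len = self.len.checked_add(1).expect("length overflow");
        // Take ownership without running the destructor; the value carries no bytes.
        mem::forget(value);
    }

    fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: `T` is zero-sized (checked in `new`), so reading through a
        // non-null, aligned pointer is valid, and every forgotten value is
        // re-created exactly once.
        Some(unsafe { ptr::read(NonNull::<T>::dangling().as_ptr()) })
    }
}

impl<T> Drop for ZstVec<T> {
    fn drop(&mut self) {
        // Run the destructor of every element that is still logically stored.
        while self.pop().is_some() {}
    }
}

fn main() {
    let mut v = ZstVec::<()>::new();
    v.push(());
    v.push(());
    assert_eq!(v.len(), 2);
    assert_eq!(v.pop(), Some(()));
    // One word instead of three.
    assert_eq!(mem::size_of::<ZstVec<()>>(), mem::size_of::<usize>());
}
```

Folding this into Vec<T> itself is the hard part, since the representation would then have to depend on T.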
@cuviper I am talking about RawVec here, but I agree, Vec only needs to store a length, RawVec doesn't need to store anything.
gnzlbg at 2017-10-24 15:29:00
We can't implement this until some kind of specialization allows it in a backwards-compatible way. Aside from doubts about associated type specialization, there is the problem of keeping Vec covariant in T, which is not the case with associated type specialization now, AFAIU.
bluss at 2017-10-24 15:50:12
@bluss can you elaborate on how this kind of specialization would change the variance of Vec? AFAIK the only way in which a user of Vec<T> can tell whether this specialization is applied is by observing mem::size_of::<Vec<T>>(). We would need some "hacks" to make iteration on these vectors work, but that should be doable (e.g. using a static variable and returning references to that).
gnzlbg at 2017-10-25 15:15:36
@gnzlbg Yeah, I was meaning to come back to you with an example.
Here's an example showing that code which compiles for the current Vec<T> does not compile if we instead use an associated type (that depends on T) in the representation.
<details>

    struct CurrentVec<T> {
        elements: Box<T>,
    }

    struct SmartRepresentationVec<T> {
        elements: <T as Smart>::Representation,
    }

    trait Smart {
        type Representation;
    }

    impl<T> Smart for T {
        type Representation = Box<T>;
    }

    // This compiles!
    fn change_lifetime1<'a, T>(input: CurrentVec<&'static T>) -> CurrentVec<&'a T> {
        input
    }

    // This does not! The SmartRepresentationVec is invariant in its parameter.
    fn change_lifetime2<'a, T>(
        input: SmartRepresentationVec<&'static T>,
    ) -> SmartRepresentationVec<&'a T> {
        input
    }

</details>
bluss at 2017-10-27 17:49:28
@bluss why is that?
gnzlbg at 2017-10-28 14:56:55
I don't know exactly why the rule is like this.
It looks like niko explains the rule here: https://github.com/rust-lang/rust/issues/21726#issuecomment-71949910 by showing how the old ~~"covariant"~~ rule (for us here, "covariant in T" means "be like Vec<T>") was unsound in some situations.
bluss at 2017-10-28 15:11:19
I would like to see an implementation of this in a PR.
My reading of the Guarantees section is that the "no more, no less" claim is just a generalization of the claims that follow it, not a further claim of its own. It just means to say that Vec can't be a rope, can't do small-vector optimization, must support efficient conversion to Box<[T]> when len == capacity, and so on for the other claims that follow. I agree with https://github.com/rust-lang/rust/issues/45431#issuecomment-338411384 that the "no more, no less" claim is not actionable by itself and so there is no problem rewording it.
David Tolnay at 2017-11-18 22:09:26
Notice that in https://github.com/rust-lang/unsafe-code-guidelines/issues/224 a different extension is discussed. That is, adding a mem::CompressedNonNull<T> type that's a ZST if T is a ZST, in which case the address is always just mem::align_of::<T>().
With both of these optimizations, Vec<T> would "semantically" still have 3 fields (ptr, size, cap), but ptr and cap would be ZSTs when T is a ZST, making Vec<ZST> 1-word wide.
A way to provide wording that guarantees that these optimizations will never happen would be to, e.g., guarantee that size_of::<Vec<T>>() == size_of::<usize>() * 3 for all Ts. The current wording is a bit ambiguous on what's allowed or not.
gnzlbg at 2020-01-10 12:32:59
Is there a tangible use case for making Vec smaller for some types?
bluss at 2020-04-13 14:34:16
As noted in #117763, fixing this could allow giving Cow<[T]> a more optimal layout for T of all sizes. Special-casing the type of the capacity field when T is a ZST would allow capping it to isize::MAX in all other cases. This would create a new niche, which Cow's layout could exploit to good effect.
@rustbot label A-layout A-specialization I-heavy I-slow
Jules Bertholet at 2023-11-09 19:52:12
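For readers unfamiliar with niches, a rough illustration follows. It only shows the niche Vec already exposes through its non-null pointer, which lets Option<Vec<T>> avoid a separate tag; it does not implement the capacity niche proposed above. The Cow<[u8]> size is printed rather than asserted because it depends on the compiler version.

```rust
use std::borrow::Cow;
use std::mem::size_of;

fn main() {
    // The pointer inside `Vec` is never null, so `None` can be encoded in
    // that otherwise unused value and no extra discriminant word is needed.
    assert_eq!(size_of::<Option<Vec<u8>>>(), size_of::<Vec<u8>>());

    // Whether `Cow<[u8]>` can also avoid a separate tag depends on which
    // niches `Vec` exposes, so the result is only printed here.
    println!("size_of::<Vec<u8>>()   = {}", size_of::<Vec<u8>>());
    println!("size_of::<&[u8]>()     = {}", size_of::<&[u8]>());
    println!("size_of::<Cow<[u8]>>() = {}", size_of::<Cow<'static, [u8]>>());
}
```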