vtable addresses differ based on the number of codegen units

dcab25c
Opened by PeterHatch at 2024-12-21 05:01:43

I'm seeing the ptr::eq function from the standard library sometimes return false when it should return true. This is happening in debug mode in nightly and beta, but not stable, and not in any version in release mode.

Here's the simplest I could get a reproduction of the issue:

use std::ptr;
use container::Container;

trait Trait {}
struct Struct {}
impl Trait for Struct {}

mod container {
    use Trait;

    pub(crate) struct Container<'a> {
        pub(crate) item: &'a Trait,
    }
}

impl<'a> Container<'a> {
    fn new(item: &'a Struct) -> Container<'a> {
        Container { item }
    }
}

fn main() {
    let item = Struct {};
    let container = Container::new(&item);

    assert!(ptr::eq(container.item, &item));
}

Expected the assertion to pass, and it fails on beta and nightly, in debug mode only.

Meta

rustc +beta --version --verbose: rustc 1.22.0-beta.3 (cc6ed0640 2017-11-13) binary: rustc commit-hash: cc6ed0640fbcd2dff95b4532fd12aa0d6c545f28 commit-date: 2017-11-13 host: x86_64-pc-windows-msvc release: 1.22.0-beta.3 LLVM version: 4.0

rustc +nightly --version --verbose: rustc 1.23.0-nightly (5041b3bb3 2017-11-19) binary: rustc commit-hash: 5041b3bb3d953a14f32b15d1e41341c629acae12 commit-date: 2017-11-19 host: x86_64-pc-windows-msvc release: 1.23.0-nightly LLVM version: 4.0

  1. The vtable pointers are different (try transmuting container.item and &item as &Trait to (usize, usize)). And the problem disappears with -C codegen-units=1 (the default is 16 in debug mode since beta). This points to the issue being something about deduplicating vtables between CGUs?

    Alex Burka at 2017-11-20 23:23:16

  2. I ran into this a while ago, and @eddyb said that we never make any guarantees around uniqueness of vtables: https://github.com/rust-lang/rust/pull/30485#issuecomment-166521737.

    Steven Fackler at 2017-11-20 23:37:11

  3. @durka is there anything I can clarify?

    Steven Fackler at 2017-11-21 03:11:14

  4. Something I'd like clarified - this is considered a bug right? It seems to me that If we never make any guarantees around uniqueness of vtables, then ptr::eq should not be comparing vtables. Is that how other people are seeing it?

    PeterHatch at 2017-11-21 04:07:55

  5. Not necessarily - imagine you had a

    #[derive(Debug)]
    #[repr(C)]
    struct Foo {
        bar: u32,
    }
    
    let foo = Foo { bar: 0 };
    
    let a: &Debug = &foo;
    let b: &Debug = &foo.bar;
    

    The data pointers of a and b are equal, but they probably shouldn't compare equal.

    Steven Fackler at 2017-11-21 04:33:36

  6. It seems pretty weird that the same trait gets different vtables.

    Alex Burka at 2017-11-21 05:13:50

  7. Okay, but to clarify again - this is a discussion of how to fix the issue? Not whether there is an issue? Because I'm appreciating the discussion of what caused it, and potential issues with simple ways to fix it, but I'm still not completely sure if people agree that it is an issue that needs fixing?

    To discuss the potential fix more - shouldn't they compare equal? From the documentation of ptr::eq ( https://doc.rust-lang.org/std/ptr/fn.eq.html) I expected them to compare equal. (Maybe that documentation should change?) I mean, it certainly could be surprising (especially when using the == operator directly, rather than the function), but that's kind of a weird situation, and I'm not even sure what you'd want in that case? It seems like either way could be more convenient some of the time.

    Regardless, it seems important that, if there's going to be no guarantees about vtables, that doesn't also mean that there is no guarantee the comparing pointers to trait objects will ever return true. As far as I can tell, it's just not useful at that point?

    PeterHatch at 2017-11-21 05:23:26

  8. @durka Not all that weird - we could defer vtable instantiation until the final link, but then you'd have to do a bunch of rewriting of all of the rlibs you're linking to rather than just having each compilation unit create the vtables it needs.

    @PeterHatch The documentation should definitely note the behavior around trait objects, yeah.

    Steven Fackler at 2017-11-21 05:25:07

  9. Yeah, if the behavior of comparing pointers to trait objects depends on random details of the compilation environment like CGUs, LTO, etc, it seems like a footgun for the operator to have T: ?Sized at all. But since that ship has sailed, perhaps some loud documentation warnings and a clippy lint are in order.

    Alex Burka at 2017-11-21 05:38:15

  10. [T] is the other unsized case and comparison does behave reasonably there, so it's not a total lost cause.

    Steven Fackler at 2017-11-21 05:50:05

  11. @arielb1 states:

    If you have different codegen units (e.g. different modules) they can have different vtables. I think that's a bug.

    Jake Goulding at 2017-11-25 22:38:48

  12. Also note that you can throw away part of the fat pointer if all you care about is the data portion:

    let a = container.item as *const _ as *const ();
    let b = &item as *const _ as *const ();
    assert!(ptr::eq(a, b));
    

    Jake Goulding at 2017-11-25 22:44:41

  13. Why would it be a bug? Maybe it's suboptimal? But we make no guarantees.

    Eduard-Mihai Burtescu at 2017-11-25 23:58:44

  14. It's worth giving a warning in ptr::eq about comparing *const Trait fat pointers.

    I suppose Rc::ptr_eq or Arc::ptr_eq compares *mut RcBox<T>s or *mut ArcInner<T>s so the vtable pointer is there. Is this vtable pointer necessarily equal according to the semantics? I suppose no because it comes from outside so these need the same warning.

    In other words, I can cast an Rc<Struct> to an Rc<Trait> which unsizes by attaching a vtable pointer outside the Rc<..>, right? If I do this in different compilation units, then this vtable pointer will be different, but the *mut RcBox<T> shall remain the same, right?

    Should maybe Rc::ptr_eq and Arc::ptr_eq use this as *const () trick?

        pub fn ptr_eq(this: &Self, other: &Self) -> bool {
            this.ptr.as_ptr() as *const () == other.ptr.as_ptr() as *const ()
        }
    

    Jeff Burdges at 2018-08-27 03:02:22

  15. Some points raised in duplicates:

    • Debug-printing the wide ptr ignores the vtable, so this leads to rather confusing assertion messages (the pointers compare unequal but print equal).
    • One proposal was to compare the vtable content instead of the vtable address.

    Also @Diggsey raises this concern

    fn identity(x: Box<dyn Trait>) -> Box<dyn Trait> { x } If the compiler is able to prove that the function is only called with Box<Foo>, then the compiler would be free to return TraitRepr { ptr: x.ptr(), vtable: MyFooVtable }, even if it was passed a different vtable.

    I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.

    Ralf Jung at 2020-03-07 08:20:14

  16. I don't think that is true. The behavior of the function is to return a copy of the argument, and I don't think we permit the mere copying of a wide ptr to change the vtable to a different but equivalent one. I don't see why the optimizer would do that, and I don't see why we would permit it in the spec.

    Devirtualization is the optimisation I was thinking of. I imagine we would eventually like to be able to compile code operating on a dyn Trait object to monomorphised code when the compiler can figure out the concrete type. At the boundary between the monomorphised code and the "trait object" code, we would have to reconstruct the vtable pointer, and this would come from the current codegen unit.

    Diggory Blake at 2020-03-07 14:07:14

  17. Assuming that the underlying problem is not easy to fix quickly, we should at least work around it in the standard library. I opened #80505 for Rc::ptr_eq, Arc::ptr_eq, Weak::ptr_eq.

    Anders Kaseorg at 2020-12-30 11:03:13

  18. I think this might be a good usecase for having TypeId of non-'static types.

    My idea is that all vtables would have the TypeId of the (possibly non-'static) object, and that's what would be compared instead of the vtable address.

    rodrimati1992 at 2021-03-14 08:24:14

  19. I didn't see it mentioned yet (here or in the duplicates), but the ability to compare wide pointers is due to an RFC. By my reading, that RFC intends vtable pointers to be equal if and only if their types are equal; e.g. quoting the RFC:

    Implement PartialEq/Eq for fat raw pointers, defined as comparing both the unsize-info and the address. This means that these are true:

        &s as &fmt::Debug as *const _ == &s as &fmt::Debug as *const _ // of course
    

    And from the PR:

    I definitely think vtables should be part of the comparison. Two examples:

    1. A pointer to a struct and a pointer to its field field should not be equal. :)
    
    2. In general, if the vtable is unequal, it means some types are unequal, so even if it is the same memory, it must be being interpreted differently, or else a subset of that memory (as in case 1). The behavior will not be the same when you use the object, so why should it be equal? If you had static type information (a thin pointer), you couldn't even compare the two pointers, as you'd get a type error. So overall, doesn't make sense to me.
    

    (We discussed this in the lang team meeting, these points were raised there.)

    The ability landed in September 2015 (Rust 1.4), but the behavior was noted in December 2015, so this is probably an incomplete implementation as opposed to a stable-to-stable regression. (But I didn't actually do the leg-work to confirm that.)

    QuineDot at 2021-10-31 05:55:58

  20. Can we solve this by linking the vtables with LLVM linkonce (or linkonce_odr) linkage?

    As noted by @burdges in https://github.com/rust-lang/rust/pull/80505#issuecomment-753449698:

    If I understand, all these repeated vtables waste considerable space, not because vtable take much space, but because they imply redundent monomorpized code, yes? If so, then merging vtables should permit the linker to strip all this redundent code too.

    If vtables cannot be inlined then linkonce should not have any downsides, right?

    Anders Kaseorg at 2022-10-26 13:56:54

  21. linkonce will block optimizations, while linkonce_odr is not allowed AFAICT as the different vtables contains reference different copies of the same methods in case of generics I believe and are thus not identical. Additionally I'm not sure if linkonce is supported everywhere. And finally linkonce/linkonce_odr doesn't help with dylibs. Those will still get their own copies.

    bjorn3 at 2022-10-26 14:37:07

  22. The linkonce_odr contract is that merged globals are “equivalent”, not identical. As I understand it, different copies of the same methods should be considered equivalent for this purpose. Clang uses linkonce_odr for C++ vtables (at least when all method definitions are inside the class), which have similar considerations.

    Even if this solution is partial, is it controversial to suggest that partial solutions might be better than nothing if a full solution does not seem to be forthcoming? If nothing else, we’d generate better code in the supported cases.

    Anders Kaseorg at 2022-10-26 14:57:03

  23. As I understand it, different copies of the same methods should be considered equivalent for this purpose.

    Even if they are optimized differently?

    Even if this solution is partial, is it controversial to suggest that partial solutions might be better than nothing if a full solution does not seem to be forthcoming? If nothing else, we’d generate better code in the supported cases.

    A partial solution makes people think it works without actually working IMO.

    bjorn3 at 2022-10-26 16:59:17

  24. Even if they are optimized differently?

    Yes. The counterexample given in the documentation is “two functions with different semantics”, not two functions with the same semantics that are optimized differently.

    A partial solution makes people think it works without actually working IMO.

    This is a nondeterministic bug that depends on the way code is split between codegen units—people who would be inclined to think it works without actually working, would already think that.

    Anders Kaseorg at 2022-10-26 17:04:30

  25. cg_clif actually deterministically generates a new vtable for every function and static using it. It also doesn't have the comdat support necessary to support linkonce.

    bjorn3 at 2022-10-26 17:17:06

  26. So how viable is this as a way to avoid false negatives when comparing *const dyn Traits for equality?

    Store a TypeId as an entry in vtables, then instead of comparing vtable addresses, compare the TypeId. AFAICT, that would solve the false negative comparison problem, the downside would be that it'd increase the size of all vtables.

    note: this would entirely be an implementation detail, the TypeId entry could be removed in the future if vtables are guaranteed unique.

    rodrimati1992 at 2022-10-30 03:34:50

  27. TypeId only works for 'static types, though.

    Ralf Jung at 2022-10-30 09:26:29

  28. The internal intrinsic works for all types and vtables are the same independent of lifetimes anyway.

    bjorn3 at 2022-10-30 11:04:06

  29. A partial solution reduces binary size and cache pressure. What optimizations does linkonce block?

    Yes, dynlibs have their own copies, but maybe even these could be deduplicated somewhat.

    Another question: Could/should the current orphan rules be further embraced as a mechanism for deduplicating vtable methods?

    Jeff Burdges at 2022-10-30 13:57:18