GDB pretty printers incorrectly print Path slices

Opened by Alex Crichton at 2023-04-05 17:45:52

Given this source file:

use std::path::Path;

fn main() {
    let a = Path::new("a/b");
    let b = a.parent().unwrap();
    let c = "a/b";
    let d = &c[..1];

    let e = 3;
}

When compiled and run through rust-gdb it yields:

$ rustc -g foo.rs
$ cat cmds
break foo.rs:9
run
print *a
print *b
print c
print d
quit
$ rust-gdb ./foo -x cmds
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./foo...done.
Breakpoint 1 at 0x7c8c: file /home/alex/code/cargo/foo.rs, line 9.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, foo::main () at /home/alex/code/cargo/foo.rs:9
9	    let e = 3;
$1 = Path = {inner = OsStr = {inner = Slice = {inner = 0x555555590c80 <str> "a/b"}}}
$2 = Path = {inner = OsStr = {inner = Slice = {inner = 0x555555590c80 <str> "a/b"}}}
$3 = "a/b"
$4 = "a"
A debugging session is active.

	Inferior 1 [process 31595] will be killed.

Quit anyway? (y or n) [answered Y; input not from terminal]

The variable b was printed as a/b instead of a, although string slices appear to work!
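For comparison, here is a minimal sketch of what the program itself observes for a and b. It assumes the path is valid UTF-8 so that to_str succeeds, and the helper name parent_view is made up for this illustration. Both paths start at the same address (consistent with the identical 0x555555590c80 in the output above) and differ only in length, which is the part the printed output does not reflect.

```rust
use std::path::Path;

/// Returns (same_start_address, full_len, parent_len) for a path and its parent.
/// Hypothetical helper for illustration; assumes the path is valid UTF-8.
fn parent_view(full: &str) -> (bool, usize, usize) {
    let a = Path::new(full);
    let b = a.parent().expect("path has a parent");
    let a_str = a.to_str().expect("valid UTF-8");
    let b_str = b.to_str().expect("valid UTF-8");
    (a_str.as_ptr() == b_str.as_ptr(), a_str.len(), b_str.len())
}

fn main() {
    let a = Path::new("a/b");
    let b = a.parent().unwrap();

    // The program sees `b` as the one-component path "a".
    assert_eq!(b, Path::new("a"));

    // Same starting address, different lengths (3 for `a`, 1 for `b`).
    assert_eq!(parent_view("a/b"), (true, 3, 1));
    println!("b = {}", b.display());
}
```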

  1. cc @michaelwoerister

    Alex Crichton at 2017-07-19 15:18:00

  2. For me with rustc 1.20, I don't see the strings at all:

    (gdb) p *a
    $1 = Path = {
      inner = OsStr = {
        inner = Slice = {
          inner = 0x1000448c0 <str>
        }
      }
    }
    (gdb) p *b
    $2 = Path = {
      inner = OsStr = {
        inner = Slice = {
          inner = 0x1000448c0 <str>
        }
      }
    }
    

    Tom Tromey at 2017-10-02 17:02:29

  3. > inner = OsStr = {

    debugger_pretty_printers_common.py looks for OsString:

            # OS STRING
            if (unqualified_type_name == "OsString" and
                self.__conforms_to_field_layout(OS_STRING_FIELD_NAMES)):
                return TYPE_KIND_OS_STRING
    

    So perhaps that difference is the problem.

    Tom Tromey at 2017-10-03 22:05:37

  4. @tromey Good find! &OsStr is the slice version of OsString (like &str is for String). A pretty printer for &OsStr should look pretty similar to the one for &str.

    Michael Woerister at 2017-10-04 08:51:29
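As a rough illustration of the OsString / &OsStr relationship described in comment 4 (this sketch is not from the thread, and borrowed_len is a made-up helper name): the owned type lends out a borrowed slice, and a reference to that slice has the same pointer-plus-length size as &str.

```rust
use std::ffi::{OsStr, OsString};
use std::mem::size_of;

/// Borrow an owned OS string as a slice and report the slice's length in bytes.
/// Hypothetical helper for illustration.
fn borrowed_len(owned: &OsString) -> usize {
    let slice: &OsStr = owned.as_os_str();
    slice.len()
}

fn main() {
    let owned = OsString::from("a/b");
    assert_eq!(borrowed_len(&owned), 3);

    // `&OsStr` and `&str` are both two-word (pointer plus length) references.
    assert_eq!(size_of::<&OsStr>(), size_of::<&str>());
    assert_eq!(size_of::<&OsStr>(), 2 * size_of::<usize>());
}
```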

  5. For Unix, OsStr is defined here. It's just a [u8]; when I look with gdb I see it is nul-terminated. So I wonder if both OsString and OsStr should be printed with length=-1 rather than trying to extract the length.

    Tom Tromey at 2017-10-04 13:38:45

  6. > So I wonder if both OsString and OsStr should be printed with length=-1 rather than trying to extract the length.

    No, that's wrong.

    Tom Tromey at 2017-10-04 15:00:48
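The thread does not spell out why the length=-1 idea is wrong. One possible reason, offered here as an assumption rather than as the commenter's reasoning, is that a sub-slice such as b ends in the middle of a's buffer, so the byte right after it is not a terminator. The sketch below checks this from the Rust side; byte_after_parent is a made-up helper and it assumes UTF-8 input.

```rust
use std::path::Path;

/// Returns the byte that follows the parent path inside the original buffer.
/// Hypothetical helper for illustration; assumes the path is valid UTF-8.
fn byte_after_parent(full: &str) -> Option<u8> {
    let parent = Path::new(full).parent()?;
    let parent_len = parent.to_str()?.len();
    // Index into the full string, so no out-of-bounds read is needed.
    full.as_bytes().get(parent_len).copied()
}

fn main() {
    // The parent "a" is followed by '/', not by a nul byte, so reading
    // until a terminator would run past the end of the slice.
    assert_eq!(byte_after_parent("a/b"), Some(b'/'));
}
```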

  7. I don't really understand what is going on in this code, I think. I don't know how the runtime finds the length of the [u8], though it clearly does.

    One guess is that a is actually a wider pointer. And that's experimentally supported by something like:

    (gdb) p a
    $28 = (std::path::Path *) 0x555555599800 <str>
    (gdb) p &a
    $29 = (std::path::Path **) 0x7fffffffdf10
    (gdb) p /x 0x7fffffffdf10 + 8
    $30 = 0x7fffffffdf18
    (gdb) p $ as *mut usize
    $31 = (usize *) 0x7fffffffdf18
    (gdb) p *$
    $32 = 3
    

    However, the type of a is just an ordinary pointer in the DWARF:

     <1><1aff>: Abbrev Number: 10 (DW_TAG_pointer_type)
        <1b00>   DW_AT_type        : <0x1331>
        <1b04>   DW_AT_name        : (indirect string, offset: 0x2d1): &std::path::Path
    

    And Path says it has size 1:

     <3><1331>: Abbrev Number: 8 (DW_TAG_structure_type)
        <1332>   DW_AT_name        : (indirect string, offset: 0x2cc): Path
        <1336>   DW_AT_byte_size   : 1
        <1337>   Unknown AT value: 88: 1
    

    So maybe these are more debuginfo bugs.

    If this is all accurate then print *a is not going to work out, because the extra information in a will be lost by the dereference. But I have a feeling something above is wrong; I just don't know what.

    Tom Tromey at 2017-10-04 21:13:04
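The wide-pointer guess in comment 7 can be checked from the Rust side without a debugger. This is a minimal sketch (path_byte_len is a made-up helper name): a &Path is two machine words, while an ordinary reference is one, and the byte length of the pointee is only known through the reference.

```rust
use std::mem::{size_of, size_of_val};
use std::path::Path;

/// Byte length of the unsized `Path` value behind a reference.
/// Hypothetical helper for illustration.
fn path_byte_len(p: &Path) -> usize {
    size_of_val(p)
}

fn main() {
    // A reference to a sized type is one word; `&Path` is two words.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());
    assert_eq!(size_of::<&Path>(), 2 * size_of::<usize>());

    let a = Path::new("a/b");
    let b = a.parent().unwrap();

    // The length travels with the reference, not with the pointee.
    assert_eq!(path_byte_len(a), 3);
    assert_eq!(path_byte_len(b), 1);
}
```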

  8. So the problem is that Path by itself isn't really a thing you can talk about, since it's a dynamically sized type.

    print *path where path is an &Path is kind of meaningless. Rust never lets you create lvalues that are DSTs; you can only refer to DSTs behind pointer types. In Rust, *path would lead to a compiler error, unless it's in an rvalue that gets rereferenced later.

    The simple solution here is to make &Path print the type, much like &str and &[u8] do (or should be doing). We can also forbid *path.

    However, forbidding *path makes it impossible to inspect custom DSTs themselves. GDB currently does not do autoderef, whereas Rust does. So in Rust the following all work:

    struct CustomDST {
        a: u8,
        b: bool,
        c: str
    }
    
    let x: &CustomDST = make_thing();
    let y = x.a; // autoderefs x, but is ok
    let z = (*x).b; // explicitly derefs x into an rvalue, also ok
    let w = &x.c; // autoderefs x, attempts to access dynamically sized field, but re-references it so we get an `&str`
    

    We'd need to introduce a similar mechanism where we don't allow printing DSTs directly but allow taking references to them.


    Alternatively; given that gdb runs at runtime and all the sizes are known, we can allow print *path to work, and just print the path as if it were of known size -- we know the size because it's at runtime. I'm not sure how to represent this within GDB however.

    Manish Goregaokar at 2017-11-03 20:55:42
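A runnable variant of the CustomDST snippet in comment 8, as a sketch only. Building a struct whose last field is str is not possible in safe code, so this version swaps in a generic last field and relies on unsizing coercion to get a [u8] tail; the names Custom and inspect are made up for the illustration.

```rust
/// A struct whose last field may be unsized, so `Custom<[u8]>` is a custom DST.
/// Hypothetical type standing in for the `CustomDST` of the comment.
struct Custom<T: ?Sized> {
    a: u8,
    b: bool,
    c: T,
}

/// Access the fields of a DST through a reference, as in the comment.
fn inspect(x: &Custom<[u8]>) -> (u8, bool, usize) {
    let y = x.a; // autoderefs x
    let z = (*x).b; // explicit deref, then a sized field is copied out
    let w: &[u8] = &x.c; // unsized field, immediately re-referenced
    (y, z, w.len())
}

fn main() {
    let sized = Custom { a: 1, b: true, c: *b"a/b" };
    // Unsizing coercion: &Custom<[u8; 3]> becomes &Custom<[u8]>.
    let x: &Custom<[u8]> = &sized;
    assert_eq!(inspect(x), (1, true, 3));
}
```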

  9. > print *path where path is an &Path is kind of meaningless. Rust never lets you create lvalues that are DSTs; you can only refer to DSTs behind pointer types. In Rust, *path would lead to a compiler error, unless it's in an rvalue that gets rereferenced later.

    I think this has swapped "lvalue" (aka place) and "rvalue" (aka value). For example, {place} (such as {*ptr}) is a value ("rvalue") expression that forces a move or copy out of the place, while &value (such as &f()) creates a place ("lvalue") to borrow, which *&value can then access.

    Eduard-Mihai Burtescu at 2019-10-18 15:26:07
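The place/value distinction in comment 9 can be sketched in a few lines. The function names below are made up for the illustration; the point is that *s on an unsized str is an acceptable place expression as long as it is only borrowed again, never moved out of.

```rust
fn make() -> String {
    String::from("a/b")
}

/// `*ptr` is a place; wrapping it in a block makes a value expression,
/// which copies the u32 out of that place.
fn copy_out(ptr: &u32) -> u32 {
    let value = { *ptr };
    value
}

/// `&make()` creates a temporary place to hold the returned value,
/// and `*r` then names that place again.
fn borrowed_len() -> usize {
    let r = &make();
    (*r).len()
}

/// `*s` has the unsized type `str`; it is only reborrowed, so this compiles.
/// Writing `{ *s }` instead would be rejected, since a `str` cannot be moved.
fn reborrow(s: &str) -> &str {
    &*s
}

fn main() {
    assert_eq!(copy_out(&5), 5);
    assert_eq!(borrowed_len(), 3);
    assert_eq!(reborrow("a/b"), "a/b");
}
```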

  10. I meant to reply to this ages ago, but was reminded only today by a StackOverflow issue.

    > Alternatively; given that gdb runs at runtime and all the sizes are known, we can allow print *path to work, and just print the path as if it were of known size -- we know the size because it's at runtime. I'm not sure how to represent this within GDB however.

    Normally gdb wants to know what things look like under the hood. It's fine for the Rust compiler to emit a description of this underlying reality, even if it doesn't line up with some user-facing object; one can sprinkle DW_AT_artificial around to make this more clear if need be. For example, we recently changed the Ada compiler to do this for "unconstrained arrays" -- these are really a structure which consists of a pointer to the array data, and a pointer to another structure whose fields describe the bounds. Rust could do something similar: emit an artificial structure that describes the DST "as it is". We did something similar for the vtable / trait object problem.

    Tom Tromey at 2020-05-21 21:21:01
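To make the idea of describing a DST reference "as it is" more concrete, here is a sketch of a plain struct with one field per word of a &Path. It is purely illustrative: PathRefView, view_of and bytes_of are made-up names, this is not a type that rustc or gdb defines, and reading the bytes back relies on Path being a thin wrapper around a byte buffer, which is an implementation detail of the standard library.

```rust
use std::mem::size_of_val;
use std::path::Path;

/// Illustrative only: spells out the two words carried by a `&Path`.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct PathRefView {
    data_ptr: *const u8,
    length: usize,
}

fn view_of(p: &Path) -> PathRefView {
    PathRefView {
        // Casting the wide pointer to a thin one keeps only the address.
        data_ptr: p as *const Path as *const u8,
        // The byte length of the unsized value is known at run time.
        length: size_of_val(p),
    }
}

/// Read back the bytes that the view describes.
fn bytes_of(p: &Path) -> &[u8] {
    let v = view_of(p);
    // SAFETY: both fields were derived from a live `&Path`; this assumes the
    // pointee is `length` initialized bytes (an implementation detail).
    unsafe { std::slice::from_raw_parts(v.data_ptr, v.length) }
}

fn main() {
    let a = Path::new("a/b");
    let b = a.parent().unwrap();

    // Same data pointer, different lengths.
    assert_eq!(view_of(a).data_ptr, view_of(b).data_ptr);
    assert_eq!((view_of(a).length, view_of(b).length), (3, 1));
    assert_eq!(bytes_of(b), &b"a"[..]);
}
```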

  11. Things have changed a bit, probably due to #92718:

    (gdb) info local
    d = "a"
    c = "a/b"
    b = &std::path::Path {
      data_ptr: 0x555555561f73,
      length: 1
    }
    a = &std::path::Path {
      data_ptr: 0x555555561f73,
      length: 3
    }
    

    And, data_ptr does point to the right thing:

    (gdb) x/3xc a.data_ptr
    0x555555561f73:	97 'a'	47 '/'	98 'b'
    

    There's still some weirdness in here though, like:

    (gdb) p sizeof( *a.data_ptr)
    $6 = 0
    

    Tom Tromey at 2022-03-18 03:51:27