Avoid shipping duplicate artifacts in the host and target sysroot
All released compilers have identical dynamic libraries in two locations. The locations on Linux are:
- $sysroot/lib/*.dylib
- $sysroot/lib/rustlib/$target/lib/*.dylib
All of these artifacts are byte-for-byte equivalent (they're just copies of one another). These duplicate artifacts inflate our installed size, inflate downloads, and cause weird bugs like https://github.com/rust-lang/rust/issues/39870. Although https://github.com/rust-lang/rust/issues/39870 is itself fixed, the fix is just a hack for now; it would ideally be solved by fixing this issue!
Some possible thoughts I personally have on this are:
- Symlinks won't work because they don't work on Windows
- Hard links may work here, but I'm not sure. This'd require a lot of updates to lots of tools (rust-installer, rustup, etc)
- Simply not shipping one of these is going to be very difficult. $sysroot/lib is required for rustc itself to run correctly (that dir is typically in LD_LIBRARY_PATH or the equivalent) and $sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.
- The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir. That way something like $sysroot/lib/rustlib/$target/lib/libstd.dylib would exist but essentially be an empty file (not a valid dynamic library). rustc would instead look at $sysroot/lib/libstd.dylib for that file.
Unsure if I'm on the right track there, but hopefully can get discussion around this moving!
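To make the pseudo-symlink idea concrete, here is a minimal sketch of how the lookup could work. It is only an illustration under stated assumptions: the helper name resolve_pseudo_symlink, the exact-match rule for the marker, and the example-target directory are all hypothetical, and nothing here reflects how rustc actually searches for libraries.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Literal contents of a pseudo-symlink file, as proposed in the text above.
const MARKER: &[u8] = b"rustc-look-in-your-libdir";

/// Hypothetical helper (not a rustc API): given a candidate library in the
/// target libdir, return the path that should really be loaded. If the
/// candidate holds exactly the marker, redirect to the same file name in
/// the host libdir; otherwise the candidate is a real library, so keep it.
fn resolve_pseudo_symlink(candidate: &Path, host_libdir: &Path) -> io::Result<PathBuf> {
    // A real dynamic library is far larger than the marker, so compare the
    // size first and avoid reading whole libraries into memory.
    if fs::metadata(candidate)?.len() != MARKER.len() as u64 {
        return Ok(candidate.to_path_buf());
    }
    if fs::read(candidate)?.as_slice() != MARKER {
        return Ok(candidate.to_path_buf());
    }
    match candidate.file_name() {
        Some(name) => Ok(host_libdir.join(name)),
        None => Ok(candidate.to_path_buf()),
    }
}

fn main() -> io::Result<()> {
    // Build a throwaway sysroot layout under the system temp directory.
    // "example-target" stands in for a real target triple.
    let sysroot = std::env::temp_dir().join(format!("pseudo-symlink-demo-{}", std::process::id()));
    let host_libdir = sysroot.join("lib");
    let target_libdir = host_libdir.join("rustlib").join("example-target").join("lib");
    fs::create_dir_all(&target_libdir)?;

    // The host copy is the "real" library; the target copy is just the marker.
    fs::write(host_libdir.join("libstd.dylib"), b"stand-in bytes for a real dynamic library")?;
    fs::write(target_libdir.join("libstd.dylib"), MARKER)?;

    let resolved = resolve_pseudo_symlink(&target_libdir.join("libstd.dylib"), &host_libdir)?;
    println!("would load: {}", resolved.display());

    fs::remove_dir_all(&sysroot)?;
    Ok(())
}
```

Note that this only covers lookups done by the compiler itself; anything else that opens the target-libdir file directly would see the marker rather than a library.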
Can you clarify:
$sysroot/lib/rustlib/$target/lib is where the compiler looks for target libraries. The compiler can't look in $sysroot/lib for libs as that's typically got a ton of libs on Unix systems.
What kinds of Bad Stuff (tm) would happen if there were "a ton of libs" in the place the compiler looks for target libraries?
Alex Burka at 2017-06-13 22:15:14
Oh sure yeah, I'm basically thinking of https://github.com/rust-lang/rust/issues/20342, which is the direct consequence of looking in all of $sysroot/lib for libs.
Alex Crichton at 2017-06-13 22:18:48
inflate downloads
I'm not sure about that. Afaik we order files by name inside download folders to avoid precisely that.
est31 at 2017-06-13 23:27:46
@est31 The $sysroot/lib/*.dylib libraries are in a different component than the $sysroot/lib/rustlib/$target/lib/*.dylib libraries. Because they're in different components, compression can't eliminate the redundancy.
Peter Atashian at 2017-06-13 23:38:53
The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir
This seems unlikely to work with the dynamic linker in cases where rpath is disabled or unavailable. I believe the main reason for the historical redundancy here is so that the dylibs rustc needs are literally located in /usr/local/lib.
Brian Anderson at 2017-06-15 00:54:55
I favor a hardlink solution, but teaching rust-installer/rustup how to do that across components is pretty hairy.
A (relatively) simple solution would be to leave the components as they are, but have rustup deduplicate them with hardlinks at install time. If that were combined with a proposed optimization to have rustup use the combined package when possible, the effect would be that downloads were deduplicated (via compression), and disk space was deduplicated (via hardlinks).
Brian Anderson at 2017-06-15 00:58:28
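As a rough illustration of the install-time deduplication suggested above, the sketch below replaces byte-identical target-libdir files with hard links to the host-libdir copies. The function name dedupe_with_hard_links, the .dedupe-tmp temporary name, and the directory layout in main are hypothetical; this is not how rustup or rust-installer work today, and it assumes both directories live on the same filesystem.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical install-time step: for every regular file in `host_libdir`
/// that has a byte-identical file of the same name in `target_libdir`,
/// replace the target copy with a hard link to the host copy.
/// Returns the total size in bytes of the target files that were replaced.
fn dedupe_with_hard_links(host_libdir: &Path, target_libdir: &Path) -> io::Result<u64> {
    let mut replaced = 0;
    for entry in fs::read_dir(host_libdir)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue; // skip directories (such as rustlib) and symlinks
        }
        let host_path = entry.path();
        let target_path = target_libdir.join(entry.file_name());

        // Only touch regular files that already exist on the target side.
        let target_meta = match fs::symlink_metadata(&target_path) {
            Ok(meta) if meta.is_file() => meta,
            _ => continue,
        };
        // Cheap size check first, then a full byte-for-byte comparison.
        if target_meta.len() != entry.metadata()?.len() {
            continue;
        }
        if fs::read(&host_path)? != fs::read(&target_path)? {
            continue;
        }

        // Create the link under a temporary name before removing the copy,
        // so a failed hard_link call never leaves the target missing.
        let tmp = target_path.with_extension("dedupe-tmp");
        fs::hard_link(&host_path, &tmp)?;
        fs::remove_file(&target_path)?;
        fs::rename(&tmp, &target_path)?;
        replaced += target_meta.len();
    }
    Ok(replaced)
}

fn main() -> io::Result<()> {
    // Throwaway layout; "example-target" stands in for a real target triple.
    let sysroot = std::env::temp_dir().join(format!("hardlink-dedupe-demo-{}", std::process::id()));
    let host_libdir = sysroot.join("lib");
    let target_libdir = host_libdir.join("rustlib").join("example-target").join("lib");
    fs::create_dir_all(&target_libdir)?;
    fs::write(host_libdir.join("libstd.so"), b"identical bytes")?;
    fs::write(target_libdir.join("libstd.so"), b"identical bytes")?;

    let replaced = dedupe_with_hard_links(&host_libdir, &target_libdir)?;
    println!("replaced {} bytes of duplicate files with hard links", replaced);

    fs::remove_dir_all(&sysroot)?;
    Ok(())
}
```

This only addresses disk usage; as noted above, download size would still need the combined package (or something similar) so that compression can see both copies.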
The most plausible solution in my mind is to create our own pseudo-symlink file format. When assembling a sysroot this is what rustbuild itself would emit (instead of copying files) but it'd basically be a file with the literal contents rustc-look-in-your-libdir
Doing this in the opposite direction seems like it would work (pseudo-symlinks in libdir), but then you lose the consistency of having all the real libs in libdir, and seemingly the installer would have to be responsible for setting that up.
Brian Anderson at 2017-06-15 01:05:40
Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.
Upon further reflection though I do agree that this seems like a rustup problem sort of. We still want to produce a rust-std package with all of the libraries in it, not a bunch of "pseudo symlink" pointers which point to nonexistent libraries. We basically want rustup toolchains and make install installed-toolchains to have this "symlink behavior" but everything else should stay as-is today.
Alex Crichton at 2017-06-15 02:18:19
FWIW, in Fedora packaging I do replace the rustlib libraries with actual symlinks to the libdir. I suppose it wouldn't hurt if those were "pseudo" symlinks, but I want to be careful about that redirection. Namely, I've got /usr/lib/rustlib/$target/lib/ so all targets share a common /usr/lib/rustlib/, and then 64-bit rustc will get its libraries from /usr/lib64 because that's how Fedora arranges things.
(I've kind of hacked that in place after ./x.py install since rustbuild et al. don't allow separating the libdir and rustlibdir paths, but maybe they should.)
Josh Stone at 2017-06-20 00:04:45
We basically want rustup toolchains and make install installed-toolchains to have this "symlink behavior" but everything else should stay as-is today.
A suggestion: use symlinks in make install on Unix systems that support them and punt on a Windows solution for now. It seems the complaints about double-packaging and related issues are currently exclusively from Unix packagers so I think you could get away with just addressing it there for now.
Mike McQuaid at 2017-06-20 13:50:54
Windows users would still like to avoid having to download those libraries twice. It's not really critical or anything, just something that would be helpful in the future when someone gets around to it maybe.
Peter Atashian at 2017-06-20 14:34:42
Windows users would still like to avoid having to download those libraries twice.
Yep but they have to do that today. I'm not sure it's worth avoiding a straightforward solution to the problem for Unix systems because there's no obvious solution for Windows.
Mike McQuaid at 2017-06-21 06:59:25
Oh sorry yeah I was thinking that $sysroot/lib/rustlib/$target/lib/*.dylib would be a "pseudo symlink" to the versions in $sysroot/lib, that way we wouldn't mess with the libraries that rustc itself needs to execute.
Wait a second - (host) rustc itself links against libraries in (target) sysroot ? Seriously ?!
Enrico Weigelt at 2017-07-12 18:18:16
Are symlinks or hard links not an option just due to Windows support? If so, perhaps it is worth pointing out that Windows 10 supports symbolic and hard links without the privilege escalation that was necessary in Vista, 7, and 8 (and XP if you include "junctions").
Forgive me if I'm stating the obvious. I do not really understand this issue. I just would not want to see an easy solution overlooked due to unfamiliarity with Windows' current capabilities.
https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/#RRmytWmTlOwHQ8YZ.97
Jon Wolski at 2017-09-05 02:00:42
Triage: I don't think that anything has changed here, but I'm not sure.
Steve Klabnik at 2019-03-31 13:56:55
@Mark-Simulacrum did either of us open an issue about the libLLVM-*.so duplication?
Eduard-Mihai Burtescu at 2020-04-05 19:38:03
Not to my knowledge, no. It would probably be good to do so.
Mark Rousskov at 2020-04-05 19:39:48
Filed #70838.
Eduard-Mihai Burtescu at 2020-04-06 12:15:35
This may be fixed now that #70838 is closed
Chris de Claverie at 2020-10-14 18:27:57
Triage: the only duplicate artifacts I see now are libstd.so and libtest.so, which sounds like it's a lot less than before (12 MB between them). But those two are still duplicated.
(posting for future reference: it turns out libtest.so is shipped in the host sysroot so that rustdoc can compile doctests)
jyn at 2022-06-27 04:38:08