Codegen for a large static array uses a lot of memory

2e9b334
Opened by Seo Sanghyeon at 2020-04-22 21:06:50

The following two lines of Rust code can take more than 200 MB of memory to compile.

const L: usize = 1 << 25;
pub static S: [u32; L] = [1; L];

$ rustc --crate-type lib -Z time-passes test.rs | grep translation
time: 0.710; rss: 209MB translation

Reported in a users.rust-lang.org post. See also #23600.

  1. 200MB is possibly slightly excessive... but it's naturally going to take a lot of memory to generate a 30MB object file.

    eefriedman at 2015-11-27 20:30:58

  2. Wait, no, I miscalculated; it's actually a 130MB object file.

    eefriedman at 2015-11-27 20:33:35

  3. This is more of an LLVM issue, as I believe that we have to generate essentially [1, 1, 1, 1, ...., 1] in this case, which means LLVM ends up tracking 2^25 values. I'm actually surprised that it doesn't end up using 2^25 * 8 bytes of memory for 2^25 pointers.

    Note that this isn't an issue for [0; L], as LLVM has a zeroinitializer value that represents "all zero bits" for any type.

    Looking at the issue from users.rust-lang.org, it should be able to use a zeroinitializer, as the value is all-zeros. However, it doesn't for some reason. Investigation suggests that it doesn't for element types larger than a word (so [(u32, f32); L] is fine, [(u64, f32); L] is not). I haven't checked the relevant code yet though.
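    The element-size observation can be illustrated with a small sketch (using a much smaller L than the issue's 1 << 25; whether each static actually gets a zeroinitializer depends on the rustc/LLVM versions discussed above, so treat the comments as reported behavior, not a guarantee):

    ```rust
    // Both statics are all-zero bits. Per the comment above, the (u32, f32)
    // version reportedly lowers to a zeroinitializer, while the (u64, f32)
    // one (element type larger than a word) reportedly does not.
    const L: usize = 1 << 8; // small L for illustration; the issue uses 1 << 25

    pub static A: [(u32, f32); L] = [(0, 0.0); L];
    pub static B: [(u64, f32); L] = [(0, 0.0); L];

    fn main() {
        // Semantically both are identical all-zero arrays; only the
        // emitted LLVM constant representation reportedly differs.
        assert!(A.iter().all(|&(x, y)| x == 0 && y == 0.0));
        assert!(B.iter().all(|&(x, y)| x == 0 && y == 0.0));
        println!("ok");
    }
    ```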

    James Miller at 2015-11-28 03:52:07

  4. It would be nice if LLVM were doing RLE for these kinds of things, although I don't recall C being able to generate anything large other than zeroed arrays (which is what the zeroinitializer is for).

    Eduard-Mihai Burtescu at 2015-12-21 14:16:43

  5. One issue shown by #35154 is that very large arrays can cause ICEs due to us trying to create absurdly-large vectors to hand to LLVM for constants.

    James Miller at 2016-08-01 05:41:14

  6. Triage: time-passes no longer has a step named translation as in the original issue, but as of Rust 1.43 the RSS peaks at 322 MB during the following phases:

    time: 0.212; rss: 322MB codegen_to_LLVM_IR
    time: 0.000; rss: 322MB assert_dep_graph
    time: 0.000; rss: 322MB serialize_dep_graph
    time: 0.214; rss: 322MB codegen_crate
    

    That said, I think this issue could use a better title, because I would intuitively expect a large static array to use at least a linear-ish amount of memory when compiling. What is the specific issue that we should address? (eddyb uses the term "RLE" above, which I have no expansion for.)

    bstrie at 2020-02-20 15:58:17

    @bstrie "trans(lation)" is the old jargon for "codegen"; maybe I should've retroactively replaced all mentions of it to avoid confusion.

    ---

    RLE means "Run-length encoding", one of the common primitives of compression algorithms.

    In this specific case, LLVM could represent constant blocks of data as N repeats of a (smaller) block, which would always allow us to represent [e; N] values (even if we might have to detect it on the fly, like a compression algorithm, or also support it in miri).
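    As a hedged sketch of that on-the-fly detection (not LLVM's actual code; the function name is made up), finding the smallest block whose repetition reproduces a byte buffer looks like this:

    ```rust
    /// Return the length of the smallest block whose repetition produces
    /// `data`. Worst case is quadratic; fine for a sketch.
    fn smallest_period(data: &[u8]) -> usize {
        let n = data.len();
        for p in 1..=n {
            // Candidate period p must divide n, and every p-sized chunk
            // must equal the first one.
            if n % p == 0 && data.chunks(p).all(|c| c == &data[..p]) {
                return p;
            }
        }
        n
    }

    fn main() {
        // [1u32; 4] as little-endian bytes: "01 00 00 00" repeated 4 times.
        let bytes = [1u8, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0];
        assert_eq!(smallest_period(&bytes), 4);
        println!("period = {}", smallest_period(&bytes));
    }
    ```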

    ---

    > That said, I think this issue could use a better title, because I would intuitively expect a large static array to use at least a linear-ish amount of memory when compiling.

    It's not a bad intuition, but at no point do you actually have a linear amount of entropy: you can represent the array as "1 aka 1u32 aka 01 00 00 00, etc. repeated 2²⁵ times" all the way down to emitting the binary (which you don't have to do in memory either).
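    A minimal sketch of that last point, with a made-up helper name: the payload of [1u32; 1 << 25] can be streamed to any writer as a 4-byte pattern repeated 2²⁵ times, so the full 128 MiB never has to exist in memory at once.

    ```rust
    use std::io::{self, Write};

    /// Emit `pattern` repeated `count` times without ever building the
    /// full buffer in memory.
    fn emit_repeated<W: Write>(out: &mut W, pattern: &[u8], count: usize) -> io::Result<()> {
        for _ in 0..count {
            out.write_all(pattern)?;
        }
        Ok(())
    }

    fn main() -> io::Result<()> {
        // Stream the issue's 2^25 copies of "01 00 00 00" into a sink;
        // peak memory stays at the 4-byte pattern.
        let mut sink = io::sink();
        emit_repeated(&mut sink, &1u32.to_le_bytes(), 1 << 25)?;

        // Small check that the bytes come out right:
        let mut buf = Vec::new();
        emit_repeated(&mut buf, &1u32.to_le_bytes(), 3)?;
        assert_eq!(buf, [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]);
        println!("ok");
        Ok(())
    }
    ```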

    Eduard-Mihai Burtescu at 2020-02-20 16:08:03