2.pow(n) does not optimize well
A mibbit user noted in #rust-beginners that the integer pow method with a base of two generates far worse code than a shift: https://godbolt.org/g/mE1wjH (the shift is not equivalent for n >= 64, but you can see that it would still be better code if you accounted for that)
Perhaps we can special case the base 2 and use a shift in that case? One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled, and I don't know of a way to branch depending on whether they are enabled (which is required for this optimization to not change behavior).
Alternatively, better LLVM optimizations could help, but this might be a bit too much to ask.
LLVM doesn’t optimise even a
pub fn dopow(mut exp: u32) -> u64 { let mut acc = 1; while exp > 0 { acc = acc * 2; exp -= 1; } acc }so I don’t see it being able to optimise the more complicated algorithm used currently.
Simonas Kazlauskas at 2018-01-06 20:19:23
Sooo... that means that there should be a MIR optimisation that checks whether the function is
libcore::num::<sth>::powand then checks whether the base is a constant expression and a power of two and if is, it should replace the expression by a shift instead?Or is this rather solved on the llvm side by having a pow intrinsic that works on integers?
est31 at 2018-01-06 20:24:06
A MIR pass would be a pretty big hammer. We'd have to make this random library function specially recognized by the compiler. Also the implementation effort wouldn't be justified for a one-off optimization.
I think the closest thing to a practical solution would be changing the library code that implements
pow. The problem with that is that it can't match the current algorithm wrt overflow checks. But that doesn't seem like it's unique to this function, surely other code runs into the same problem? Might be worth adding a way to query the status of overflow checks. Historicallycfg!(debug_assertion)was that way (if you ignored-Zflags), but now we also have-C overflow-checks.Hanna Kruppe at 2018-01-06 20:44:48
As of 1.53.0 and today's nightly the bug still occurs: https://godbolt.org/z/Kjn4odTdh
Valentin at 2021-07-27 19:41:55
As of 1.74.0, Rust still isn't able to optimize
2u32.wrapping_pow(u). However, since 1.60.0, it is able to optimize1u32.wrapping_pow(u).If the leading zeroes of the base are removed before exponentiation with a right shift and then added back with a left shift after exponentiation, then the
2u32.wrapping_pow(u)becomes a1u32.wrapping_pow(u)and therefor can be optimized. In fact, it gives equivalent assembly to1u32.checked_shl(u).unwrap_or_(0)However, removing the zeroes only significantly helps in with special cases like a constant power of two, so it isn't worth implementing.Here is an example. This wasn't thoroughly checked for logic errors: Compiler Explorer link
Nic at 2023-12-15 04:25:58
https://github.com/rust-lang/rust/pull/114390 is the fix for this issue, but it doesn't have recent activity.
Nikita Popov at 2023-12-15 08:31:59
#114390 is the fix for this issue, but it doesn't have recent activity.
@nikic Should I take it over or should I give the pull requester more time?
Nic at 2023-12-16 19:02:01
One complication is that pow is subject to overflow checks: Whether it panics or returns 0 for excessive exponents depends on whether overflow checks are enabled,
We can fix that now that we have
#[track_caller]. We will be further able to improve this further once we have#[cfg(overflow_checks)].Nic at 2024-01-18 01:57:22
PR has been reverted due to #120537 . Perhaps this issue could be reopened ?
Yuri Gribov at 2025-03-12 16:23:00