Inconsistent inlining of Iterator Adaptors - Missed Optimizations
While profiling some of my Rust code, I noticed that the following pattern does not optimize well:
vec![1,2,3,4]
.into_iter()
.map(|v| ...)
.skip_while(|v| ...)
skip_while is implemented using find and find is implemented using try_fold. The functions SkipWhile::next() and Iterator::find() use the #[inline] annotation. The function Map::try_fold() does not. This means that Map::try_fold() will not be inlined.
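The delegation chain described above can be sketched with a simplified stand-in for the adaptor. This is not the standard library source: MySkipWhile and skip_small are hypothetical names, and the closures are placeholders for the elided ones in the pattern. The point is only that next() forwards to find() on the inner iterator, so whether the inner adaptor's find/try_fold gets inlined decides how the whole chain is compiled.

```rust
// Hypothetical, simplified stand-in for a skip_while adaptor.
// It is NOT the real library code; it only mirrors the call chain.
pub struct MySkipWhile<I, P> {
    iter: I,
    flag: bool, // becomes true once the predicate has failed for the first time
    predicate: P,
}

impl<I, P> Iterator for MySkipWhile<I, P>
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<I::Item> {
        let flag = &mut self.flag;
        let pred = &mut self.predicate;
        // Delegates to the inner iterator's `find`. For an inner `Map`,
        // `find` in turn runs through that adaptor's `try_fold`.
        self.iter.find(move |x| {
            if *flag || !pred(x) {
                *flag = true;
                true
            } else {
                false
            }
        })
    }
}

// Hypothetical driver mirroring the pattern from the report, with
// placeholder closures: double every value, then skip values below 5.
pub fn skip_small(values: Vec<i32>) -> Vec<i32> {
    let mapped = values.into_iter().map(|v| v * 2);
    MySkipWhile {
        iter: mapped,
        flag: false,
        predicate: |v: &i32| *v < 5,
    }
    .collect()
}
```

Calling skip_small(vec![1, 2, 3, 4]) walks the same map-then-skip chain as the snippet in the report.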
I started looking at the source code, and the inlining of iterators does not seem to follow any rule. I could not find any bug reports related to this.
Some iterators like Cloned do not have any function marked as inline. Not even next() is marked as inline.
The PR introducing try_fold does not give a justification for why some try_fold implementations are marked inline and some are not.
The methods len and is_empty of ExactSizeIterator are also not marked as inlinable, even though they are always implemented as pass-throughs to the underlying iterator.
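For illustration, a pass-through len with the annotation might look like the sketch below. Passthrough and remaining are hypothetical names, not library types, and only len is shown. The attribute is a hint, so this shows where it would go rather than proving any effect on the generated code.

```rust
// Hypothetical wrapper that forwards everything to the inner iterator.
pub struct Passthrough<I> {
    inner: I,
}

impl<I: Iterator> Iterator for Passthrough<I> {
    type Item = I::Item;

    #[inline]
    fn next(&mut self) -> Option<I::Item> {
        self.inner.next()
    }

    #[inline]
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl<I: ExactSizeIterator> ExactSizeIterator for Passthrough<I> {
    // Pure pass-through: the annotation asks the compiler to inline the
    // forwarding call so only the inner `len` remains.
    #[inline]
    fn len(&self) -> usize {
        self.inner.len()
    }
}

// Hypothetical helper: number of elements left after consuming one.
pub fn remaining(values: &[u8]) -> usize {
    let mut it = Passthrough { inner: values.iter() };
    it.next();
    it.len()
}
```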
If desired, I can prepare a pull request to mark those functions as inlinable. Is there a list of functions for the iterator traits (e.g., Iterator, ExactSizeIterator) which should be inline/not be inline?
Note that all iterator adaptors are generic and therefore instantiated in the crate using them, making them available for inlining in principle. Adding #[inline] in those cases just gives a hint to be slightly more aggressive about inlining than the default. Codegen units + absence of ThinLTO might change this to some degree, not sure what's the current interaction between CGUs and #[inline].
We should still be consistent about how we use #[inline], and there might very well be regressions in some cases caused by absence of #[inline]. But generally the attribute is not required to make a function inlinable.
Hanna Kruppe at 2018-01-15 19:36:17
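A minimal sketch of the distinction drawn in this comment, with hypothetical function names: the generic function is instantiated in whichever crate calls it, so its body is visible to the optimizer there with or without the attribute. For the non-generic function, #[inline] is the conventional way to ask that the body also be made available to other crates. In neither case is inlining guaranteed.

```rust
// Generic: instantiated in the calling crate for each concrete `F`,
// so the body is available for inlining there even without an attribute.
pub fn apply_twice<F: Fn(u32) -> u32>(f: F, x: u32) -> u32 {
    f(f(x))
}

// Non-generic: the attribute is a hint, and it also asks for the body to be
// made available to downstream crates. It does not force inlining.
#[inline]
pub fn add_one(x: u32) -> u32 {
    x + 1
}

// Hypothetical caller combining the two.
pub fn demo(x: u32) -> u32 {
    apply_twice(add_one, x)
}
```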
Is there an improvement in the generated instructions if you annotate try_fold with #[inline]?
varkor at 2018-01-15 20:08:25
@varkor I do not have a minimal example or a good test case where I could compare the assembly. I just noticed that try_fold does not show up in the profile anymore, meaning it got inlined.
@rkruppe You make a good point. I forgot that generic functions are always inlinable. I got fooled by the results.
Jonas Bushart at 2018-01-15 22:29:17
I had no deep logic for marking or not marking try_fold methods with #[inline]. Roughly I had it there when the corresponding fold had it, but that's it.
@jonasbb I see "anymore" in your comment. Did the behaviour here change in a recent nightly?
scottmcm at 2018-01-16 00:16:00
@scottmcm No, it is unrelated to nightly changes. For other reasons I already have xargo set up and just played around with adding inline annotations and reprofiling.
Jonas Bushart at 2018-01-16 07:28:37
I can confirm that I have this issue on rustc 1.62.0-nightly (cb1219871 2022-05-08): code with iter.copied().map(|x| something(x)) produced bad assembly while iter.map(|x| something(*x)) was properly inlined. I've tried to create an MRE but I wasn't able to reproduce it there. I'm not an expert on compiler optimizations but I can confirm that I witnessed this behavior in the wild.
Although I failed to create an MRE, I can provide the function source code and the generated assembly if needed. I suspect it has something to do with the inlining budget or some weights, but I'm pretty sure I don't want to have iterators in the resulting release assembly.
Psilon at 2022-05-15 16:09:24
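As a possible starting point for comparing the two forms, the sketch below puts them side by side. It is not a reproduction of the reported behavior: something is a placeholder body, and sum_copied and sum_deref are hypothetical names. The #[inline(never)] attributes are there only so each function keeps its own symbol, which makes the output of rustc --emit asm easier to compare.

```rust
// Placeholder for the real per-element work; the body is an assumption.
fn something(x: u32) -> u32 {
    x.wrapping_mul(3).wrapping_add(1)
}

// Form reported as producing bad assembly: copy first, then map.
#[inline(never)]
pub fn sum_copied(data: &[u32]) -> u32 {
    data.iter()
        .copied()
        .map(|x| something(x))
        .fold(0u32, u32::wrapping_add)
}

// Form reported as properly inlined: dereference inside the closure.
#[inline(never)]
pub fn sum_deref(data: &[u32]) -> u32 {
    data.iter()
        .map(|x| something(*x))
        .fold(0u32, u32::wrapping_add)
}
```

Both functions compute the same value, so any difference in the emitted code comes from how the adaptor chain was compiled.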