Range* should overrride more methods of Iterator

9e96b96
Opened by Marco A L Barbosa at 2019-05-03 12:58:55

Like nth, last, count, max, min,... maybe even sum and product (using a direct formula).

  1. I don't see the use case for nth, last, max, min. Sum and product might be good to have. I'll try to implement them.

    tvladyslav at 2017-03-05 20:32:25

  2. @crypto-universe The use case is in generic code. Sometimes a range is used in generic code that use these methods, it they are implemented efficiently it is better.

    Marco A L Barbosa at 2017-03-06 11:29:16

  3. All ranges that can provide iterator implementations do so today; this excludes RangeFull, RangeTo, and RangeToInclusive. However, these don't make sense as iterators (all don't have start points). As such, I'm closing.

    Mark Rousskov at 2017-05-26 23:54:33

  4. @Mark-Simulacrum This issue is about overriding methods of the iterator trait. For example, max can be implemented efficient for Range, but it is not, it uses the default implementation.

    Marco A L Barbosa at 2017-05-30 12:42:48

  5. Oh, I misinterpreted then. Reopening.

    Mark Rousskov at 2017-05-30 13:32:10

  6. This is more important now that https://github.com/rust-lang/rust/pull/41439 deperecated Range::step_by in favor of Iteartor::step_by, and the later returns an iterator wrapper that calls Iterator::nth on its inner iterator.

    CC @ivandardi, @BurntSushi, @alexcrichton, @nagisa

    Simon Sapin at 2017-06-05 12:32:43

  7. https://github.com/rust-lang/rust/issues/43077 does nth.

    Simon Sapin at 2017-07-05 23:33:10

  8. https://github.com/rust-lang/rust/pull/47180 adds last, min and max to Range and RangeInclusive. Adding sum led to type-inference issues, which I'm still investigating. I didn't add count because it also requires specialisation only for integer types, which I imagined would hit the same problems as sum, but I may try for a follow-up PR.

    Edit: count has the same type inference issues as sum.

    varkor at 2018-01-11 18:49:18

  9. For anyone implementing something here, consider overriding try_fold and try_rfold, as that should make it much easier for LLVM to simplify all the loops without needing to extend Step (especially for RangeInclusive, as its next is a bit more complex, which may explain why https://github.com/rust-lang/rust/pull/47180#issue-285974444 needed to override it by not the normal Range).

    Edit: try_fold is up in https://github.com/rust-lang/rust/pull/48012, which fixes (0..=n).sum()

    scottmcm at 2018-01-12 06:09:58

  10. This should probably be closed. #48012 seems to have resolved performance concerns for count, sum, and product. See #48024's closing comment.

    nth was merged with #43077, and last, min, max were merged with #47180. That's everything suggested in this issue.

    Phlosioneer at 2018-03-07 04:07:20

  11. sum on an integer range is an O(1) operation with the formula for triangular numbers (1..n).sum() == n * (n+1)/2. It just needs to be a bit careful about overflow because division is not compatible with modular arithmetic. In general: (n / d) % M != (n % M)/(d % M) % M. If n*(n+1) overflows, the result is wrong. Either n+1 or n needs to be divided by 2 before multiplication, depending on which one is even.

    (0..).sum() and (0..).product() could panic immediately, if the overflow is considered a bug. They would otherwise loop forever in release mode.

    Emerentius at 2018-06-18 17:08:47

  12. Is there any use to calling sum on a Range?

    Simon Sapin at 2018-06-18 17:43:40

  13. Why, yes, I've used triangle numbers 3 times on Project Euler. No, seriously, I was just responding to people talking about performance issues on sum. It's too much code for the gain to specialize all integer types separately, even with macro (would be neat, though).

    For RangeFrom, it would just be 2*3 lines to panic, so.. maybe worth it.

    Emerentius at 2018-06-18 19:11:25

  14. Just to mention: I ran into type inference issues when I tried specialising for sum, so you'd have to work around that if you did want to specialise.

    varkor at 2018-06-18 19:46:25

  15. Note that LLVM already optimizes Range::sum to a closed-form solution, so there really isn't a need to specialize: https://play.rust-lang.org/?gist=631cf2e39b07d8d8e4f80013c8c235aa&mode=release

    You can tell by the lack of back-edges, and the 33-bit multiply and divide-by-two:

      %10 = mul i33 %6, %9
      %11 = lshr i33 %10, 1
    

    scottmcm at 2018-06-18 20:07:56

  16. One possible way to implement .count would be to short-circuit on Step::steps_between:

    pub fn range_count<A: Step>(range: std::ops::Range<A>) -> usize {
        if let Some(steps) = Step::steps_between(&range.start, &range.end) {
            steps
        } else {
            range.fold(0, |x, _| x + 1)
        }
    }
    

    With inlining, this works out nicely for integer types. And consider that None return in steps_between signals that the count did not fit into a usize. In that case, the current count should simply loop a very long time anways and the performance drop from the single conditional check may be negligable. For all current types, the branch of the if is also chosen statically and should thus be inlined during monomorphisation.

    Optimally, the no_between implementation for i128 or larger-than-pointer sized types could also be adjusted to test if the difference fits within a usize despite the missing static guarantee. Many differences coult still fit within that size even if their absolute values do not.

    Available at the playground.

    Aurelia Molzer at 2019-05-03 12:47:28