Add drain_filter for BinaryHeap

fbcf612
Opened by Ralf Jung at 2023-06-30 15:44:19

Subject says it all... Vec and HashMap have retain and drain, but BTreeMap does not.

  1. std::collections::BinaryHeap is missing a drain_filter method (rust-lang/rfcs#2140).

    <!-- TRIAGEBOT_START --> <!-- TRIAGEBOT_ASSIGN_START --> <!-- TRIAGEBOT_ASSIGN_DATA_START$${"user":null}$$TRIAGEBOT_ASSIGN_DATA_END --> <!-- TRIAGEBOT_ASSIGN_END --> <!-- TRIAGEBOT_END -->

    Mark Rousskov at 2018-06-04 20:24:15

  2. Sounds good to me. I agree that this part of the API would be good to keep consistent between HashMap/HashSet and BTreeMap/BTreeSet. Please submit a PR.

    David Tolnay at 2017-11-15 01:56:41

  3. An issue opened about 1 year ago. 17 thumbs up. Is this issue being addressed?

    nbro at 2018-06-04 19:11:29

  4. I don't believe there's any current in-progress work, but we're happy to take PRs. If you're interested, we can probably connect you with a mentor or provide some instructions.

    Mark Rousskov at 2018-06-04 19:15:55

  5. @Mark-Simulacrum Honestly, right now, I don't think I am qualified enough to address this issue. I would leave it someone else more qualified. Maybe, in the future, I may feel comfortable enough with Rust to address it.

    nbro at 2018-06-04 19:43:56

  6. If someone ever gets around to this, it would be a shame to repeat the mistake of passing &T like in the Vec retain implementation. These methods should pass &mut T to the closure, like the HashMap implementation does.

    Ryan Wiedemann at 2018-07-29 02:51:15

  7. @Mark-Simulacrum I could take a look to BtreeMap retain(), if there is some guidance/mentoring available. As a first Rust lang PR it may be otherwise a bit too steep effort.

    jq-rs at 2018-08-21 10:41:33

  8. cc @rust-lang/libs would someone be able to provide some mentoring for @jq-rs?

    Mark Rousskov at 2018-08-21 16:45:22

  9. Getting nowhere so far, so I'll pass this to future volunteers.

    jq-rs at 2018-09-13 18:22:59

  10. BinaryHeap now has drain, although for some reason it removes elements in arbitrary order (https://github.com/rust-lang/rust/issues/59278) :/

    Jon Gjengset at 2019-03-18 16:36:10

  11. [...] it removes elements in arbitrary order (#59278) :/

    <strike>Is that really a problem? I can't think of a situation where you want to remove elements in order.</strike>

    Ah sure, it's about iterating in a specific order, not removing (was confused with retain)

    Marcel Hellwig at 2019-03-18 18:35:31

  12. I mean, it's a heap... If I want to re-use the heap, it seems pretty reasonable to use .drain to walk the heap rather than a while let Some(_) = heap.pop loop. I think this is especially surprising for IntoIterator -- it'd be like a BTreeMap yielding elements in a random order when you do for (k, v) in btreemap.

    Jon Gjengset at 2019-03-18 18:43:55

  13. Is it acceptable to implement this using brute force and then redo it later for performance gains? Just do something like:

    let this = mem::replace(self, Default::default());
    *self = this.into_iter().filter(...).collect();
    

    Chris Gregory at 2019-05-25 05:45:08

  14. Regarding BTreeMap/Set::drain, it looks like most of the implementation can be derived from IntoIter (apart from emptying the map instead of mem::forget-ing it). I can give it a try if that's the case.

    Guillaume E at 2019-08-26 19:45:41

  15. Subject says it all...

    Actually, not to me... which drain? Vec & VecDeque have a range argument; HashMap and BinaryHeap don't (because there is no order or it's not exploitable). BTreeMap has order and exposes range concepts already.

    I assumed a range argument was implied, and I believe @arthurprs's comment reveals the same assumption. But then, even if you know how to sew up the remaining tree, you can't simply adjust its length because there is no length concept on ranges.

    If drain without argument is what we're after, then I still don't quite understand @gendx's wording in the last comment. Isn't a plain drain always the same as the IntoIterator implementation, apart from the cleanup of self? And in this case, since that cleanup happens in into_iter and outside of IntoIter, can we not just use IntoIter as is, as in my attempt?

    Stein Somers at 2019-11-19 13:00:08

  16. I was thinking about the no-range-argument case. There might exist a clever optimization for drain, but indeed given how easy it is to derive drain from into_iter it's unlikely that such optimization wouldn't apply to into_iter as well. So your attempt looks good! (At least from a high-level perspective)

    As for the sub-range case, that would definitely be a different and more complex implementation.

    Guillaume E at 2019-11-19 22:18:11

  17. It turns out that a drain without range is trivial. An excerpt from simple code on playground comparing to an existing drain:

            for i in v[0].drain(..) {}
            for i in std::mem::replace(&mut m[0], BTreeSet::new()).into_iter() {}
    

    So I think it's only worth the trouble if it operates on a range.

    Secondly, I wondered why retain doesn't take a range either, but that's what drain_filter is for (the #59618 mentioned above). It might die like #1353 did, but more likely it means retain will be useless and we want drain_filter instead.

    Stein Somers at 2019-12-05 10:23:29

  18. Actually, thanks to #61129, in the 1.40 beta version of Rust, the above alternative to drain can be shortened to:

    std::mem::take(&mut m[0]).into_iter()
    

    Stein Somers at 2019-12-15 19:46:42

  19. All but two of the methods originally requested here have since been implemented. I'm splitting this issue so that we have one ticket for per method. This ticket is for BinaryHeap::drain_filter.

    BTreeMap/BTreeSet::drain_filter are tracked at #70530. BinaryHeap::retain is tracked at #71503. BTreeMap/BTreeSet::retain are part of rust-lang/rfcs#1338.

    Matt Brubeck at 2020-09-08 18:16:55

  20. Should BinaryHeap::drain_filter be sorted or not? BinaryHeap::drain isn't sorted, but there's an equivalent, BinaryHeap::drain_sorted, which is.

    inquisitivecrystal at 2021-07-22 08:11:52

  21. @rustbot claim

    Mike Leany at 2022-01-20 04:08:06

  22. @rustbot claim

    dis-da-mor at 2022-11-05 18:54:42

  23. Some questions from implementing this feature:

    1. Should the closure also accept an &mut T? An item could be mutated to violate the heap order. Two possible solutions:
      1. Change the closure to accept an & T. This would mean one less feature and make this implementation inconsistent with others. But it would avoid the complexity of 2.
      2. Change the closure to accept a struct that behaves similarly to PeekMut<T>. If the struct is deref'd mut to an &mut T then on drop it will ensure the item is sorted correctly on the heap. This increases potential complexity in implementation and for the end user, as instead of accessing the value directly they need to deref it.
    2. Should the items be given to the closure in order or out of order? Or should we follow the route of drain and have drain_filter and drain_filter_sorted?

    dis-da-mor at 2022-11-13 06:23:54