API convention for blocking-, timeout-, and/or deadline-related functions

d167acd
Opened by Julien Cretin at 2023-10-26 15:59:35

The standard library currently exposes several blocking- and/or timeout-related functions:

Function \ Versions | Blocking | Timeout (ms) | Timeout --- | --- | --- | --- std::sync::Condvar::wait* | wait | wait_timeout_ms | wait_timeout std::sync::mpsc::Receiver::recv* | recv | none | recv_timeout std::thread::park* | park | park_timeout_ms | park_timeout std::thread::sleep* | none | sleep_ms | sleep

The timeout versions taking a u32 in milliseconds are actually deprecated for the version taking a Duration since 1.6.0.

This issue tracks the possibility to extend these APIs and provide a convention for blocking-, timeout-, and/or deadline-related functions. The current suggestion is:

Function \ Versions | Blocking | Timeout | Deadline --- | --- | --- | --- std::sync::Condvar::wait* | wait | wait_timeout | wait_deadline std::sync::mpsc::Receiver::recv* | recv | recv_timeout | recv_deadline std::thread::park* | park | park_timeout | park_deadline std::thread::sleep* | none | sleep_for | sleep_until

The blocking versions do not take any extra argument and are not suffixed. The timeout versions take a timeout as a Duration and return if this timeout is reached (the timeout starts when the function is called with best-effort precision). They are suffixed by _timeout. The deadline versions take a deadline as an Instant and return if this deadline is reached (the deadline precision is best-effort). They are suffixed by _deadline.

For functions that do not have a meaningful blocking version (like sleep which would essentially block until the program ends), the timeout version would be suffixed by _for and the deadline version would be suffixed by _until. We don't have enough data-points to see if this rule is actually applicable. In a first iteration, we could leave aside those functions that do not have a meaningful blocking version.

  1. @alexcrichton, as requested in #45969, I'm cc-ing you here to discuss which API we want for deadline-related functions.

    Julien Cretin at 2017-11-27 21:38:00

  2. parking-lot uses wait_until which I like, though recv_until doesn't really make sense as a name.

    Steven Fackler at 2017-11-28 04:24:21

  3. Thanks @sfackler for the parking-lot example. This example actually conflicts with the standard library versions and extends the "no meaningful blocking version" suggestion above to functions that actually have a blocking version.

    If I reformulate the suggestion according to this information, it would be:

    • If the primary role of the function is to do something and this thing may accidentally block (e.g. recv, send, wait, or park), then use _timeout and _deadline since the name of the function is not related to blocking.
    • If the primary role of the function is to block (e.g. sleep, wait, or park), then use _for and _until since the name of the function is related to blocking.

    These rules overlap which is unfortunate, but this is maybe the freedom we need to account for the current state of naming by interpreting the primary role of a function differently. For example, for the standard library, the main role of Condvar::wait is to ensure we received a notification. The timeout and deadline parameters are potential accessories to avoid blocking too long. While for parking-lot, the main role of Condvar::wait is to block until we receive a notification. The timeout and deadline parameters are additional ways of unblocking. We could probably argue about the fact that if a timeout or deadline version needs to return whether the timeout or deadline was reached (and this information is semantically important), then it probably means that blocking is not the main role of the function. But going this way would probably only assign sleep-like functions to the _for/_until bucket, so it's probably a too strong constraint.

    Just to clarify, with the current suggestion (and the first one which was very similar), recv_until is not an option since recv is doing something specific (receiving a message) and it only blocks by accident (this is not its primary role).

    Here is the current suggestion (and whether it differs from the current state):

    Action function \ Versions | Blocking | Timeout | Deadline | Has diff --- | --- | --- | --- | --- std::sync::Condvar::wait* | wait | wait_timeout | wait_deadline | no std::sync::mpsc::Receiver::recv* | recv | recv_timeout | recv_deadline | no std::thread::park* | park | park_timeout | park_deadline | no

    Blocking function \ Versions | Blocking | Timeout | Deadline | Has diff --- | --- | --- | --- | --- parking_lot::Condvar::wait* | wait | wait_for | wait_until | no std::thread::sleep* | none | sleep_for | sleep_until | yes

    Julien Cretin at 2017-11-28 16:23:44

  4. I took the time to go over the existing crates by downloading the latest version of each crate and grepping for public functions taking an Instant as argument (there are too many Duration arguments and we are interested in functions that come with duration and instant versions). The main noticeable facts are:

    • The notion of non-blocking functions (in addition to the current blocking, duration, and instant versions) that I forgot to take into account.
    • The usage of try_ to switch from _timeout/_deadline to _for/_until.
    • The usage of _at (in addition to _until and _deadline) for instant versions.

    I changed the terminology for timeout and deadline versions to duration and instant versions to reflect the types and avoid possible biaises.

    Here are some examples according to the 4 versions (blocking, non-blocking, duration, and instant):

    Context | Blocking | Non-blocking | Duration | Instant --- | --- | --- | --- | --- sqa_engine::sync::AudioThreadHandle | recv | try_recv | wait_for | wait_until parking_lot::Mutex | lock | try_lock | try_lock_for | try_lock_until parking_lot::RwLock | read | try_read | try_read_for | try_read_until std::sync::mpsc::Receiver | recv | try_recv | recv_timeout | (recv_deadline) std::sync::Condvar | wait | | wait_timeout | (wait_deadline) parking_lot::Condvar | wait | | wait_for | wait_until std::thread | | | sleep_for | sleep_until

    Using try_ for the non-blocking version seems to always be verified. Then we have 3 different possibilities for the duration and instant versions (sorted according to my preferences):

    • Keep the try_ prefix and use _for and _until.
    • Use _timeout and _deadline.
    • Use another verb that works with _for and _until.

    We see some usage of _at, but these usages seem to not be related to blocking functions, but simply to time functions. In particular there are only duration and instant versions (there are no blocking or non-blocking versions). Here are some examples (with my proposal):

    Context | Duration | Instant --- | --- | --- future_utils::Timeout | new | new_at tk_easyloop | timeout | timeout_at tokio_core::reactor::Interval | new | new_at tokio_timer::Timer | interval | interval_at proposal | do_stuff_in | do_stuff_at

    Some outsiders:

    Context | Duration | Instant --- | --- | --- task_scheduler::Scheduler | after_duration | after_instant freertos_rs::TaskDelay | delay_until(?) |

    Julien Cretin at 2017-11-30 22:09:33

  5. It's worth pointing out that there may also just not be a convention to be had here. There's certainly not an "obvious agreement" amongst everything today it seems like and I wouldn't want to bend over too much to try to fit things in!

    That being said I do sort of like the _until suffix for Instant-taking operations. Even something like recv_until sort of makes sense to me if you sit and think about it for a second. That may be a bit of a pipe dream though :)

    Alex Crichton at 2017-12-01 16:24:31

  6. This thread suggests renaming std::thread::sleep, but that function is already stable. Is the naming consistency worth introducing a new function and deprecating the old one? (The old one would have to stay deprecated-but-stable "forever".)

    This issue is now also the tracking issue for std::sync::mpsc::Receiver::recv_deadline which landed in Nightly on 2017-11-29. However it seems like this issue also discusses other APIs that are not implemented (yet?)

    Simon Sapin at 2018-03-17 16:40:55

  7. I created this issue in response to https://github.com/rust-lang/rust/pull/45969#issuecomment-347282846. Maybe we should split it between the tracking issue of std::sync::mpsc::Receiver::recv_deadline and a possible convention for blocking functions. We could also decide to forget about a possible convention (because it seems like it is too late to do this kind of work since it would imply having deprecated-but-stable or duplicated APIs) and just keep this issue as a tracking issue.

    What is the timeline for a tracking issue? Should we keep it open until the feature goes stable? Is there any time constraints about this?

    Julien Cretin at 2018-03-17 18:07:14

  8. There is no time constraint. A tracking issue stays open until the corresponding feature becomes stable (in the sense of: does not require #[feature(foo)] to opt in) or is deprecated in the master branch. That change usually reaches the Nightly channel the next day, and the Stable 6 to 12 weeks later.

    Simon Sapin at 2018-03-17 20:35:13

  9. Now that crossbeam has been ported to the std library, couldn't the recv_deadline method in mpsc be stabilized? It's only a wrapper over the same method in crossbeam. https://doc.rust-lang.org/src/std/sync/mpsc/mod.rs.html#980-982

    Bruno França at 2023-03-18 02:43:25

  10. I've mentioned this in a couple of places but, since this seems to be the issue that has the most focus on *_until functions, I want to reiterate it here: Implementing any *_until functions in terms of the matching *_timeout functions are fundamentally buggy in a way that won't matter for most applications but are critical for others.

    Pulling my example from this issue:

    pub fn sleep_until(deadline: Instant) {
        let now = Instant::now();
        
       // What if the thread is suspended here? If that happens, you are no longer sleeping until `deadline` but
       // are instead sleeping until `deadline + epsilon` which is not universally acceptable.
        
        if let Some(delay) = deadline.checked_duration_since(now) {
            thread::sleep(delay);        
        }
    }
    

    Some of the functions described in this issue are likely too high-level to be used by applications which care about this level of accuracy (e.g., I would be surprised if such an application used the mpsc implementation) but several of the more primitive functions should avoid this bug where possible. It really sucks to have to drop to libc to get the correct behavior here.

    Nate Kent at 2023-10-24 22:19:39

  11. How is that functionally different from the OS not scheduling your thread for some period of time after the timer triggers, or the thread getting suspended immediately after being woken up? Or for that matter if the thread is suspended immediately before the call to sleep_until?

    On a per-emptively scheduled OS, I don't think there is any way to guarantee your thread is awoken precisely at a given time.

    Thayne McCombs at 2023-10-25 06:08:35

  12. I think a better question is: why not just do it correctly? What is the appeal of writing an incorrect version when the OSes make it so easy to do it correctly?

    But, to use a more complex (but more accurate) example:


    On Unix, sleeping for a duration is implemented via nanosleep in a loop, where the remaining time is returned by the call and used as the remaining time if the sleep is interrupted. Quoting the Linux Programming Interface on this exact situation (emphasis mine):

    Although nanosleep() allows nanosecond precision when specifying the sleep interval, the accuracy of the sleep interval is limited to the granularity of the software clock (Section 10.6). If we specify an interval that is not a multiple of the software clock, then the interval is rounded up. This rounding behavior means that if signals are received at a high rate, then there is a problem with the approach employed in the program in Listing 23-3. The problem is that each restart of nanosleep() will be subject to rounding errors, since the returned remain time is unlikely to be an exact multiple of the granularity of the software clock. Consequently, each restarted nanosleep() will sleep longer than the value returned in remain by the previous call. In the case of an extremely high rate of signal delivery (i.e., as or more frequent than the software clock granularity), the process may never be able to complete its sleep. On Linux 2.6, this problem can be avoided by making use of clock_nanosleep() with the TIMER_ABSTIME option.

    Note that clock_nanosleep is POSIX, not just Linux.

    So, right there on the surface, we have one type of program where implementing sleep_until in terms of sleep_timeout will break the program. Is it a typical situation? No. But as far as I'm aware, Rust is a systems programming language and, in my opinion, it would be silly to make Rust unsuitable for a systems program simply because people didn't want to write slightly more complex code.

    In my specific use case, I'm running robotic control on a fairly underpowered SBC. I am not so low level as to be embedded (i.e., I do have an OS) but I have done everything in my power to make sure that the thread is going to wake up at exactly the specified time (e.g., I have completely removed one of the cores from scheduling and have dedicated it to my control loop, among other things). This control loop is vaguely similar to the following:

    fn run() {
      let deadline = Instant::now() + INTERVAL;
      loop {
        do_quick_calculation_and_set_pins();
        thread::sleep_until(deadline);
        deadline += INTERVAL;
      }
    }
    

    Looking at that code, I am explicitly avoiding calls to Instant::now because, even though its not a huge cost to pay, I don't need to pay it. But, if implemented the way shown above, suddenly there's a call to Instant::now going on behind my back. Does it matter as far as the accuracy of my loop goes? Probably not. But implementing it the correct way is so trivially easy that I'm pulling in libc because the standard library didn't implement it correctly for some reason.

    tl;dr: It's so trivially easy to do this correctly, why go out of your way to make Rust unsuitable for niche (but valid) programs?

    Nate Kent at 2023-10-25 15:53:03

  13. I'm not saying it shouldn't be done correctly. My point is that it doesn't completely solve the problem of your thread getting interrupted at an inopportune time.

    Thayne McCombs at 2023-10-25 18:45:02

  14. Of course. That wasn't meant to be a definitive guide on how to wake up as accurately as possible, just an easy to grok example.

    Nate Kent at 2023-10-26 15:59:35