The Rust Programming Language
Read original on doc.rust-lang.orgA Closer Look at the Traits for Async
Throughout the chapter, weâve used the Future, Stream, and StreamExt
traits in various ways. So far, though, weâve avoided getting too far into the
details of how they work or how they fit together, which is fine most of the
time for your day-to-day Rust work. Sometimes, though, youâll encounter
situations where youâll need to understand a few more of these traitsâ details,
along with the Pin type and the Unpin trait. In this section, weâll dig in
just enough to help in those scenarios, still leaving the really deep dive
for other documentation.
The Future Trait
Letâs start by taking a closer look at how the Future trait works. Hereâs how
Rust defines it:
#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::task::{Context, Poll};
pub trait Future {
type Output;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
}
That trait definition includes a bunch of new types and also some syntax we havenât seen before, so letâs walk through the definition piece by piece.
First, Futureâs associated type Output says what the future resolves to.
This is analogous to the Item associated type for the Iterator trait.
Second, Future has the poll method, which takes a special Pin reference
for its self parameter and a mutable reference to a Context type, and
returns a Poll<Self::Output>. Weâll talk more about Pin and Context in a
moment. For now, letâs focus on what the method returns, the Poll type:
#![allow(unused)]
fn main() {
pub enum Poll<T> {
Ready(T),
Pending,
}
}
This Poll type is similar to an Option. It has one variant that has a value,
Ready(T), and one that does not, Pending. Poll means something quite
different from Option, though! The Pending variant indicates that the future
still has work to do, so the caller will need to check again later. The Ready
variant indicates that the Future has finished its work and the T value is
available.
When you see code that uses await, Rust compiles it under the hood to code
that calls poll. If you look back at Listing 17-4, where we printed out the
page title for a single URL once it resolved, Rust compiles it into something
kind of (although not exactly) like this:
match page_title(url).poll() {
Ready(page_title) => match page_title {
Some(title) => println!("The title for {url} was {title}"),
None => println!("{url} had no title"),
}
Pending => {
// But what goes here?
}
}
What should we do when the future is still Pending? We need some way to try
again, and again, and again, until the future is finally ready. In other words,
we need a loop:
let mut page_title_fut = page_title(url);
loop {
match page_title_fut.poll() {
Ready(value) => match page_title {
Some(title) => println!("The title for {url} was {title}"),
None => println!("{url} had no title"),
}
Pending => {
// continue
}
}
}
If Rust compiled it to exactly that code, though, every await would be
blockingâexactly the opposite of what we were going for! Instead, Rust ensures
that the loop can hand off control to something that can pause work on this
future to work on other futures and then check this one again later. As weâve
seen, that something is an async runtime, and this scheduling and coordination
work is one of its main jobs.
In the âSending Data Between Two Tasks Using Message
Passingâ section, we described waiting on
rx.recv. The recv call returns a future, and awaiting the future polls it.
We noted that a runtime will pause the future until itâs ready with either
Some(message) or None when the channel closes. With our deeper
understanding of the Future trait, and specifically Future::poll, we can
see how that works. The runtime knows the future isnât ready when it returns
Poll::Pending. Conversely, the runtime knows the future is ready and
advances it when poll returns Poll::Ready(Some(message)) or
Poll::Ready(None).
The exact details of how a runtime does that are beyond the scope of this book, but the key is to see the basic mechanics of futures: a runtime polls each future it is responsible for, putting the future back to sleep when it is not yet ready.
The Pin Type and the Unpin Trait
Back in Listing 17-13, we used the trpl::join! macro to await three
futures. However, itâs common to have a collection such as a vector containing
some number futures that wonât be known until runtime. Letâs change Listing
17-13 to the code in Listing 17-23 that puts the three futures into a vector
and calls the trpl::join_all function instead, which wonât compile yet.
We put each future within a Box to make them into trait objects, just as
we did in the âReturning Errors from runâ section in Chapter 12. (Weâll cover
trait objects in detail in Chapter 18.) Using trait objects lets us treat each
of the anonymous futures produced by these types as the same type, because all
of them implement the Future trait.
This might be surprising. After all, none of the async blocks returns anything,
so each one produces a Future<Output = ()>. Remember that Future is a
trait, though, and that the compiler creates a unique enum for each async
block, even when they have identical output types. Just as you canât put two
different handwritten structs in a Vec, you canât mix compiler-generated
enums.
Then we pass the collection of futures to the trpl::join_all function and
await the result. However, this doesnât compile; hereâs the relevant part of
the error messages.
error[E0277]: `dyn Future<Output = ()>` cannot be unpinned
--> src/main.rs:48:33
|
48 | trpl::join_all(futures).await;
| ^^^^^ the trait `Unpin` is not implemented for `dyn Future<Output = ()>`
|
= note: consider using the `pin!` macro
consider using `Box::pin` if you need to access the pinned value outside of the current scope
= note: required for `Box<dyn Future<Output = ()>>` to implement `Future`
note: required by a bound in `futures_util::future::join_all::JoinAll`
--> file:///home/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.30/src/future/join_all.rs:29:8
|
27 | pub struct JoinAll<F>
| ------- required by a bound in this struct
28 | where
29 | F: Future,
| ^^^^^^ required by this bound in `JoinAll`
The note in this error message tells us that we should use the pin! macro to
pin the values, which means putting them inside the Pin type that
guarantees the values wonât be moved in memory. The error message says pinning
is required because dyn Future<Output = ()> needs to implement the Unpin
trait and it currently does not.
The trpl::join_all function returns a struct called JoinAll. That struct is
generic over a type F, which is constrained to implement the Future trait.
Directly awaiting a future with await pins the future implicitly. Thatâs why
we donât need to use pin! everywhere we want to await futures.
However, weâre not directly awaiting a future here. Instead, we construct a new
future, JoinAll, by passing a collection of futures to the join_all function.
The signature for join_all requires that the types of the items in the
collection all implement the Future trait, and Box<T> implements Future
only if the T it wraps is a future that implements the Unpin trait.
Thatâs a lot to absorb! To really understand it, letâs dive a little further
into how the Future trait actually works, in particular around pinning. Look
again at the definition of the Future trait:
#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::task::{Context, Poll};
pub trait Future {
type Output;
// Required method
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
}
The cx parameter and its Context type are the key to how a runtime actually
knows when to check any given future while still being lazy. Again, the details
of how that works are beyond the scope of this chapter, and you generally only
need to think about this when writing a custom Future implementation. Weâll
focus instead on the type for self, as this is the first time weâve seen a
method where self has a type annotation. A type annotation for self works
like type annotations for other function parameters but with two key
differences:
- It tells Rust what type
selfmust be for the method to be called. - It canât be just any type. Itâs restricted to the type on which the method is
implemented, a reference or smart pointer to that type, or a
Pinwrapping a reference to that type.
Weâll see more on this syntax in Chapter 18. For now,
itâs enough to know that if we want to poll a future to check whether it is
Pending or Ready(Output), we need a Pin-wrapped mutable reference to the
type.
Pin is a wrapper for pointer-like types such as &, &mut, Box, and Rc.
(Technically, Pin works with types that implement the Deref or DerefMut
traits, but this is effectively equivalent to working only with references and
smart pointers.) Pin is not a pointer itself and doesnât have any behavior of
its own like Rc and Arc do with reference counting; itâs purely a tool the
compiler can use to enforce constraints on pointer usage.
Recalling that await is implemented in terms of calls to poll starts to
explain the error message we saw earlier, but that was in terms of Unpin, not
Pin. So how exactly does Pin relate to Unpin, and why does Future need
self to be in a Pin type to call poll?
Remember from earlier in this chapter that a series of await points in a future get compiled into a state machine, and the compiler makes sure that state machine follows all of Rustâs normal rules around safety, including borrowing and ownership. To make that work, Rust looks at what data is needed between one await point and either the next await point or the end of the async block. It then creates a corresponding variant in the compiled state machine. Each variant gets the access it needs to the data that will be used in that section of the source code, whether by taking ownership of that data or by getting a mutable or immutable reference to it.
So far, so good: if we get anything wrong about the ownership or references in
a given async block, the borrow checker will tell us. When we want to move
around the future that corresponds to that blockâlike moving it into a Vec to
pass to join_allâthings get trickier.
When we move a futureâwhether by pushing it into a data structure to use as an
iterator with join_all or by returning it from a functionâthat actually means
moving the state machine Rust creates for us. And unlike most other types in
Rust, the futures Rust creates for async blocks can end up with references to
themselves in the fields of any given variant, as shown in the simplified illustration in Figure 17-4.
By default, though, any object that has a reference to itself is unsafe to move, because references always point to the actual memory address of whatever they refer to (see Figure 17-5). If you move the data structure itself, those internal references will be left pointing to the old location. However, that memory location is now invalid. For one thing, its value will not be updated when you make changes to the data structure. For anotherâmore importantâthing, the computer is now free to reuse that memory for other purposes! You could end up reading completely unrelated data later.
Theoretically, the Rust compiler could try to update every reference to an object whenever it gets moved, but that could add a lot of performance overhead, especially if a whole web of references needs updating. If we could instead make sure the data structure in question doesnât move in memory, we wouldnât have to update any references. This is exactly what Rustâs borrow checker is for: in safe code, it prevents you from moving any item with an active reference to it.
Pin builds on that to give us the exact guarantee we need. When we pin a
value by wrapping a pointer to that value in Pin, it can no longer move. Thus,
if you have Pin<Box<SomeType>>, you actually pin the SomeType value, not
the Box pointer. Figure 17-6 illustrates this process.
In fact, the Box pointer can still move around freely. Remember: we care about
making sure the data ultimately being referenced stays in place. If a pointer
moves around, but the data it points to is in the same place, as in Figure
17-7, thereâs no potential problem. (As an independent exercise, look at the docs
for the types as well as the std::pin module and try to work out how youâd do
this with a Pin wrapping a Box.) The key is that the self-referential type
itself cannot move, because it is still pinned.
However, most types are perfectly safe to move around, even if they happen to be
behind a Pin pointer. We only need to think about pinning when items have
internal references. Primitive values such as numbers and Booleans are safe
because they obviously donât have any internal references.
Neither do most types you normally work with in Rust. You can move around
a Vec, for example, without worrying. Given what we have seen so far, if
you have a Pin<Vec<String>>, youâd have to do everything via the safe but
restrictive APIs provided by Pin, even though a Vec<String> is always safe
to move if there are no other references to it. We need a way to tell the
compiler that itâs fine to move items around in cases like thisâand thatâs
where Unpin comes into play.
Unpin is a marker trait, similar to the Send and Sync traits we saw in
Chapter 16, and thus has no functionality of its own. Marker traits exist only
to tell the compiler itâs safe to use the type implementing a given trait in a
particular context. Unpin informs the compiler that a given type does not
need to uphold any guarantees about whether the value in question can be safely
moved.
Just as with Send and Sync, the compiler implements Unpin automatically
for all types where it can prove it is safe. A special case, again similar to
Send and Sync, is where Unpin is not implemented for a type. The
notation for this is impl !Unpin for SomeType, where
SomeType is the name of a type that does need to uphold
those guarantees to be safe whenever a pointer to that type is used in a Pin.
In other words, there are two things to keep in mind about the relationship
between Pin and Unpin. First, Unpin is the ânormalâ case, and !Unpin is
the special case. Second, whether a type implements Unpin or !Unpin only
matters when youâre using a pinned pointer to that type like Pin<&mut SomeType>.
To make that concrete, think about a String: it has a length and the Unicode
characters that make it up. We can wrap a String in Pin, as seen in Figure
17-8. However, String automatically implements Unpin, as do most other types
in Rust.
As a result, we can do things that would be illegal if String implemented
!Unpin instead, such as replacing one string with another at the exact same
location in memory as in Figure 17-9. This doesnât violate the Pin contract,
because String has no internal references that make it unsafe to move around.
That is precisely why it implements Unpin rather than !Unpin.
Now we know enough to understand the errors reported for that join_all call
from back in Listing 17-23. We originally tried to move the futures produced by
async blocks into a Vec<Box<dyn Future<Output = ()>>>, but as weâve seen,
those futures may have internal references, so they donât automatically
implement Unpin. Once we pin them, we can pass the resulting Pin type into
the Vec, confident that the underlying data in the futures will not be
moved. Listing 17-24 shows how to fix the code by calling the pin! macro
where each of the three futures are defined and adjusting the trait object type.
This example now compiles and runs, and we could add or remove futures from the vector at runtime and join them all.
Pin and Unpin are mostly important for building lower-level libraries, or
when youâre building a runtime itself, rather than for day-to-day Rust code.
When you see these traits in error messages, though, now youâll have a better
idea of how to fix your code!
The Stream Trait
Now that you have a deeper grasp on the Future, Pin, and Unpin traits, we
can turn our attention to the Stream trait. As you learned earlier in the
chapter, streams are similar to asynchronous iterators. Unlike Iterator and
Future, however, Stream has no definition in the standard library as of
this writing, but there is a very common definition from the futures crate
used throughout the ecosystem.
Letâs review the definitions of the Iterator and Future traits before
looking at how a Stream trait might merge them together. From Iterator, we
have the idea of a sequence: its next method provides an
Option<Self::Item>. From Future, we have the idea of readiness over time:
its poll method provides a Poll<Self::Output>. To represent a sequence of
items that become ready over time, we define a Stream trait that puts those
features together:
#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::task::{Context, Poll};
trait Stream {
type Item;
fn poll_next(
self: Pin<&mut Self>,
cx: &mut Context<'_>
) -> Poll<Option<Self::Item>>;
}
}
The Stream trait defines an associated type called Item for the type of the
items produced by the stream. This is similar to Iterator, where there may be
zero to many items, and unlike Future, where there is always a single
Output, even if itâs the unit type ().
Stream also defines a method to get those items. We call it poll_next, to
make it clear that it polls in the same way Future::poll does and produces a
sequence of items in the same way Iterator::next does. Its return type
combines Poll with Option. The outer type is Poll, because it has to be
checked for readiness, just as a future does. The inner type is Option,
because it needs to signal whether there are more messages, just as an iterator
does.
Something very similar to this definition will likely end up as part of Rustâs standard library. In the meantime, itâs part of the toolkit of most runtimes, so you can rely on it, and everything we cover next should generally apply!
In the examples we saw in the âStreams: Futures in Sequenceâ section, though, we didnât use poll_next or Stream, but
instead used next and StreamExt. We could work directly in terms of the
poll_next API by hand-writing our own Stream state machines, of course,
just as we could work with futures directly via their poll method. Using
await is much nicer, though, and the StreamExt trait supplies the next
method so we can do just that:
#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::task::{Context, Poll};
trait Stream {
type Item;
fn poll_next(
self: Pin<&mut Self>,
cx: &mut Context<'_>,
) -> Poll<Option<Self::Item>>;
}
trait StreamExt: Stream {
async fn next(&mut self) -> Option<Self::Item>
where
Self: Unpin;
// other methods...
}
}
The StreamExt trait is also the home of all the interesting methods available
to use with streams. StreamExt is automatically implemented for every type
that implements Stream, but these traits are defined separately to enable the
community to iterate on convenience APIs without affecting the foundational
trait.
In the version of StreamExt used in the trpl crate, the trait not only
defines the next method but also supplies a default implementation of next
that correctly handles the details of calling Stream::poll_next. This means
that even when you need to write your own streaming data type, you only have
to implement Stream, and then anyone who uses your data type can use
StreamExt and its methods with it automatically.
Thatâs all weâre going to cover for the lower-level details on these traits. To wrap up, letâs consider how futures (including streams), tasks, and threads all fit together!