Async Rust Notes
The Future Trait
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
Why Pin<&mut Self>
What is pinning?
- Reference
- Moving: Copying the bytes of one value from one location to another.
- Unpinned by default: by default, the Rust compiler (or we) may move any value.
- Pinning: We say that a value has been pinned when it has been put into a state where it is guaranteed to remain located at the same place in memory from the time it is pinned until its drop is called.
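To make "moving" concrete, here is a minimal sketch that records a value's address before and after it is moved into a Box. The Big type and the addresses_around_move helper are made up for illustration.

```rust
// Hypothetical type, large enough that it is clearly stored in memory.
struct Big {
    bytes: [u8; 64],
}

/// Returns the address of a `Big` before and after moving it into a `Box`.
fn addresses_around_move() -> (usize, usize) {
    let on_stack = Big { bytes: [7; 64] };
    let before = &on_stack as *const Big as usize;

    // Moving: the bytes are copied from the stack slot to the heap allocation.
    let on_heap = Box::new(on_stack);
    let after = &*on_heap as *const Big as usize;

    assert_eq!(on_heap.bytes[0], 7); // same bytes, new location
    (before, after)
}

fn main() {
    let (before, after) = addresses_around_move();
    assert_ne!(before, after);
    println!("before: {before:#x}, after: {after:#x}");
}
```

Any pointer that was taken to the old location would now dangle, which is exactly what pinning is meant to rule out.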
What does Pin<Ptr> mean?
#[derive(Copy, Clone)]
pub struct Pin<Ptr> {
    pub __pointer: Ptr,
}

impl<Ptr: Deref> Pin<Ptr> { ... }
- Pinning is a promise that the value of type Self will not be moved from the moment Pin<&mut Self> is constructed until the value itself is dropped (not merely until the Pin<...> is dropped).
- Pinning does not change the behavior of the compiler. It only prevents misuse in safe code.
- Pinning is a contract with unsafe code.
- Nothing stops you from moving the value if you hold a mutable reference to it somewhere else! So one safe way to obtain a Pin<&mut Self> is to move the value into the Pin, e.g. with Box::pin(value). The returned Pin<Box<Self>> owns Self and ensures that Self is never moved again (call as_mut() on it to get a Pin<&mut Self>).
- Holding a mutable reference to the pinned value elsewhere is a source of unsafety, even after Pin<&mut Self> is dropped. Once the value is pinned, it is up to you to uphold the constraint forever; a mutable reference obtained after dropping Pin<&mut Self> passes the borrow checker and can still break the promise.
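The Box::pin route can be sketched as follows. Data, addr_of, pass_along and stays_put are hypothetical names; the point is only that moving the Pin<Box<Data>> moves the pointer while the pinned value keeps its heap address.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical address-sensitive type; `PhantomPinned` makes it `!Unpin`.
struct Data {
    value: u32,
    _pin: PhantomPinned,
}

/// Address of the pinned `Data` on the heap.
fn addr_of(p: &Pin<Box<Data>>) -> usize {
    &**p as *const Data as usize
}

/// Moving the `Pin<Box<Data>>` moves only the pointer, not the `Data`.
fn pass_along(p: Pin<Box<Data>>) -> Pin<Box<Data>> {
    p
}

fn stays_put() -> bool {
    let pinned = Box::pin(Data { value: 7, _pin: PhantomPinned });
    let before = addr_of(&pinned);
    let pinned = pass_along(pinned);
    let after = addr_of(&pinned);
    pinned.value == 7 && before == after
}

fn main() {
    assert!(stays_put());
}
```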
Pin<&mut Self> prevents misuse in safe code
Pin<&mut Self> disallows getting &mut Self where Self: !Unpin in safe code.
- Mark your Self with a field of std::marker::PhantomPinned
  use std::marker::PhantomPinned;

  #[derive(Default)]
  struct AddrTracker {
      prev_addr: Option<usize>,
      // remove auto-implemented `Unpin` bound to mark this type as having some
      // address-sensitive state. This is essential for our expected pinning
      // guarantees to work.
      _pin: PhantomPinned,
  }
- Getting &mut Self must be unsafe
  use std::pin::Pin;

  impl AddrTracker {
      fn check_for_move(self: Pin<&mut Self>) {
          let current_addr = &*self as *const Self as usize;
          match self.prev_addr {
              None => {
                  // SAFETY: we do not move out of self
                  let self_data_mut = unsafe { self.get_unchecked_mut() };
                  self_data_mut.prev_addr = Some(current_addr);
              },
              Some(prev_addr) => assert_eq!(prev_addr, current_addr),
          }
      }
  }
- See: https://doc.rust-lang.org/std/pin/#fixing-addrtracker
- See reasons why constructing Pin<&mut Self> is unsafe:
  - https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked
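For completeness, a Pin<&mut Self> can also be obtained without writing unsafe at the call site by using the std::pin::pin! macro (this assumes a toolchain where the macro is stable, Rust 1.68 or later). The sketch below reuses the AddrTracker type from above; tracked_twice is a made-up helper.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

#[derive(Default)]
struct AddrTracker {
    prev_addr: Option<usize>,
    _pin: PhantomPinned,
}

impl AddrTracker {
    fn check_for_move(self: Pin<&mut Self>) {
        let current_addr = &*self as *const Self as usize;
        match self.prev_addr {
            None => {
                // SAFETY: we only write a field; the value is not moved.
                let this = unsafe { self.get_unchecked_mut() };
                this.prev_addr = Some(current_addr);
            }
            Some(prev_addr) => assert_eq!(prev_addr, current_addr),
        }
    }
}

fn tracked_twice() -> bool {
    // `pin!` pins the value to this stack frame; the original binding is
    // consumed, so safe code can no longer move the value.
    let mut tracker = pin!(AddrTracker::default());
    tracker.as_mut().check_for_move(); // records the address
    tracker.as_mut().check_for_move(); // same address, assertion holds
    let seen = tracker.prev_addr.is_some();
    seen
}

fn main() {
    assert!(tracked_twice());
}
```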
Miscellaneous
fn check_for_move(self: Pin<&mut Self>) vs fn check_for_move(mut self: Pin<&mut Self>)
Note the mut placed before self.
However, there is basically no difference, because:
- Nothing inside self: Pin<&mut Self> can be mutated (the __pointer field is not mutable for users, either).
- Methods like get_unchecked_mut and map_unchecked_mut take self by value (they move self out), so they do not require mut self.
- The mut in mut self only means that you will mutate the local binding self inside the function body. Since the function consumes its argument either way, the mut does not matter to the caller.
Here self is nothing but a value of type Pin<&mut Self>, just like any other parameter.
Both self and mut self allow getting a mutable reference in unsafe code.
use std::marker::PhantomPinned;
use std::pin::Pin;

struct S {
    x: i32,
    _pin: PhantomPinned,
}

impl S {
    fn immutable_self(self: Pin<&mut Self>) {
        unsafe {
            self.get_unchecked_mut().x = 1;
        }
    }

    fn mut_self(mut self: Pin<&mut Self>) {
        // ^ Warning: variable does not need to be mutable
        unsafe {
            self.get_unchecked_mut().x = 1;
        }
    }
}
mut self appears in some tutorials, but I think it is not required.
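One case where the mut does matter is a body that reborrows the pin through Pin::as_mut, which takes &mut self on the Pin. A small sketch with a made-up Counter type (it assumes the std::pin::pin! macro is available):

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical `!Unpin` type used only for this illustration.
struct Counter {
    n: u32,
    _pin: PhantomPinned,
}

impl Counter {
    // Consumes the `Pin` once, so no `mut` is needed on the binding.
    fn bump(self: Pin<&mut Self>) {
        // SAFETY: only a field is written; the value is not moved.
        unsafe { self.get_unchecked_mut().n += 1 };
    }

    // Reborrows the `Pin` twice; `as_mut` needs `&mut Pin<..>`, hence `mut self`.
    fn bump_twice(mut self: Pin<&mut Self>) {
        self.as_mut().bump();
        self.as_mut().bump();
    }
}

fn run() -> u32 {
    let mut counter = pin!(Counter { n: 0, _pin: PhantomPinned });
    counter.as_mut().bump_twice();
    let n = counter.n;
    n
}

fn main() {
    assert_eq!(run(), 2);
}
```

Removing the mut from bump_twice makes the compiler reject the as_mut calls, while bump compiles either way.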
So, why does Future need self: Pin<&mut Self> instead of &mut self?
It has been answered many times. See: https://rust-lang.github.io/async-book/04_pinning/01_chapter.html#why-pinning
TLDR: a Future, since it is desugared from your code, contains self-references just like your sync code.
let a = 1;
let ref_a = &a; // `ref_a` is desugared into a field of the returned `Future`, making it self-referential.
do_something().await;
println!("a = {}", *ref_a);
Self-referencing is safe in sync code because a stack frame does not move while its function is running.
Self-referencing is unsafe in a Future (if we don't use Pin) because a Future is an ordinary value, often stored on the heap, and Rust does not forbid moving it between polls. Pin means Self is pinned before we enter the self-referencing code and stays pinned afterwards, so it is now safe to write ref_a = &a.
The Arc Pointer
Arc is short for "Atomically Reference Counted"
Arc<T> uses atomic operations for its reference count (RC).
T must be immutable (Arc<T> only gives shared &T access).
Arc<T> is Send if T is Send + Sync, and Arc<T> is Sync if T is Send + Sync.
This means that if T is not Sync or not Send, Arc<T> is neither Send nor Sync (that is, Arc<T> is not only not Sync but also not Send, so it becomes no more useful than Rc<T>).
use std::{cell::Cell, sync::Arc};

struct S {
    cell: Cell<i32>,
}

// Intentionally does not compile: `S` is not `Sync`, so `Arc<S>` is not `Send`.
fn demo() {
    let arc_s = Arc::new(S { cell: Cell::new(0) });
    fn is_sync<T: Sync>(_t: T) {}
    fn is_send<T: Send>(_t: T) {}
    is_send(arc_s);
    // ^ `Cell<i32>` cannot be shared between threads safely...
}
One way to understand this: if we want to access T through Arc<T> from different threads, T must support multi-threading as well as Arc's reference counter does. To argue it more rigorously, consider two cases.
If T is not Sync:
Assume Arc<T> is Send, and consider the following case:
// Hypothetical: the real compiler rejects this because `S` is not `Sync`.
let a = Arc::new(S { cell: Cell::new(0) });
let b = a.clone();
thread::spawn(move || {
    // `b` is sent here
    b
});
thread::spawn(move || {
    // `a` is sent here
    a
});
This is not safe, because a and b are handled by different threads while referring to the same T: !Sync. So Arc<T> cannot be Send.
Assume Arc<T> is Sync. That means &Arc<T> (which produces &T) can be shared among threads, but T cannot be shared since T: !Sync. So Arc<T> cannot be Sync.
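The positive side of the same rule can be checked at compile time. In this sketch assert_send_sync and count_in_threads are made-up helpers, and Mutex<usize> stands in for any T that is Send + Sync:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

fn count_in_threads(workers: usize) -> usize {
    assert_send_sync::<Arc<Mutex<usize>>>();
    // assert_send_sync::<Arc<std::cell::Cell<usize>>>(); // would not compile

    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own `Arc` pointing at the same `Mutex`.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(count_in_threads(4), 4);
}
```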
If T is not Send:
- See: https://stackoverflow.com/questions/41909811/why-does-arct-require-t-to-be-both-send-and-sync-in-order-to-be-send
- TLDR: Arc<T> might move the underlying T among threads in the following situations:
  - drop (the thread that drops the last Arc<T> also drops T)
  - try_unwrap (the thread holding the last Arc<T> takes T out by value)
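The try_unwrap case is easy to reproduce. In this sketch (the function name is made up) the String is created on one thread and ends up owned by another, which is only sound because String is Send:

```rust
use std::sync::Arc;
use std::thread;

fn unwrap_in_other_thread() -> String {
    let a = Arc::new(String::from("created in main"));
    let b = Arc::clone(&a);
    drop(a); // `b` is now the only reference

    thread::spawn(move || {
        // The worker holds the last `Arc`, so it takes ownership of the `String`.
        let s: String = Arc::try_unwrap(b).expect("last reference");
        s + ", unwrapped in worker"
    })
    .join()
    .unwrap()
}

fn main() {
    assert_eq!(unwrap_in_other_thread(), "created in main, unwrapped in worker");
}
```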
Notes on using Arc<T>:
Arc<T> does not grant you Send + Sync + 'static (which is generally desired in async Rust). You need to ensure that T itself is Send + Sync + 'static.
Arc<T> is generally used to hold "injected services" in your APIs.
- A plain static T is not really useful, since services cannot stay bitwise identical while shared by all threads. For example, a db service needs to maintain a mutating connection pool while providing a &self interface. In practice, frameworks only give us &self access to the context, so handling interior mutability is up to us.