Async Rust Notes
The Future Trait
    pub trait Future {
        type Output;
        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
    }
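To make the trait concrete, here is a minimal sketch of a hand-written future driven by a busy-polling loop. `CountDown`, `NoopWake` and `poll_to_completion` are hypothetical names invented for this note; a real executor would park instead of spinning and would rely on the waker to know when to poll again.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: pending for `remaining` polls, then ready.
struct CountDown {
    remaining: u32,
}

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            // `CountDown` is `Unpin`, so safe field access through the `Pin` works.
            self.remaining -= 1;
            // Ask to be polled again; a real future would wake on an actual event.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing, which is enough for a busy-polling loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` until it is ready; returns the output and the number of polls.
fn poll_to_completion<F: Future + Unpin>(mut fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

With `CountDown { remaining: 2 }` the loop polls three times: two `Pending` results followed by `Ready("done")`.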
Why `Pin<&mut Self>`

What is pinning?
- Reference
- Moving: Copying the bytes of one value from one location to another.
- Unpinned by default: The Rust compiler (and our own code) is allowed to move values by default.
- Pinning: We say that a value has been pinned when it has been put into a state where it is guaranteed to remain located at the same place in memory from the time it is pinned until its drop is called.
What does `Pin<Ptr>` mean?

    #[derive(Copy, Clone)]
    pub struct Pin<Ptr> {
        pub __pointer: Ptr,
    }

    impl<Ptr: Deref> Pin<Ptr> { ... }

- Pinning is a promise that we will not move the value of type `Self` from the moment `Pin<&mut Self>` is constructed until the value is dropped (not merely until the `Pin<...>` is dropped).
- Pinning does not change the behavior of the compiler. However, it prevents misuse in safe code.
- Pinning is a contract with the unsafe code.
- Nothing stops you from moving the value if you hold a mutable reference to it somewhere else! So one safe way to obtain a `Pin<&mut Self>` is to move the value into the `Pin`, e.g. using `Box::pin(value)`. The returned `Pin<Box<Self>>` owns `Self` and ensures that `Self` is not moved anymore; call `as_mut()` on it to get a `Pin<&mut Self>`.
- Having a mutable reference to the pinned value elsewhere is a source of unsafety, even after the `Pin<&mut Self>` is dropped! Remember that once the value is pinned, it is up to you to uphold the constraint forever: a mutable reference obtained after dropping the `Pin<&mut Self>` passes the borrow checker yet can still break the promise.
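The `Box::pin` route can be sketched as follows. `Unmovable` and `heap_address_survives_moves` are hypothetical names for illustration; the point is only that moving the `Pin<Box<_>>` handle moves a pointer, while the pinned value itself stays where it is on the heap.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical address-sensitive type; `PhantomPinned` makes it `!Unpin`.
struct Unmovable {
    data: u32,
    _pin: PhantomPinned,
}

/// Returns true if the pinned value keeps its address while its owner moves.
fn heap_address_survives_moves() -> bool {
    let pinned: Pin<Box<Unmovable>> = Box::pin(Unmovable {
        data: 7,
        _pin: PhantomPinned,
    });
    let before = &*pinned as *const Unmovable as usize;

    // Moving the `Pin<Box<_>>` only moves the pointer, never the pointee.
    let moved = pinned;
    let holder = vec![moved];

    let after = &*holder[0] as *const Unmovable as usize;
    before == after && holder[0].data == 7
}
```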
`Pin<&mut Self>` prevents misuse in safe code

`Pin<&mut Self>` disallows getting `&mut Self` where `Self: !Unpin` in safe code.

- Mark your `Self` with a field of `std::marker::PhantomPinned`

      use std::marker::PhantomPinned;

      #[derive(Default)]
      struct AddrTracker {
          prev_addr: Option<usize>,
          // remove auto-implemented `Unpin` bound to mark this type as having some
          // address-sensitive state. This is essential for our expected pinning
          // guarantees to work, and is discussed more below.
          _pin: PhantomPinned,
      }

- Getting `&mut Self` must be unsafe

      use std::pin::Pin;

      impl AddrTracker {
          fn check_for_move(self: Pin<&mut Self>) {
              let current_addr = &*self as *const Self as usize;
              match self.prev_addr {
                  None => {
                      // SAFETY: we do not move out of self
                      let self_data_mut = unsafe { self.get_unchecked_mut() };
                      self_data_mut.prev_addr = Some(current_addr);
                  },
                  Some(prev_addr) => assert_eq!(prev_addr, current_addr),
              }
          }
      }

- See: https://doc.rust-lang.org/std/pin/#fixing-addrtracker
- See reasons why constructing `Pin<&mut Self>` is unsafe: https://doc.rust-lang.org/std/pin/struct.Pin.html#method.new_unchecked
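For contrast, here is a small sketch of what safe code is allowed to do when the target is `Unpin`. The function name `swap_through_pins` is made up for this note. Both `Pin::new` and `Pin::get_mut` are safe, but they require the pointee to be `Unpin`, so the same code is rejected by the compiler for a type that contains `PhantomPinned`.

```rust
use std::pin::Pin;

/// Swaps two `u32`s through their `Pin`s; this compiles because `u32: Unpin`.
fn swap_through_pins() -> (u32, u32) {
    let mut a = 1u32;
    let mut b = 2u32;
    {
        let pin_a: Pin<&mut u32> = Pin::new(&mut a);
        let pin_b: Pin<&mut u32> = Pin::new(&mut b);
        // `get_mut` hands back `&mut u32`, so `mem::swap` can move the values.
        // For a `!Unpin` type neither `Pin::new` nor `get_mut` would compile,
        // which is exactly how safe code is kept from moving pinned values.
        std::mem::swap(pin_a.get_mut(), pin_b.get_mut());
    }
    (a, b)
}
```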
Miscellaneous
`fn check_for_move(self: Pin<&mut Self>)` vs `fn check_for_move(mut self: Pin<&mut Self>)`

Note the `mut` placed before `self`. However, there are basically no differences because:

- Nothing inside `self: Pin<&mut Self>` can be mutated (the `__pointer` field is not mutable for users, either).
- Methods like `get_unchecked_mut` and `map_unchecked_mut` take `self` by value, so they do not require `mut self`.
- The `mut` in `mut self` only means that you will mutate the local binding `self` inside the function body. Since the function consumes its input, the `mut` does not matter to the caller anyway.

Here `self` is nothing but a value of type `Pin<&mut Self>`, just like any other parameter. Both `self` and `mut self` allow getting a mutable reference in unsafe code.

    use std::marker::PhantomPinned;
    use std::pin::Pin;

    struct S {
        x: i32,
        _pin: PhantomPinned,
    }

    impl S {
        fn immutable_self(self: Pin<&mut Self>) {
            unsafe {
                self.get_unchecked_mut().x = 1;
            }
        }

        fn mut_self(mut self: Pin<&mut Self>) {
            // ^ Warning: variable does not need to be mutable
            unsafe {
                self.get_unchecked_mut().x = 1;
            }
        }
    }

`mut self` appears in some tutorials, but I think it is not required here. It is only needed when the body calls a method that borrows the `Pin` itself mutably, such as `Pin::as_mut` or `Pin::set`.
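One case where the `mut` does matter is reborrowing. The sketch below uses a hypothetical `Counter` type: `bump` consumes its `Pin<&mut Self>`, so a method that wants to call it twice has to reborrow with `as_mut()`, and `Pin::as_mut` takes `&mut self`, which requires the binding to be `mut`.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical `!Unpin` type used only for this illustration.
struct Counter {
    n: u32,
    _pin: PhantomPinned,
}

impl Counter {
    /// Consumes the `Pin<&mut Self>`; no `mut` binding is needed.
    fn bump(self: Pin<&mut Self>) {
        // SAFETY: we only write a field and never move out of `self`.
        unsafe { self.get_unchecked_mut().n += 1 };
    }

    /// Needs `mut self` because `as_mut` borrows the `Pin` itself mutably.
    fn bump_twice(mut self: Pin<&mut Self>) {
        self.as_mut().bump(); // reborrow, `self` stays usable
        self.bump(); // consumes `self`
    }
}

fn run_counter() -> u32 {
    let mut counter = Box::pin(Counter {
        n: 0,
        _pin: PhantomPinned,
    });
    counter.as_mut().bump_twice();
    counter.n
}
```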
So, why does `Future` need `self: Pin<&mut Self>` instead of `&mut self`?

It has been answered many times. See: https://rust-lang.github.io/async-book/04_pinning/01_chapter.html#why-pinning

TLDR: a `Future`, since it is desugared from your code, contains self-references just like your sync code.

    let a = 1;
    let ref_a = &a;
    // `ref_a` is desugared to be a field in your returned `Future`, thus self-referencing.
    do_something().await;
    println!("a = {}", *ref_a);

Self-referencing is safe in sync code because the stack does not move.

Self-referencing is unsafe (if we don't use `Pin`) because a `Future` is an ordinary value that is often moved, for example onto the heap when it is boxed or handed to an executor, and Rust doesn't forbid moving it between polls. `Pin` means `Self` is pinned at least before we enter our self-referencing code (that is, before the first `poll`), thus it is now safe to do `ref_a = &a`.
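The self-reference can be imitated by hand. The sketch below is a stand-in for the compiler-generated state machine, not what rustc actually emits: `SelfRef` is a hypothetical struct whose `ref_a` field points at its own `a` field, and it is only initialized after the value has been pinned on the heap.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical stand-in for a future that holds `a` and a reference to it.
struct SelfRef {
    a: i32,
    ref_a: *const i32, // points at `self.a` once initialized
    _pin: PhantomPinned,
}

impl SelfRef {
    fn new(a: i32) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            a,
            ref_a: std::ptr::null(),
            _pin: PhantomPinned,
        });
        // SAFETY: we only write a field; the pinned value is never moved.
        let this = unsafe { boxed.as_mut().get_unchecked_mut() };
        this.ref_a = &this.a;
        boxed
    }

    fn read_through_ref(self: Pin<&Self>) -> i32 {
        // SAFETY: `ref_a` points at `self.a`, which cannot move while pinned.
        unsafe { *self.ref_a }
    }
}

fn read_after_moving_handle() -> i32 {
    let pinned = SelfRef::new(1);
    // Moving the `Pin<Box<_>>` handle does not move the `SelfRef` itself,
    // so the internal pointer stays valid.
    let moved = pinned;
    moved.as_ref().read_through_ref()
}
```

Had `SelfRef` been moved by value after `ref_a` was set, the pointer would still refer to the old location, which is the failure mode `Pin` rules out.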
The Arc Pointer
Arc is short for "Atomically Reference Counted"
- `Arc<T>` uses atomic operations for its reference count.
- `T` must be immutable: `Arc<T>` only gives out `&T`, so any mutation has to go through interior mutability.
- `Arc<T>` is `Send` if `T` is `Send + Sync`, and `Arc<T>` is `Sync` if `T` is `Send + Sync`.

This means that if `T` is not `Sync` or not `Send`, `Arc<T>` is neither `Send` nor `Sync` (it is not only not `Sync` but also not `Send`, so it becomes no more useful than `Rc<T>`).

    use std::{cell::Cell, sync::Arc};

    struct S {
        cell: Cell<i32>,
    }

    // Intentionally does not compile: `Cell<i32>` is not `Sync`.
    fn demo() {
        let arc_s = Arc::new(S { cell: Cell::new(0) });
        fn is_sync<T: Sync>(_t: T) {}
        fn is_send<T: Send>(_t: T) {}
        is_send(arc_s);
        // ^ `Cell<i32>` cannot be shared between threads safely...
    }

This can be understood as follows: if we want to access `T` through `Arc<T>` from different threads, we must expect `T` to support multi-threading as well as `Arc`'s reference counter does. To prove it more rigorously, consider the two cases below.

If `T` is not `Sync`:

- Assume `Arc<T>` is `Send`, and consider the following case:

      // Thought experiment: this would only compile if `Arc<S>` were `Send`.
      let a = Arc::new(S { cell: Cell::new(0) });
      let b = a.clone();
      thread::spawn(move || {
          // `b` is sent here
          b
      });
      thread::spawn(move || {
          // `a` is sent here
          a
      });

  This is not safe, because `a` and `b` are handled by different threads while pointing to the same `T: !Sync`. So `Arc<T>` cannot be `Send`.
- Assume `Arc<T>` is `Sync`. That means `&Arc<T>` (which produces `&T`) can be shared among threads, but `T` cannot be shared since `T: !Sync`. So `Arc<T>` cannot be `Sync` either.
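As a counterpart to the `Cell` example, here is a minimal sketch where the payload is `Sync`, so the `Arc` can be cloned into several threads. Swapping `Cell<i32>` for `AtomicI32` is my own substitution for illustration, and `assert_send_sync` and `shared_increment` are hypothetical helper names.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

/// Same shape as `S` above, but with a `Sync` field.
struct S {
    cell: AtomicI32,
}

/// Compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

/// Spawns `threads` threads that each increment the shared counter once.
fn shared_increment(threads: usize) -> i32 {
    // Compile-time check: `Arc<S>` is `Send + Sync` because `S` is.
    assert_send_sync::<Arc<S>>();

    let s = Arc::new(S {
        cell: AtomicI32::new(0),
    });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let s = Arc::clone(&s);
            thread::spawn(move || {
                s.cell.fetch_add(1, Ordering::Relaxed);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    s.cell.load(Ordering::Relaxed)
}
```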
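If `T` is not `Send`:

- See: https://stackoverflow.com/questions/41909811/why-does-arct-require-t-to-be-both-send-and-sync-in-order-to-be-send
- TLDR: `Arc<T>` might move the underlying `T` among threads in the following situations:
  - `drop` (the thread that drops the last `Arc` runs the destructor of `T`)
  - `try_unwrap` (the thread holding the last `Arc` receives the owned `T`)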
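The `drop` case can be observed directly. In this sketch, `Tracked` is a hypothetical type whose destructor records the thread it runs on; the value is created on the calling thread, but the last `Arc` is released on a spawned thread, so the destructor runs there.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, ThreadId};

/// Hypothetical type that records which thread ran its destructor.
struct Tracked {
    dropped_on: Arc<Mutex<Option<ThreadId>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        *self.dropped_on.lock().unwrap() = Some(thread::current().id());
    }
}

/// Returns true if `Tracked` was dropped on a thread other than the caller's.
fn dropped_on_other_thread() -> bool {
    let slot = Arc::new(Mutex::new(None));
    let value = Arc::new(Tracked {
        dropped_on: Arc::clone(&slot),
    });
    let clone = Arc::clone(&value);

    // The creating thread gives up its handle first...
    drop(value);
    // ...so the spawned thread holds the last `Arc` and runs `Tracked::drop`.
    thread::spawn(move || drop(clone)).join().unwrap();

    let dropped: Option<ThreadId> = *slot.lock().unwrap();
    dropped.expect("destructor should have run") != thread::current().id()
}
```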
Notes on using `Arc<T>`:

- `Arc<T>` does not give you `Send + Sync + 'static` for free (which is generally desired in async Rust). You need to ensure that `T` is `Send + Sync + 'static` by itself.
- `Arc<T>` is generally used to hold "injected services" for your APIs.
  - A plain static `T` is not really useful, since services cannot stay bitwise identical while shared by all threads. For example, if you are using a db service, it needs to maintain a mutating connection pool while providing a `&self` interface. In fact, frameworks only give us `&self` access to the context, so handling interior mutability is up to us.
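A rough sketch of such a service, under the assumption that connections can be modelled as plain ids: `ConnectionPool` and `serve_requests` are hypothetical names, and a `Mutex` provides the interior mutability behind the `&self` interface. Real pools add waiting, timeouts and health checks, none of which is shown here.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical service: connections are modelled as plain ids.
struct ConnectionPool {
    idle: Mutex<Vec<u32>>,
}

impl ConnectionPool {
    fn new(size: u32) -> Self {
        ConnectionPool {
            idle: Mutex::new((0..size).collect()),
        }
    }

    /// Takes a connection out of the pool, if one is idle.
    fn checkout(&self) -> Option<u32> {
        self.idle.lock().unwrap().pop()
    }

    /// Returns a connection to the pool.
    fn checkin(&self, id: u32) {
        self.idle.lock().unwrap().push(id);
    }

    fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

/// Simulates `requests` handlers sharing one pool through `Arc`.
fn serve_requests(requests: usize) -> usize {
    let pool = Arc::new(ConnectionPool::new(2));
    let handles: Vec<_> = (0..requests)
        .map(|_| {
            let pool = Arc::clone(&pool);
            thread::spawn(move || {
                if let Some(id) = pool.checkout() {
                    // ... use the connection here ...
                    pool.checkin(id);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Every checked-out connection has been returned by now.
    pool.idle_count()
}
```

Every method takes `&self`, yet the pool's contents change over time; `Arc` only shares the handle, and the `Mutex` is what makes the mutation sound.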