Description
Sorry if this is a duplicate, but I like the "keywords" I've showcased in this issue. Other related issues:
- Validity of pointers and references to memory not allocated by the compiler #285
- Could we have Rust without provenance? #287
- Storing an object as &Header, but reading the data past the end of the header #256 <- this is a pervasive instance of that.
- What about: "container_of"-style pointer arithmetic? #243
- Stacked Borrows: raw pointer usable only for
T
too strict? #134 <- canonical issue? - Provenance #52
Note that the quite loaded term "provenance" is being used here as described mainly in #134.
Unleaking
The stdlib libs docs currently state, regarding Box::leak
:
Dropping the returned reference [return value of
Box::leak()
] will cause a memory leak. If this is not acceptable, the reference should first be wrapped with theBox::from_raw
function producing aBox
. ThisBox
can then be dropped which will properly destroyT
and release the allocated memory.
So, even if there is no code snippet, such statement is stating that:
fn drop_box<T> (boxed: Box<T>)
{
drop(unsafe { Box::from_raw(Box::leak(boxed)) });
}
is sound, no matter the alloc::Global
backing it.
-
A far-fetched / contrived generalization to any
impl Allocator
fn drop_box_in<T, A : Allocator> (boxed: Box<T, A>) { let (mut ptr, alloc) = Box::into_raw_with_allocator(x); unsafe { ptr = &mut *ptr; drop(Box::from_raw_in(ptr, alloc)); } }
Which, given Box
's implementation, is assuming that if somebody asks an impl GlobalAlloc
—or an impl Alloc
if generalizing— memory for a Layout::new::<T>()
(through alloc
or realloc
), and gets back a non-null pointer ptr
, then it is then legal to give back ptr
to that impl Alloc
's dealloc
(or realloc
), but with ptr
's provenance having been "shrunk" down to that T
's layout (e.g., through ptr = <*mut _>::cast(&mut *ptr.cast::<MaybeUninit<T>>());
).
This, in practice, can be quite problematic for many (most?) GlobalAlloc
implementations out there, since they do often perform some bookkeeping and whatnot laid out contiguously to the yielded allocation, and such metadata would thus not be accessible from such a returned pointer alone: the allocator would thus need to keep some extra data / state to be able to get back provenance over the user-facing allocation and the contiguous metadata.
A simplified example
use ::std::{
alloc::{self, GlobalAlloc as _, Layout},
mem::drop as stuff,
ptr,
};
unsafe trait AllocI32 {
fn alloc_i32() -> Option<ptr::NonNull<i32>>;
unsafe
fn dealloc(_: ptr::NonNull<i32>);
}
struct MyAllocUsingGlobalAllocUnderTheHood;
#[repr(C)]
struct I32AndMeta {
alloc: i32,
meta: u8,
}
static SYSTEM_ALLOC: alloc::System = alloc::System;
unsafe impl AllocI32 for MyAllocUsingGlobalAllocUnderTheHood {
fn alloc_i32() -> Option<ptr::NonNull<i32>> {
let layout = Layout::new::<I32AndMeta>();
ptr::NonNull::new(unsafe { SYSTEM_ALLOC.alloc_zeroed(layout) })
.map(ptr::NonNull::cast)
}
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();
stuff(meta);
SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout);
}
}
The interesting lines here are:
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();
if ptr
were to stem from &i32
(e.g., let r: &i32 = …; dealloc(r.into());
), even if that &i32
had originated from an alloc()
-yielded ptr (let r: &i32 = alloc().unwrap().as_ref();
), then the operation reading meta
would not be well-defined: r.add(4)
would yield an off-by-one pointer, which would not be usable to perform a read-dereference with.
The two possible workarounds
In a world without any abstraction whatsoever, the answer to this problem is easy: keep a pointer with provenance over that allocated I32AndMeta
around (such as the ptr
returned by alloc
itself), and use it to "launder" the received ptr. But since there is this Alloc
/ GlobalAlloc
boundary, the question remains: who should be responsible for doing this?
-
Would it be the
Alloc
ator, as in:unsafe fn dealloc(ptr: ptr::NonNull<i32>) { let layout = Layout::new::<I32AndMeta>(); let ptr = ptr.as_ptr().cast::<u8>(); + let ptr_with_provenance: *mut u8 = Self::get_ptr_with_provenance(…); + // amend `ptr` so that it points to the same place, but with a fixed provenance + let ptr = <*mut u8>::wrapping_offset( + ptr_with_provenance, + isize::wrapping_sub( + ptr.as_ptr() as _, + ptr_with_provenance as _, + ), + ); let meta = ptr.add(4).read(); // <- this read is now well-defined! stuff(meta); SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout); }
-
or would it be the user of the
Alloc
ator, by declaring "unleaking" to be a contract violation / by requiring that a pointer with the originally-obtained provenance be the only valid input for a{de,re}alloc
call?pub struct MutRefWithOriginalProvenance<'lt, T> { ptr: ptr::NonNull<T>, _phantom: PhantomData<&'lt mut T>, } impl<T> Deref{,Mut} for … { type Target = T; … } impl… MutRefWithOriginalProvenance… { pub fn into_raw… }
let mut r: MutRefWithOriginalProvenance<'static, i32> = Box::reversible_leak(Box::new(42)); *r += 27; unsafe { drop(Box::from_raw(r.into_raw())); }
This point ought to be clarified, and if going for the latter —or until confirming the former—, then the stdlib docs should be updated to actually disincentivize unleaking.
My potentially-obvious two cents
It feels like the "legalized unleaking" approach has the drawback of requiring that extra get_ptr_with_provenance(…)
operation, which could come with a non-negligible cost for all GlobalAlloc
implementations, only to allow a potentially deemed niche "unleaking" operation.
But it also feels like "forbidden unleaking" approach is quite a footgun, if, for instance, even the stdlib docs have gotten it wrong for such a long time.
So this seems like the classic "let's gauge/measure the performance benefits of 'forbidden unleaking' / the performance cost of 'legalized unleaking'" to compare them against the footguns that forbidding it introduces.
Finally, and this is technically beyond Rust's reach, there is also the question of non-Rust pervasive implementations of GlobalAlloc
, such as that of libc
(malloc
, calloc
, realloc
, free
) and whatnot. Such implementations do use metadata, and according to @chorman0773, the cost of a get_ptr_with_provenance(…)
operation would be very much non-negligible (and, technically, even more so since Rust cannot go and tweak such an implementation, and would thus have to wrap it in a black-box API kind of fashion).
-
So, from the looks of it / IIUC, the only practical approach w.r.t. a legalized unleaking would be to ban
malloc
& friends from being used forGlobalAlloc
implementations! But I may very well be wrong; I'll let @chorman0773 (and others) chime in and clarify this hypothetical point (although if this were to be true, then I guess there is really no other choice than forbidding unleaking).- This, in turn, could be deemed inconvenient for FFI users which would have chosen a
malloc
-powered#[global_allocator]
to allowfree
-ing to occur from the C side, but this is yet another topic…
- This, in turn, could be deemed inconvenient for FFI users which would have chosen a