Can allocations have gaps? Can they be partially deallocated? Can they grow "in-place"?

Rust itself can only create contiguous allocations that can only be deallocated all at once. But does the compiler get to assume that *all* allocations, including those generated by 3rd party allocators, behave like that?

This does have impact on optimizations:
- for a `ptr: *const u8`, if we see `ptr.read()` and `ptr.wrapping_offset(x).read()`, can we conclude that all memory at offset `0..x` is dereferenceable?
- if we have a `x: &Cell<T>`, then call `unknown_fn()` (which might deallocate the memory `x` points to), and then at least one byte of x is read again -- can we conclude that all of `x` (`0..size_of::<T>()`) is still dereferenceable?

One major example of an API that might violate this is `mmap`. There is a fairly clean way to model `mmap`: we say there is one (exposed) provenance (one allocation) for all mmap'ed memory, and with `mmap`/`munmap` ranges of memory become dereferenceable with that provenance. But clearly the "allocation" that represents all `mmap`ed memory can have gaps, and it can be partially deallocated! If we say that the compiler can assume that allocations never have gaps, then the following is UB:
- assume that no memory is mapped at X .. (X+4*PAGE_SIZE)
- map a page at address X
- map another page at address X+2*PAGE_SIZE
- call the following function with `X as *const u8`:
```rust
fn break(ptr: *const u8) {
  ptr.read();
  ptr.wrapping_add(2*PAGE_SIZE).read();
}
```
This code must be UB because under the "no gap" assumption, the compiler is allowed to insert another read at `ptr.add(PAGE_SIZE)`, but that address is not mapped so we'd get a pagefault.

So this leads to some questions:
- Is the sequence of actions described above permitted? IMO the answer must be yes; making this UB is just too much of a footgun. Where does the UB even come from? `wrapping_add` is safe, so somehow the 2nd `read` is UB? But that memory was mmap'ed!
- If that sequence of actions is permitted -- I think that implies we do *not* have a "no gap" assumption. At least we certainly don't get the optimizations we'd usually get from that assumption, as the example above shows.
- What's LLVM's stance on this? Does LLVM do anything that assumes "no gaps"? Should we propose to add something to the LangRef that explicitly says gaps can happen?

Note that "mmap works in addresses, there is no provenance" does not help: in `break`, `ptr` has some provenance, and no matter where it comes from -- if we say "no gaps", no one provenance can be valid for both of these reads with an unmapped page in between them.

For the "no partial deallocation" assumption, the situation is similar -- we can also write `mmap`-based code that violates it:
- map a page at address X, and another one at X+PAGE_SIZE
- call the following function with `X as *const u8`:
```rust
fn break2(ptr: *const u8) {
  let _val = ptr.cast::<[u8; 2*PAGE_SIZE]>().read();
  deallocate_second_page(); // unmap the page at X+PAGE_SIZE
  let _val = ptr.read();
}
```
Is the compiler allowed to insert a speculative `ptr.cast::<[u8; 2*PAGE_SIZE]>().read()` at the end? If yes, this would segfault, so somehow the original code must have UB. I don't think it can reasonably have UB, and therefore we have to accept that partial deallocation is a thing.

For allocations growing in-place, if the compiler sees the `malloc` it might exploit that the geometry of the allocation cannot change, so even after calling unknown code an access outside the of the allocation per its initial size could be optimized away as UB -- but if the unknown code was actually growing the allocation to make the access in-bounds, that would be a wrong optimization.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Can allocations have gaps? Can they be partially deallocated? Can they grow "in-place"? #430

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Can allocations have gaps? Can they be partially deallocated? Can they grow "in-place"? #430

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions