Description
While trying to isolate what kinds of preconditions the Allocator
trait is going to require and post-conditions it ensures, I made a test program to explore what happens with our current pair of low-level allocators.
As far as I can tell so far, jemalloc always ensures that it never gives back an unaligned address.
but on OS X, the system allocator (linked via extern crate alloc_system;
) can produce addresses that do not meet the alignment request, namely if you ask for values with starting addresses of alignment greater than or equal to 1 << 32
. (Like i said, we're talking about huge alignments here.)
(on Linux, for both jemalloc and the system allocator, I never observed either returning non-null addresses that did not meet the alignment request. In other words, they handled absurdly large alignments "properly")
What to do about this?
I talk a little about this further in my Digression section below, but in summary, I think we should do something about this.
It seems like an easy short term solution is this: We should be able to pre-determine which allocator + platform combinations fall victim to this problem, and for those cases, make the allocate
(and reallocate
, etc) methods validate their results. If validation fails, then we can just free the original address and return null.
- The main problem with this is that it adds overhead to the allocation path. (Admittedly a
%
(aka divide) and branch doesn't seem like much in the context of an allocation, but still...)
Anyway, if you're interested in alternative ideas, see the Digression/Discussion section below
Digression and/or Discussion
When Rust originally merged the jemalloc support PR, it stated the following requirement on the allocate
method:
alignment must be no larger than the largest supported page size on the platform.
- depending on what the phrase "the largest supported page size" is supposed to mean, perhaps even moderately sized values are not legal inputs for the alignment...
- (was that phrase supposed to be something like "allocated block size" rather than "page size" ? The notion of more than one memory page size on a given platform is not something I'm familiar with...)
... so arguably there has always been a precondition to not feed in ridiculously large values for alignment.
However, even if that is the case, here are some things to consider
Requirements should be checkable
if we want to continue to impose this precondition, then we must provide the programmer with a way to query what the value of "the largest supported page size on the platform" actually is.
(I didn't see a way to do this via a couple of greps of the code base, but it doesn't help that I don't understand what is actually meant by the above phrase.)
- (I don't think the phrase is meant to denote the same thing as
::std::sys::os::page_size
; so I don't think that would be a way to query the actual value, though certainly it would provide a bound that a programmer can use in the meantime...) - IMO, if we were to add a method to query the value of "largest supported alignment", it should be part of the low-level allocator interface (see RFC 1183), since presumably the largest supported value would be connected to the allocator implementation.
Post-validation is not absurd
Given that at least some alloc-crate + target combinations are not imposing the above requirement (in the sense that they return null when the given alignment is unsatisfiable), it seems somewhat reasonable to me to add the post-allocation check that I described above, as long as we do it in a conditionalized manner.
- Instead of conditionalizing based on the target OS, we might be able to write the injected code in a way where the compiler can optimize away the check when the given alignment is known and falls beneath some reasonable bound, like
align <= 0x1000
, working under the assumptions that all allocators will behave properly with such an input. - (I suspect in a vast number of cases, the alignment is statically known, so this would probably be a fine solution.)
Also, the man page for posix_memalign
on my version of OS X says nothing about an upper bound on the value for alignment
. This to me indicates that this is a bug in OS X itself, which we can report and workaround via post-validation in the short term.
- (If we did conditionalize based on target OS, then longer term, I don't know how we would best deal with trying to conditionally inject the post-validation in question; perhaps the default
#[allocator]
crate linking could choose between the two variants depending on which version of OS X is targetted.)
The Sample Program
Here is that sample program (you can toggle the different allocates with --cfg alloc_system
or --cfg alloc_jemalloc
). When you run it, the thing to look for is when there is an output line that has rem: <value>
on the end, which is only printed for non-zero remainder (when one divides the address by the alignment).
(Also, here's a playpen, though I repeat that the problematic behavior does not exhibit itself on versions of Linux that I have tried.) ((Updated to fit new global allocator API))
#![feature(alloc, allocator_api)]
#![cfg_attr(alloc_jemalloc, feature(alloc_jemalloc))]
#![cfg_attr(alloc_system, feature(alloc_system))]
extern crate alloc;
#[cfg(alloc_jemalloc)]
extern crate alloc_jemalloc;
#[cfg(alloc_system)]
extern crate alloc_system;
use std::{isize, mem};
const PRINT_ALL: bool = false;
fn main() {
use std::heap::{Alloc, System, Layout};
unsafe {
for i in 0 .. mem::size_of::<usize>()*8 {
let (mut v, a) = (Vec::new(), 1 << i);
let try_alloc = |j, s, a| {
let p = System.alloc(Layout::from_size_align(s, a).unwrap());
if let Ok(p) = p {
let q = p as usize;
let rem = q % a;
if PRINT_ALL || rem != 0 {
println!("(i,j,s):{ijs:>30} a:{a:>8} => p:{p:20?}, rem: 0x{rem:x}",
ijs=format!("({},{},0x{:x})", i,j,s),
a=format!("1<<{}", i),
p=p,
rem=rem);
}
} else {
println!("(i,j,s):{ijs:>30} a:{a:>8} => alloc fail",
ijs=format!("({},{},0x{:x})", i,j,s),
a=format!("1<<{}", i));
}
p
};
{
let mut alloc_loop = |init_s, start, limit| {
let mut s = init_s;
for j in start .. limit {
if s > isize::MAX as usize {
println!("breaking b/c s={} > isize::MAX={}", s, isize::MAX);
break;
}
let p = try_alloc(j, s, a);
if let Ok(p) = p { v.push((p,s,a)); } else { break; }
s += j;
}
};
if i >= 8*mem::size_of::<usize>() { break; }
alloc_loop(10, 0, 10);
alloc_loop(a, 10, 20);
}
for (p,s,a) in v { System.dealloc(p, Layout::from_size_align(s, a).unwrap()); }
}
}
}
Output on Mac OS X showing the erroneous value(s)
See following gist: https://gist.github.com/pnkfelix/535e8d4e810025331c77