if huge requested align, alloc_system heap::allocate on OS X returns unaligned values

While trying to isolate what kinds of preconditions the `Allocator` trait is going to require and post-conditions it ensures, I made a test program to explore what happens with our current pair of low-level allocators.

As far as I can tell so far, jemalloc always ensures that it never gives back an unaligned address.

But on OS X, the system allocator (linked via `extern crate alloc_system;`) can produce addresses that do not meet the alignment request, namely when you ask for an alignment greater than or equal to `1 << 32`. (Like I said, we're talking about _huge_ alignments here.)

(On Linux, for both jemalloc and the system allocator, I never observed a non-null address that failed to meet the alignment request. In other words, they handled absurdly large alignments "properly".)
### What to do about this?

I discuss this further in the Digression section below, but in summary, I think we should do _something_ about this.

It seems like an easy short-term solution is this: we should be able to predetermine which allocator + platform combinations fall victim to this problem, and for those cases, make the `allocate` (and `reallocate`, etc.) methods validate their results. If validation fails, then we can just free the original address and return null.
- The main problem with this is that it adds overhead to the allocation path.  (Admittedly a `%` (aka divide) and branch doesn't seem like much in the context of an allocation, but still...)
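
To make the proposal concrete, here is a minimal sketch of post-validation written against today's stable `std::alloc::GlobalAlloc` trait rather than the `heap::allocate` API this issue was filed against. The `Validated` wrapper and the `is_aligned` helper are hypothetical names invented for illustration; nothing like them exists in the standard library.

```rust
use std::alloc::{GlobalAlloc, Layout};

/// Hypothetical wrapper (not a real std type): forwards to an inner
/// allocator and rejects results that miss the requested alignment.
pub struct Validated<A>(pub A);

/// True when the address of `p` is a multiple of `align`.
/// `align` is assumed to be nonzero, which `Layout` guarantees.
pub fn is_aligned(p: *mut u8, align: usize) -> bool {
    (p as usize) % align == 0
}

unsafe impl<A: GlobalAlloc> GlobalAlloc for Validated<A> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let p = unsafe { self.0.alloc(layout) };
        if p.is_null() || is_aligned(p, layout.align()) {
            return p;
        }
        // The inner allocator handed back a misaligned block:
        // free it and report failure by returning null.
        unsafe { self.0.dealloc(p, layout) };
        std::ptr::null_mut()
    }

    unsafe fn dealloc(&self, p: *mut u8, layout: Layout) {
        unsafe { self.0.dealloc(p, layout) }
    }
}
```

Because the default `alloc_zeroed` and `realloc` methods of `GlobalAlloc` are implemented in terms of `alloc`, they pick up the same check in this sketch. A real implementation that forwards `realloc` to the inner allocator would need its own validation there.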

Anyway, if you're interested in alternative ideas, see the Digression/Discussion section below.
### Digression and/or Discussion

When Rust [originally merged](https://github.com/rust-lang/rust/commit/1b1ca6d5465ef4de12b1adf25cd4598f261c660d#diff-2020572c14fe23a6b11b55b5e113b4aaR36) the jemalloc support PR, it stated the following requirement on the `allocate` method:

```
alignment must be no larger than the largest supported page size on the platform.
```
- depending on what the phrase "the largest supported page size" is supposed to mean, perhaps even moderately sized values are not legal inputs for the alignment...
- (was that phrase supposed to be something like "allocated block size" rather than "page size" ? The notion of more than one memory page size on a given platform is not something I'm familiar with...)

... so arguably there has always been a precondition to not feed in ridiculously large values for alignment.   

However, even if that is the case, here are some things to consider:
#### Requirements should be checkable

If we want to continue to impose this precondition, then we _must_ provide the programmer with a way to query what the value of "the largest supported page size on the platform" actually _is_.

(I didn't see a way to do this via a couple of greps of the code base, but it doesn't help that I don't understand what is actually meant by the above phrase.)
- (I don't think the phrase is meant to denote the same thing as `::std::sys::os::page_size`; so I don't think that would be a way to query the actual value, though certainly it would provide a _bound_ that a programmer can use in the meantime...)
- IMO, if we were to add a method to query the value of "largest supported alignment", it should be part of the low-level allocator interface (see [RFC 1183](https://github.com/rust-lang/rfcs/blob/master/text/1183-swap-out-jemalloc.md)), since presumably the largest supported value would be connected to the allocator implementation.
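
As a rough illustration of what such a query could look like, here is a sketch of a trait method on the allocator. Everything in it is hypothetical: `AlignLimit`, `max_supported_align`, `supports`, and `FixedLimit` are names made up for this example and are not part of RFC 1183 or any existing Rust API.

```rust
use std::alloc::Layout;

/// Hypothetical extension to a low-level allocator interface.
pub trait AlignLimit {
    /// Largest alignment this allocator promises to honor.
    fn max_supported_align(&self) -> usize;

    /// Lets a caller check the precondition before allocating.
    fn supports(&self, layout: &Layout) -> bool {
        layout.align() <= self.max_supported_align()
    }
}

/// Stand-in allocator whose limit is supplied by the caller.
/// A real allocator would report a value tied to its implementation.
pub struct FixedLimit(pub usize);

impl AlignLimit for FixedLimit {
    fn max_supported_align(&self) -> usize {
        self.0
    }
}
```

With something like this, a caller could test `allocator.supports(&layout)` up front instead of discovering a misaligned result afterwards.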
#### Post-validation is not absurd

Given that at least _some_ alloc-crate + target combinations are not imposing the above requirement (in the sense that they return null when the given alignment is unsatisfiable), it seems somewhat reasonable to me to add the post-allocation check that I described above, as long as we do it in a conditionalized manner.
- Instead of conditionalizing based on the target OS, we might be able to write the injected code in a way where the compiler can optimize away the check when the given alignment is known and falls beneath some reasonable bound, like `align <= 0x1000`, working under the assumptions that all allocators will behave properly with such an input.
- (I suspect in a vast number of cases, the alignment is statically known, so this would probably be a fine solution.)
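
The bound-based variant could be sketched as follows, again using stable `std::alloc` rather than the API discussed in this issue. `alloc_checked` and `TRUSTED_ALIGN` are hypothetical names, and the claim that every allocator behaves for alignments up to `0x1000` is the assumption stated above, not something this code verifies.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Assumed bound below which all allocators are trusted to align correctly.
pub const TRUSTED_ALIGN: usize = 0x1000;

/// Allocates from `System`, validating the result only for large alignments.
///
/// # Safety
/// Same contract as `GlobalAlloc::alloc`: `layout` must have nonzero size.
#[inline]
pub unsafe fn alloc_checked(layout: Layout) -> *mut u8 {
    let p = unsafe { System.alloc(layout) };
    // If `layout.align()` is a constant at the call site and this function
    // is inlined, the optimizer can drop everything after this branch.
    if layout.align() <= TRUSTED_ALIGN || p.is_null() {
        return p;
    }
    // `Layout` guarantees a power-of-two alignment, so a mask can stand in
    // for the `%` in the check.
    if (p as usize) & (layout.align() - 1) != 0 {
        unsafe { System.dealloc(p, layout) };
        return std::ptr::null_mut();
    }
    p
}
```

Using a mask instead of a division also reduces the overhead on the paths where the check does run, though whether the branch is actually eliminated depends on inlining decisions the compiler is free to make.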

Also, the man page for `posix_memalign` on my version of OS X says nothing about an upper bound on the value for `alignment`. To me this indicates a _bug_ in OS X itself, which we can report and work around via post-validation in the short term.
-  (If we did conditionalize based on target OS, then longer term, I don't know how we would best deal with conditionally injecting the post-validation in question; perhaps the default `#[allocator]` crate linking could choose between the two variants depending on which version of OS X is targeted.)
### The Sample Program

Here is that sample program (you can toggle between the allocators with `--cfg alloc_system` or `--cfg alloc_jemalloc`). When you run it, look for output lines that end in `rem: <value>`; the remainder (the address divided by the alignment) is only printed when it is non-zero.

(Also, here's a [playpen](https://play.rust-lang.org/?gist=6fa7a76ab99380251920&version=nightly), though I repeat that the problematic behavior does not exhibit itself on the versions of Linux that I have tried.) (Updated to fit the new [global allocator API](https://github.com/rust-lang/rust/issues/27389).)

``` rust
#![feature(alloc, allocator_api)]
#![cfg_attr(alloc_jemalloc, feature(alloc_jemalloc))]
#![cfg_attr(alloc_system, feature(alloc_system))]
extern crate alloc;
#[cfg(alloc_jemalloc)]
extern crate alloc_jemalloc;
#[cfg(alloc_system)]
extern crate alloc_system;
use std::{isize, mem};
const PRINT_ALL: bool = false;
fn main() {
    use std::heap::{Alloc, System, Layout};
    unsafe {
        for i in 0 .. mem::size_of::<usize>()*8 {
            let (mut v, a) = (Vec::new(), 1 << i);
            let try_alloc = |j, s, a| {
                let p = System.alloc(Layout::from_size_align(s, a).unwrap());
                if let Ok(p) = p {
                    let q = p as usize;
                    let rem = q % a;
                    if PRINT_ALL || rem != 0 {
                        println!("(i,j,s):{ijs:>30} a:{a:>8}  =>  p:{p:20?}, rem: 0x{rem:x}",
                                 ijs=format!("({},{},0x{:x})", i,j,s),
                                 a=format!("1<<{}", i),
                                 p=p,
                                 rem=rem);
                    }
                } else {
                    println!("(i,j,s):{ijs:>30} a:{a:>8}  =>  alloc fail",
                             ijs=format!("({},{},0x{:x})", i,j,s),
                             a=format!("1<<{}", i));
                }
                p
            };
            {
                let mut alloc_loop = |init_s, start, limit| {
                    let mut s = init_s;
                    for j in start .. limit {
                        if s > isize::MAX as usize {
                            println!("breaking b/c s={} > isize::MAX={}", s, isize::MAX);
                            break;
                        }
                        let p = try_alloc(j, s, a);
                        if let Ok(p) = p { v.push((p,s,a)); } else { break; }
                        s += j;
                    }
                };

                if i >= 8*mem::size_of::<usize>() { break; }
                alloc_loop(10, 0, 10);
                alloc_loop(a, 10, 20);
            }

            for (p,s,a) in v { System.dealloc(p, Layout::from_size_align(s, a).unwrap()); }
        }
    }
}
```
#### Output on Mac OS X showing the erroneous value(s)

See the following gist: https://gist.github.com/pnkfelix/535e8d4e810025331c77

