Skip to content

Improve liballoc_jemalloc {re}alloc_excess #45482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

arthurprs
Copy link
Contributor

It should now behave as expected.

cc #45434

@rust-highfive
Copy link
Contributor

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 24, 2017
ptr::write(err as *mut AllocErr,
AllocErr::Exhausted { request: layout });
} else {
*excess = nallocx(size as size_t, flags) as usize;
Copy link
Contributor

@gnzlbg gnzlbg Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it like this at first in #45512 but I think this is wrong. nallocx returns the total allocation size, so excess should be that - size. #45512 does it like that.

Copy link
Contributor Author

@arthurprs arthurprs Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that excess (despite the name) should have the entire capacity, not the extra.

    unsafe fn realloc_excess(&mut self,
                             ptr: *mut u8,
                             layout: Layout,
                             new_layout: Layout) -> Result<Excess, AllocErr> {
        let mut err = ManuallyDrop::new(mem::uninitialized::<AllocErr>());
        let mut size = 0;
        let ptr = __rust_realloc_excess(ptr,
                                        layout.size(),
                                        layout.align(),
                                        new_layout.size(),
                                        new_layout.align(),
                                        &mut size,
                                        &mut *err as *mut AllocErr as *mut u8);
        if ptr.is_null() {
            Err(ManuallyDrop::into_inner(err))
        } else {
            Ok(Excess(ptr, size))
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked the docs of Excess and you are correct, the usize there is the total capacity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just commented on the #32838 tracking issue about this, I found it really confusing. Not only in the Allocator trait, but also because it propagates back to the wrappers over the allocator functions like here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the classic naming problem.

AllocErr::Exhausted { request: layout },
);
} else {
*excess = nallocx(new_size, flags) as usize;
Copy link
Contributor

@gnzlbg gnzlbg Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this is wrong: nallocx returns the excess size of mallocx but you are using rallocx to allocate (and it doesn't return the excess size but the total size).

You can either use sdallocx(ptr, align) to get the real size allocated for pointer (which I think is more expensive than nallocx), or do like #45514 and use xallocx to allocate instead of rallocx which gives you the total size back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about salloc, I'll change the code to use that.

xallocx isn't general, it can "fail to grow" if the reallocation can't be done inplace, see https://github.com/jemalloc/jemalloc/blob/47203d5f422452def4cb29c0b7128cc068031100/test/integration/xallocx.c#L51

Copy link
Contributor

@gnzlbg gnzlbg Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xallocx isn't general, it can "fail to grow" if the reallocation can't be done inplace

Indeed, I've asked about this in jemalloc's issues earlier today: jemalloc/jemalloc#1059

Ideally what we would like to use is nallocx equivalent for rallocx, or a rallocx variant that returns the excess size. My understanding is that sdallocx is correct for all cases, but it can be expensive because it might traverse jemalloc internal data-structures to find the memory.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 25, 2017

Modulo the issues mentioned I think we should make this change.

I've filled it in two different pull-requests (#45512, #45514) because I want to run perf on both independently since that will tell us more about the impact of nallocx and xallocx on performance which is useful to know while optimizing RawVec.

Did you manage to find any information about why these changes weren't implemented before? I've been trying to find anything about this in the issue tracker without luck.

@arthurprs arthurprs force-pushed the update-jemalloc branch 2 times, most recently from 55562a3 to f8e1488 Compare October 25, 2017 12:14
@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 25, 2017

Note that currently sallocx is not provided.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 25, 2017

Once this is done could you split this into two independent pull-request so that we can run perf on both ? That way I can close mine.

@arthurprs
Copy link
Contributor Author

Why split PRs though? It's a 30 line change.

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 25, 2017

nvm, I just wanted to get perf result for the two independent changes mostly because I don't know the cost of the sallocx call, might be as expensive as a rallocx call. But I have the changes split in both PRs already so I can ask for perf runs on those. For merging purposes a single PR is fine.

@arthurprs
Copy link
Contributor Author

I see what you meant now. I'll go ahead and close this one so you can have more control of what's tested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants