-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Improve liballoc_jemalloc {re}alloc_excess #45482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
0328e5b
to
e1344a2
Compare
src/liballoc_jemalloc/lib.rs
Outdated
ptr::write(err as *mut AllocErr, | ||
AllocErr::Exhausted { request: layout }); | ||
} else { | ||
*excess = nallocx(size as size_t, flags) as usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that excess
(despite the name) should have the entire capacity, not the extra.
unsafe fn realloc_excess(&mut self,
ptr: *mut u8,
layout: Layout,
new_layout: Layout) -> Result<Excess, AllocErr> {
let mut err = ManuallyDrop::new(mem::uninitialized::<AllocErr>());
let mut size = 0;
let ptr = __rust_realloc_excess(ptr,
layout.size(),
layout.align(),
new_layout.size(),
new_layout.align(),
&mut size,
&mut *err as *mut AllocErr as *mut u8);
if ptr.is_null() {
Err(ManuallyDrop::into_inner(err))
} else {
Ok(Excess(ptr, size))
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked the docs of Excess
and you are correct, the usize
there is the total capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just commented on the #32838 tracking issue about this, I found it really confusing. Not only in the Allocator trait, but also because it propagates back to the wrappers over the allocator functions like here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the classic naming problem.
src/liballoc_jemalloc/lib.rs
Outdated
AllocErr::Exhausted { request: layout }, | ||
); | ||
} else { | ||
*excess = nallocx(new_size, flags) as usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think this is wrong: nallocx
returns the excess size of mallocx
but you are using rallocx
to allocate (and it doesn't return the excess size but the total size).
You can either use sdallocx(ptr, align)
to get the real size allocated for pointer (which I think is more expensive than nallocx
), or do like #45514 and use xallocx
to allocate instead of rallocx
which gives you the total size back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about salloc, I'll change the code to use that.
xallocx
isn't general, it can "fail to grow" if the reallocation can't be done inplace, see https://github.com/jemalloc/jemalloc/blob/47203d5f422452def4cb29c0b7128cc068031100/test/integration/xallocx.c#L51
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
xallocx
isn't general, it can "fail to grow" if the reallocation can't be done inplace
Indeed, I've asked about this in jemalloc's issues earlier today: jemalloc/jemalloc#1059
Ideally what we would like to use is nallocx
equivalent for rallocx
, or a rallocx
variant that returns the excess size. My understanding is that sdallocx
is correct for all cases, but it can be expensive because it might traverse jemalloc internal data-structures to find the memory.
Modulo the issues mentioned I think we should make this change. I've filled it in two different pull-requests (#45512, #45514) because I want to run perf on both independently since that will tell us more about the impact of Did you manage to find any information about why these changes weren't implemented before? I've been trying to find anything about this in the issue tracker without luck. |
55562a3
to
f8e1488
Compare
Note that currently |
Once this is done could you split this into two independent pull-request so that we can run perf on both ? That way I can close mine. |
f8e1488
to
8c6590f
Compare
Why split PRs though? It's a 30 line change. |
nvm, I just wanted to get perf result for the two independent changes mostly because I don't know the cost of the |
I see what you meant now. I'll go ahead and close this one so you can have more control of what's tested. |
It should now behave as expected.
cc #45434