LangRef: allocated objects can grow #141338
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3327,6 +3327,19 @@ behavior is undefined: | |
- the size of all allocated objects must be non-negative and not exceed the | ||
largest signed integer that fits into the index type. | ||
|
||
Allocated objects that are created with operations recognized by LLVM (such as | ||
:ref:`alloca <i_alloca>`, heap allocation functions marked as such, and global | ||
variables) may *not* change their size. (``realloc``-style operations do not | ||
change the size of an existing allocated object; instead, they create a new | ||
allocated object. Even if the object is at the same location as the old one, old | ||
pointers cannot be used to access this new object.) However, allocated objects | ||
can also be created by means not recognized by LLVM, e.g. by directly calling | ||
|
||
``mmap``. Those allocated objects are allowed to grow to the right (i.e., | ||
Review comment: It occurred to me that it may be helpful to point out here that
Reply: Good point, I'll add that. |
||
keeping the same base address, but increasing their size) while maintaining the | ||
validity of existing pointers, as long as they always satisfy the properties | ||
described above. Currently, allocated objects are not permitted to grow to the | ||
left or to shrink, nor can they have holes. | ||
Review comment: Given the restrictions, the compiler can't tell where the "beginning" of the object is, so I'm not sure forbidding growth to the left has any meaningful effect. I'm not sure what a "hole" is, in this context. I don't think we require that all bytes of an object have to be dereferenceable. It might make sense to forbid overlapping live objects, though.

Reply: It does have the one effect that it simplifies specifying when
But pointer offset is meant to be freely reorderable, so we could move the two offsets next to each other. We also can combine adjacent
By "hole" I mean e.g.
Also, what exactly does

Reply: But you currently forbid shrinking?
I agree munmapping is problematic; overlapping objects would be hard to reason about. There are potentially useful "holes", though: mprotecting a page from a continuous range for a JIT would be a hole in the sense that reads aren't legal.

Reply: Yeah but I'd like to allow it in the future -- it seems more important to me than growing to the left.
I don't think an mprotect causing reads and writes to trap is fundamentally different from munmap for this purpose, or is it?

Reply: Hmm, okay.
Alias analysis doesn't really care if an object is actually accessible at the moment; it just cares we don't overlap with some other object. Dereferenceability is only relevant if you want to do speculative loads. So you can separate dereferenceability from the rest of the properties of an allocation. |
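Illustration (not part of the patch): the thread above talks about objects created by calling ``mmap`` directly that keep their base address while growing to the right. One way a runtime might get that behavior on a POSIX system is to reserve a large inaccessible address range up front and then make a longer and longer prefix of it accessible. The sketch below assumes that scheme. The names ``growable_region``, ``region_reserve``, ``region_grow`` and ``region_demo`` are hypothetical, and whether LLVM would model this as one growing allocated object is exactly the question the PR discusses, so treat it as a rough picture rather than a statement of the semantics.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper type: a region whose base never moves. */
typedef struct {
    unsigned char *base; /* fixed base address */
    size_t size;         /* bytes currently accessible */
    size_t capacity;     /* bytes of address space reserved */
} growable_region;

/* Reserve address space without making any of it accessible yet. */
static int region_reserve(growable_region *r, size_t capacity) {
    void *p = mmap(NULL, capacity, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    r->base = p;
    r->size = 0;
    r->capacity = capacity;
    return 0;
}

/* Grow to the right: same base, larger accessible prefix.
   Shrinking and growing past the reservation are rejected. */
static int region_grow(growable_region *r, size_t new_size) {
    if (new_size < r->size || new_size > r->capacity)
        return -1;
    if (mprotect(r->base, new_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    r->size = new_size;
    return 0;
}

/* Usage: a pointer taken before the growth is still used afterwards. */
static void region_demo(void) {
    growable_region r;
    assert(region_reserve(&r, (size_t)1 << 20) == 0);
    assert(region_grow(&r, 4096) == 0);
    unsigned char *p = r.base; /* pointer created before growing */
    p[0] = 42;
    assert(region_grow(&r, 8192) == 0);
    assert(r.base == p && p[0] == 42);
    p[8191] = 7; /* newly accessible byte, reached via the old pointer */
    munmap(r.base, r.capacity);
}
```

The design keeps the start of the region fixed and only ever extends the end, which matches the "grow to the right, no shrinking, no holes" restrictions in the proposed wording.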
||
|
||
.. _objectlifetime: | ||
|
||
Object Lifetime | ||
|
@@ -11870,6 +11883,9 @@ if the ``getelementptr`` has any non-zero indices, the following rules apply: | |
:ref:`based <pointeraliasing>` on. This means that it points into that | ||
Review comment: I think this semantics is problematic as you need to guess the future. This has implications in alias analysis. We would need to disable all rules that use reasoning such as

Reply: See the comments above. I think we can avoid this issue by rephrasing this to something like:
I believe this still gives us all the properties we need from inbounds (in particular the ability to cross more than half the address space, even with a sequence of multiple inbounds operations).
This alias analysis only applies to fixed-size objects with known size. I do not believe it will be affected by this change (which is only relevant to allocations for which LLVM does not know the size).

Reply: Alias analysis works over heap-allocated objects. Anything whose size LLVM (MemoryBuiltins.h) can infer is fair game.

Reply: No, we don't -- the PR explicitly discusses this: all allocated objects created by operations that are built-in to LLVM must never change their size.
Indeed, and that's fine. All we need is some way to allocate memory such that LLVM cannot infer the size (and promises to never infer it) -- e.g. by calling
Longer-term it may also be useful to offer a flag for malloc-like functions so that frontends can communicate to LLVM whether this allocation is allowed to change size or not, but that's left to future work.

Reply: My idea was that conceptually the size would be given in the code, e.g. if we had an actual formal model. LLVM IR would just omit it since y'all don't like spec-only parameters. ;) But as long as we keep the start of the allocation fixed, then as Nikita says we can also just say that the relevant size is the theoretical maximum, not the actual maximum -- and that is a simple pure function of the start address of the allocation. |
||
allocated object, or to its end. Note that the object does not have to be | ||
live anymore; being in-bounds of a deallocated object is sufficient. | ||
If the allocated object can grow, then the relevant size for being *in | ||
bounds* is the maximal size the object could have while satisfying the | ||
allocated object rules, not its current size. | ||
* During the successive addition of offsets to the address, the resulting | ||
pointer must remain *in bounds* of the allocated object at each step. | ||
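Illustration (not part of the patch): the review thread describes the "maximal size" of a growable object as a simple pure function of its start address. A minimal sketch of that function follows, under two assumptions that are mine and not taken from the diff: the index type is as wide as a pointer (so ``PTRDIFF_MAX`` stands in for "the largest signed integer that fits into the index type"), and an object together with its one-past-the-end address may not wrap around the unsigned address space. The names ``max_object_size``, ``offset_in_bounds`` and ``bounds_demo`` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest size an object starting at `base` could have under the
   assumed rules: at most PTRDIFF_MAX bytes, and base + size must
   not wrap around the address space. */
static uintptr_t max_object_size(uintptr_t base) {
    uintptr_t signed_limit = (uintptr_t)PTRDIFF_MAX;
    uintptr_t wrap_limit = UINTPTR_MAX - base;
    return wrap_limit < signed_limit ? wrap_limit : signed_limit;
}

/* An offset is "in bounds" of a growable object if it points into the
   maximal object or to its end, regardless of the current size. */
static bool offset_in_bounds(uintptr_t base, uintptr_t offset) {
    return offset <= max_object_size(base);
}

/* Usage: near the top of the address space, the wrap limit wins. */
static void bounds_demo(void) {
    uintptr_t high = UINTPTR_MAX - 16;
    assert(max_object_size(0) == (uintptr_t)PTRDIFF_MAX);
    assert(max_object_size(high) == 16);
    assert(offset_in_bounds(high, 16));
    assert(!offset_in_bounds(high, 17));
}
```

Because the result depends only on ``base``, checking *in bounds* this way does not require knowing how far the object will actually grow, which is the point made in the discussion about not having to "guess the future".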
|
||
|
Review comment: The "heap allocation functions marked as such" part here is meant to capture everything LLVM recognizes as a heap allocation function. Is there a better way to say this?

Reply: Maybe "heap allocation functions with the ``allocsize`` attribute"?

Reply: Doesn't LLVM also recognize ``malloc`` as a special magic name?