Should raw pointer deref/projections have to be in-bounds?

Currently, when you write `addr_of!((*ptr).field)`, `ptr` has to be a sufficiently aligned and dereferencable pointer (meaning it has to point to actually allocated memory), with the size/alignment determined by the pointee type.

Formally speaking, this stems from the fact that the `*` operator, which turns a value (of pointer or reference type) into a place, [requires](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) that the pointer be aligned and dereferencable. This is true regardless of the context in which the `*` operator is used, i.e., regardless of what happens to this place after it has been constructed.

Pragmatically speaking, this reflects the fact that `addr_of!((*ptr).field)` is lowered to a `getelementptr inbounds` in LLVM. That said, the rules in the reference go further than what is required for codegen -- they require `ptr` to be dereferencable for the full size of the type (not just the `field`), and that even if we do `addr_of!(*ptr)`.

This situation frequently causes confusion, so maybe we want to change it. At least it'd be good to have a central place for discussion and reference, so I created this issue.
Cc @eddyb @joshtriplett @thomcc @Lokathor @cuviper https://github.com/rust-lang/reference/pull/1000
Instance(s) of confusion in practice: https://github.com/rust-lang/rust/pull/93459#discussion_r812436691

So, what could we do? Let me collect some proposals that have come up.

## Remove restriction from `*` entirely

The most radical option is to entirely remove the clause which imposes a rule on `*`, and instead just require that when a place is actually read from (via `move`/`copy`) or stored to (via `=`), *then* it has to be sufficiently aligned and dereferencable as determined by the type of the place. (We currently do not need such a rule, since as an invariant *all* places that are constructed are aligned and dereferencable.)

This means we have to remove `inbounds` from `getelementptr` in place projection lowering when the "source" of this place projection is a raw pointer. We can still add `inbounds` for references or other places (locals/statics) since we know those to be dereferencable for other reasons (except for [this twist](https://github.com/rust-lang/unsafe-code-guidelines/issues/319#issuecomment-1075600561)). (We can *not* add it in cases like `&(*ptr).field` since `ptr` might be e.g. 4 bytes before the beginning of the allocation and `.field` offsets the ptr by 4.)

This has the potential of making LLVM alias analysis and other optimizations less effective. In particular, `getelementptr` might not just go out-of-bounds, it would even be allowed to overflow, so there are some tricks LLVM might not be able to pull any more.

It is certainly my favorite option, because it maximally reduces the amount of UB -- the remaining rule (about actual loads/stores) is pretty much unavoidable and typically expected by programmers.

## Still prevent overflows

We could also say that `*ptr` on a raw pointer is UB if adding `size_of_val_raw(ptr)` to `ptr` would overflow, but allocation bounds do not matter. This would let us emit a hypothetical `getelementptr nowrap`, if LLVM were to add support for such a flag in the future. I doubt this is any less surprising than our current rules, TBH, but it is less likely to be violated by the typical kind of code people write. It is also more complicated, since we still have to explicitly add the rule which says that loading from and storing to a place requires it to be dereferencable and aligned.

## Check inbounds on place projection

If we want to keep codegen unchanged, we could still relax our rules: we could allow `addr_of!(*ptr)`, and we could make it so that `addr_of!((*ptr).field)` only needs to be inbounds for the actual offset that is performed.

For example, on top of the "remove restrictions" proposal, we could have an extra rule for each place projection which says that this projection needs to be in-bounds.

This has the advantage of keeping our current codegen and quite obviously aligning with the intuition many people seem to have for place projections. It also has some disadvantages though:
- `addr_of!((*ptr).field)` continues to be UB if `ptr` is entirely dangling, so e.g. people wanting to write offset_of-style code still have to navigate around a footgun.
- Our rules become more complicated. (Basically, we would replace one bullet in the "Behavior considered undefined" list by two bullets.)

## Magic?

> There's also the option (which I'm aware you aren't a fan of) of allowing addr_of!((*p).field) to violate the inbounds rule but still requiring &(*ptr).field to uphold it.

https://github.com/rust-lang/unsafe-code-guidelines/issues/319#issuecomment-1049129946

## New syntax

> What you really just want to be able to really ergonomically talk about field offsets.

https://github.com/rust-lang/unsafe-code-guidelines/issues/319#issuecomment-1049316167

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Should raw pointer deref/projections have to be in-bounds? #319

Remove restriction from `*` entirely

Still prevent overflows

Check inbounds on place projection

Magic?

New syntax

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Should raw pointer deref/projections have to be in-bounds? #319

Description

Remove restriction from * entirely

Still prevent overflows

Check inbounds on place projection

Magic?

New syntax

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Remove restriction from `*` entirely