# Add doc about `mem.*Unsafe` consuming the input buffer (#8209)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
mem.*Unsafe
consuming the input buffermem.*Unsafe
consuming the input buffer
**Codecov Report:** All modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           master    #8209      +/-   ##
==========================================
- Coverage   82.17%   82.01%   -0.17%
==========================================
  Files         410      410
  Lines       40236    40236
==========================================
- Hits        33065    32998      -67
- Misses       5822     5873      +51
- Partials     1349     1365      +16
```
@Groxx Can you please sign the CLA?
@Groxx, the PR looks good to me, but for the other part, can you open an issue and we can have a discussion there.
```go
// @@ -207,13 +207,19 @@ func (b *buffer) String() string {

// ReadUnsafe reads bytes from the given Buffer into the provided slice.
// It does not perform safety checks.
//
// The returned Buffer MUST be used in place of the one passed in, as it may
```
We have two implementations now: `buffer` and `SliceBuffer`, and `buffer` can free itself when fully consumed. Because we are not doing a type check, we can't be sure whether the original buffer passed in is okay to be used again. So we can just put that in as the note, because acting upon the returned buffer is expected anyway.

> Note: the provided Buffer `buf` must not be used again. For any remaining data, use the returned Buffer.

Deferring to @dfawley for the concurrency concern mentioned.
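The consume-and-replace contract being proposed in that note can be sketched with toy types (all names here are hypothetical stand-ins, not grpc-go's actual API):

```go
package main

import "fmt"

// consumingBuffer is a toy stand-in for a buffer that becomes invalid
// once fully read; the names are illustrative, not grpc-go's API.
type consumingBuffer struct {
	data []byte
}

// readUnsafe copies bytes into dst and returns the remaining buffer.
// When everything is consumed it returns nil, so the caller must
// always replace its reference with the returned value.
func readUnsafe(dst []byte, buf *consumingBuffer) (int, *consumingBuffer) {
	n := copy(dst, buf.data)
	if n == len(buf.data) {
		return n, nil // fully consumed: the original buffer must not be reused
	}
	buf.data = buf.data[n:]
	return n, buf
}

func main() {
	buf := &consumingBuffer{data: []byte("hello world")}
	dst := make([]byte, 5)
	n, buf := readUnsafe(dst, buf) // note: buf is REPLACED, never reused as-is
	fmt.Println(n, string(dst[:n]), buf != nil) // 5 hello true
}
```

The key point is that the caller's only valid handle after the call is the returned value, which is exactly the "must be used in place of the one passed in" wording above.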
Do you have any response to the concurrency issues raised in the first comment? Do you think we could remove the atomics, or maybe something is needed to address the concerns? Thanks!

@Groxx we will need you to complete the CLA in order to merge this, thanks!
Regarding the atomics, each goroutine is intended to have its own reference. However, you are right that it doesn't explicitly mention that the `Ref()` call must happen before handing the buffer off to another goroutine. Like this:

```go
ref := buf.Ref()
go func() {
	// use ref here
}()
buf.Free()
```

Rather than like this:

```go
go func() {
	ref := buf.Ref()
	// use ref here
}()
buf.Free()
```

Though I would argue that the second form counts as "concurrent access" :) The atomic counters are used to determine when the buffer can be safely freed, as the references get counted down.
As for the proposed change in general, the doc looks good to me. The original motivation for `ReadUnsafe` […]

Absolutely, hence calling them `*Unsafe`.
If we don't want anyone else using it, we could have an internal-only hook:

```go
package grpc/internal/mem

var ReadUnsafe func(dst []byte, buf Buffer) (int, Buffer)
```

```go
package grpc/mem

internalmem.ReadUnsafe = func(dst []byte, buf Buffer) (int, Buffer) {
	return buf.read(dst)
}
```

But if this is turning out to be useful for others then I'm fine with leaving it exposed.
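The hook pattern sketched above can be condensed into a single runnable file (hypothetical names; the real version would split the pieces across an `internal` package and the public package, so only importers of the internal package can reach the function):

```go
package main

import "fmt"

// Stand-in for the "internal" package: a function-typed variable that
// the public package assigns at init time. Names are hypothetical.
var readUnsafeHook func(dst []byte, data []byte) int

// Stand-in for the public package's unexported implementation.
func read(dst, data []byte) int { return copy(dst, data) }

func init() {
	// The public package wires its unexported implementation into the
	// internal hook; external users never see an exported symbol.
	readUnsafeHook = read
}

func main() {
	dst := make([]byte, 4)
	n := readUnsafeHook(dst, []byte("data!"))
	fmt.Println(n, string(dst)) // 4 data
}
```

This is the same trick grpc-go uses elsewhere for internal-only access: the hook lives in an internal package, so the Go toolchain itself enforces who may call it.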
So that's actually a good sample of why I think trying to be race-safer here with atomics is actually worse than not using atomics, even just from a safety standpoint. (as an up-front note here: all of these can be made safe and can be made to race. I'm just pointing out that a plausible misuse exists)

```go
go func() {
	ref := buf.Ref()
	// use ref here
}()
buf.Free()
```

^ this kind of construct is a common enough mistake that golang/go#18022 has led to adding a […]. Forgetting to wait for the goroutine is also fairly common, so that example should have been something like this:

```go
done := make(chan struct{})
go func() {
	ref := buf.Ref()
	// use ref here
	close(done)
}()
<-done
buf.Free()
```

(I fully realize your example was just sample code and this is reasonable to omit, I'm just showing it to be more concrete)

One risk that atomics bring is that in a test run with `-race`, this will pass:

```go
done := make(chan struct{})
b := buffer()
go func() {
	bb := b.Ref()
	time.Sleep(time.Second)
	// and then do not do anything with the value for some reason,
	// maybe some other check no-opped it.
	// or just don't use it in a racy way, e.g. maybe only read,
	// or only racy-write to different locations
	_ = bb
	close(done)
}()
b.Free()
<-done // note incorrect waiting, `Ref` and `Free` race
```

while without atomics it'll be a race violation:

<details>
<summary>Full test here</summary>

```go
package demos

import (
	"sync/atomic"
	"testing"
	"time"
)

type atomicbuf struct {
	count int64 // must be used atomically
}

func (b *atomicbuf) Ref() *atomicbuf {
	atomic.AddInt64(&b.count, 1)
	return b
}

func (b *atomicbuf) Free() {
	atomic.AddInt64(&b.count, -1)
}

type normalbuf struct {
	count int64
}

func (b *normalbuf) Ref() *normalbuf {
	b.count++
	return b
}

func (b *normalbuf) Free() {
	b.count--
}

func TestAtomic(t *testing.T) {
	t.Run("safe in both constructs", func(t *testing.T) {
		done := make(chan struct{})
		b := &atomicbuf{}
		bb := b.Ref()
		go func() {
			time.Sleep(time.Second)
			// and then do not do anything with `bb` for some reason
			_ = bb
			close(done)
		}()
		b.Free()
		// ensuring the goroutine finishes, rather than ending the process early
		<-done
	})
	t.Run("unsafe but allowed with atomics", func(t *testing.T) {
		done := make(chan struct{})
		b := &atomicbuf{}
		go func() {
			bb := b.Ref() // moved inside the goroutine
			time.Sleep(time.Second)
			// and then do not do anything with the value for some reason
			_ = bb
			close(done)
		}()
		b.Free()
		// ensuring the race occurs, rather than ending the process early
		<-done
	})
}

func TestNormal(t *testing.T) {
	t.Run("safe in both constructs", func(t *testing.T) {
		done := make(chan struct{})
		b := &normalbuf{}
		bb := b.Ref()
		go func() {
			_ = bb // capture it
			time.Sleep(time.Second)
			// and then do not do anything with `bb` for some reason
			close(done)
		}()
		b.Free()
		// ensuring the goroutine finishes, rather than ending the process early
		<-done
	})
	t.Run("unsafe and blocked by race detector", func(t *testing.T) {
		done := make(chan struct{})
		b := &normalbuf{}
		go func() {
			bb := b.Ref() // moved inside the goroutine
			time.Sleep(time.Second)
			// and then do not do anything with the value for some reason
			_ = bb
			close(done)
		}()
		b.Free()
		// ensuring the race occurs, rather than ending the process early
		<-done
	})
}
```

</details>

With racy atomic code like that, it's also possible to trigger double-frees due to the non-atomic nil-setting in `Free`. So it's hiding some misuse, and not reliably preventing misuse. I'd argue it should either be safe (with added runtime cost) or not (without), not something in the middle.
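For contrast with the "something in the middle" being criticized, here is what the fully-safe end of the spectrum could look like: a sketch (hypothetical names, not a proposal for grpc-go's actual types) that takes a lock on every operation and panics loudly on double-free instead of silently corrupting state:

```go
package main

import (
	"fmt"
	"sync"
)

// safebuf sketches the "fully safe, with runtime cost" alternative:
// every Ref/Free takes a lock, and a double-free panics immediately.
type safebuf struct {
	mu    sync.Mutex
	count int
	freed bool
}

func newSafebuf() *safebuf { return &safebuf{count: 1} }

func (b *safebuf) Ref() *safebuf {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.freed {
		panic("Ref after final Free")
	}
	b.count++
	return b
}

func (b *safebuf) Free() {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.freed {
		panic("double free")
	}
	b.count--
	if b.count == 0 {
		b.freed = true // a real implementation would return to the pool here
	}
}

func main() {
	b := newSafebuf()
	ref := b.Ref()
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		ref.Free() // safe from another goroutine: the lock orders it
	}()
	wg.Wait()
	b.Free()
	fmt.Println("freed:", b.freed) // freed: true
}
```

The cost is a mutex on every `Ref`/`Free`, which is exactly the trade-off the comment above is pointing at: pay for real safety, or drop the atomics and let the race detector catch misuse.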
re CLA... tbh I don't think I can agree with a license that requires someone to share their mailing address. That seems like an absurd overreach.

This is the standard Linux Foundation / CNCF CLA, and the repo requires signing the agreement in order to merge pull requests. If you don't want to sign it, that's fine; we will just close this PR.

There's also some chance you may have attempted to go through the "corporate" license workflow instead of the "individual" one. Asking around, nobody was aware that you'd need to provide a mailing address to agree to the CLA. Unfortunately, there's no easy way for me to test this for myself.

No, it's the individual one. I see that a fair number of others also have that clause, but nothing seems to describe why it's necessary... and there are also a fair number that do not, e.g. Google's OpenSource CLA needs just a name and username. So yeah, that seems like an absurd overreach. Surely only legal jurisdiction is relevant; it's not like CNCF is going to mail me a letter at an unverified address if they want to change a project's license or something... and I thought the whole point of a CLA was to remove that need, by retaining rights for the project.
This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

Blocked on the CLA issue; being investigated in communitybridge/easycla#4638

@Groxx can you try again? IIUC they removed the address requirement.
@Groxx Closing the PR as it has been 2 weeks since any activity. Feel free to re-open.
I've been reading the new buffer-pool-related code due to recent OOM problems with other libraries misusing `sync.Pool`, and the varying behavior between `Buffer` and `SliceBuffer` had me confused for quite a while.

`Buffer` mutates itself when `read`, and implicitly calls `Free` when fully consumed:

> grpc-go/mem/buffers.go, lines 194 to 200 in 5edab9e

While `SliceBuffer` returns a sub-slice but does not modify itself:

> grpc-go/mem/buffers.go, lines 263 to 267 in 5edab9e

The only way these `*Unsafe` funcs can be used correctly is by replacing the input-buffer with the result-buffer(s), as the current code does:

> grpc-go/internal/transport/transport.go, line 139 in 5edab9e

... which seems worth documenting since it feels like a huge footgun otherwise: you only trigger the self-mutating-and-freeing behavior for relatively large data, which is less common in tests.

Granted, this isn't a technical barrier at all, but the behavior differences between implementations misled me for a while, and this "must not use the input Buffer" requirement doesn't seem obviously necessary when reading the code. Might as well save future people that effort.
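The asymmetry described above can be sketched with two toy types (illustrative stand-ins, not grpc-go's actual `Buffer`/`SliceBuffer`):

```go
package main

import "fmt"

// consuming mimics the first behavior: it mutates itself on read and
// "frees" itself once fully consumed. Hypothetical type, for illustration.
type consuming struct {
	data  []byte
	freed bool
}

func (b *consuming) read(dst []byte) int {
	n := copy(dst, b.data)
	b.data = b.data[n:]
	if len(b.data) == 0 {
		b.freed = true // implicit Free: b must not be used again
	}
	return n
}

// slicing mimics the second behavior: it returns a sub-slice view of the
// remainder and leaves itself completely untouched.
type slicing []byte

func (s slicing) read(dst []byte) (int, slicing) {
	n := copy(dst, s)
	return n, s[n:] // caller gets the remainder; s itself is unchanged
}

func main() {
	dst := make([]byte, 4)

	c := &consuming{data: []byte("data")}
	c.read(dst)
	fmt.Println("consuming freed:", c.freed) // consuming freed: true

	s := slicing("data")
	_, rest := s.read(dst)
	fmt.Println("slicing unchanged:", string(s), len(rest)) // slicing unchanged: data 0
}
```

A caller who only ever tested against the second kind would never observe the self-freeing behavior of the first, which is the footgun being described.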
On a related note, `Buffer` feels... a bit concerningly strange in regards to concurrency, and I didn't see it called out in the PR that introduced it, so I think it might not have been noticed.

E.g. `Buffer`s are not concurrency-safe:

> grpc-go/mem/buffers.go, lines 39 to 41 in 5edab9e

But it uses an atomic refcount:

> grpc-go/mem/buffers.go, lines 75 to 80 in 5edab9e

... but it doesn't use it in an atomically-safe way, as it sets the value to nil when freeing:

> grpc-go/mem/buffers.go, line 160 in 5edab9e

And if you include the fact that the ref came from a pool:

> grpc-go/mem/buffers.go, line 157 in 5edab9e

it seems like this code cannot possibly be correct: it can `Put` refcounts and buffers that are still in use (with a racing `Ref`, `Free`, `Ref`, `Free` sequence).

So this is introducing difficult-to-investigate failure modes that non-atomic or non-pooled instances would not have. It seems like it'd be better to either just use an `int` field or not reuse atomics (if the goal is to detect races).

Also I would be quite surprised if pooled `*int` values perform better than non-pooled, given the cheap initialization and `sync.Pool`'s overhead, but I don't have any evidence either way.

Is there some benefit to this I'm missing? Maybe something about it can catch tricky races or misuse that would otherwise be missed? It seems misleading and dangerous to me otherwise.
RELEASE NOTES: none