Skip to content

Update deliveries value atomically to avoid potential race condition #133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

seabaylea
Copy link
Contributor

The libdispatch CI has an intermittent failure in dispatch_read2 with the following reported in the log file:

[BEGIN] dispatch_read deliveries
    Actual: 319
    Expected: 320
[FAIL] dispatch_read deliveries (dispatch_read2.c:386)
    dispatch_read2.c:386

it looks like this could be cause by a race condition on deliveries, so switching to use atomic updates.

@MadCoder
Copy link
Contributor

MadCoder commented Aug 4, 2016

no, that can't be the bug, dispatch_read serializes everything through a single dispatch queue, so there is serialization already. FWIW we also have that issue on OS X and I have never taken the time to track it down yet.

@seabaylea
Copy link
Contributor Author

Ok, so does that means that the blocks are executed in serial, not just the reads occurring in serial and enqueueing the blocks with the correct data, which can then be executed in parallel?

@MadCoder
Copy link
Contributor

MadCoder commented Aug 4, 2016

it's more about async I/O in general to avoid wasting thraeds while the data is being fetched by the kernel. but as you can see dispatch_read(*fd, siz, q, b); the queue argument mean that the results will be asynced to q so everything is serialized

what I think happens for real is that we do 240 writes, but one of the read coalesces two writes in a single read, and that the test assumptions about how the kernel works are wrong.

@seabaylea
Copy link
Contributor Author

Ok - if that's the case, are you happy for me to disable the test so that we can get the CI running cleanly?

Long term we could look at modifying the test to only check if the data size is correct if we think the number of deliveries might legitimately be smaller than expected.

@MadCoder
Copy link
Contributor

MadCoder commented Aug 4, 2016

sure, FWIW it's just a hunch that the test is wrong, I have never taken the time to double check.

@seabaylea
Copy link
Contributor Author

Ok - created PR #135 to disable the test.

@MadCoder
Copy link
Contributor

so for this test, the thing is, it expects that because it does a usleep(10000); between writes, reads won't coalesce. this is actually not always the case.

the equality test should be made a test_less_or_equal instead, it's not the important test anyway.

@MadCoder
Copy link
Contributor

closing as this hasn't made progress. as I said the right fix is to move to a <= assertion instead of ==

@MadCoder MadCoder closed this Sep 22, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants