Mir

Merge lp://qastaging/~alan-griffiths/mir/keep-helgrind-happy into lp://qastaging/mir

Proposed by Alan Griffiths
Status: Rejected
Rejected by: Alan Griffiths
Proposed branch: lp://qastaging/~alan-griffiths/mir/keep-helgrind-happy
Merge into: lp://qastaging/mir
Diff against target: 13 lines (+3/-1)
1 file modified
tests/integration-tests/compositor/test_swapping_swappers.cpp (+3/-1)
To merge this branch: bzr merge lp://qastaging/~alan-griffiths/mir/keep-helgrind-happy
Reviewer Review Type Date Requested Status
Daniel van Vugt Disapprove
Kevin DuBois (community) Approve
Alberto Aguirre (community) Approve
PS Jenkins bot (community) continuous-integration Approve
Review via email: mp+207389@code.qastaging.launchpad.net

Commit message

tests: frig code to prevent helgrind seeing a race condition (LP: #1282886)

Description of the change

tests: frig code to prevent helgrind seeing a race condition

After some investigation it became apparent that helgrind is seeing some read access of the address of mutex from the callback *before* the function is re-entered and a new mutex created on the same spot. I don't think the logic is actually wrong (just that helgrind hasn't got a way to sequence the destruction and re-creation of the mutex with the compositing thread).

As we don't need a new mutex for each call I've made it persistent.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alberto Aguirre (albaguirre) :
review: Approve
Revision history for this message
Kevin DuBois (kdub) wrote :

okay

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Helgrind uses the Eracer algorithm, which is pretty simple and even obvious if you think about it for a bit;
http://www.cs.duke.edu/courses/cps210/spring06/papers/eraser.pdf
As such it's very easy to trust the correctness of Helgrind, providing it has implemented hooks for all the lock/unlock functions you may depend on.

I'm not sure, but it appears the problem is that mutex is not guaranteed to be 1:1 to the switching_bundle which affects cv.

The basic rule for condition variables is that if you're not protecting any data/objects in the associated mutex then you're using the condition variable in a racy and unsafe way. This is why both C++ and pthreads always require a condition have an associated lock. Because failure to use that lock properly will make you race. I forget the exact reasoning, but it appears the solution would be to tie mutex to the switching bundle more closely. Even lock it before entering the function.

Using a static might be a safe solution, not sure, but it's definitely inefficient forcing all threads to share the same mutex. So not ideal. When/if I can commit more thought to the problem, I'm sure it would be possible to solve more correctly without "static".

review: Abstain
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I think I found the real issue. The mutex obviously needs to live longer than the lambda that's locking/unlocking it. But that's not guaranteed because the lambda runs in a different thread. Using static provides the guarantee, but it's a bit of a hack. I'll try to propose a more correct solution.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I feel static is a hack (but is a solution too). Also, this proposal doesn't silence the races reported for std::atomic.

I've made a separate proposal that fixes all the races, for helgrind cleanliness:
https://code.launchpad.net/~vanvugt/mir/fix-1282886/+merge/207596

review: Disapprove

Unmerged revisions

1415. By Alan Griffiths

Avoid false positive from helgrind

1414. By Alan Griffiths

merge lp:~andreas-pokorny/mir/report-namespace

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches