Mir

Merge lp://qastaging/~mir-team/mir/fix-1353867-sigterm-dispatch-mirteam into lp://qastaging/mir

Proposed by Daniel van Vugt
Status: Merged
Merged at revision: 1852
Proposed branch: lp://qastaging/~mir-team/mir/fix-1353867-sigterm-dispatch-mirteam
Merge into: lp://qastaging/mir
Diff against target: 850 lines (+191/-115)
23 files modified
include/test/mir_test_framework/basic_client_server_fixture.h (+2/-0)
include/test/mir_test_framework/display_server_test_fixture.h (+2/-0)
include/test/mir_test_framework/testing_process_manager.h (+3/-2)
src/client/mir_connection.cpp (+6/-1)
tests/acceptance-tests/test_client_cursor_api.cpp (+2/-0)
tests/acceptance-tests/test_client_input.cpp (+2/-0)
tests/acceptance-tests/test_client_library.cpp (+4/-1)
tests/acceptance-tests/test_client_screencast.cpp (+7/-43)
tests/acceptance-tests/test_client_surface_swap_buffers.cpp (+3/-15)
tests/acceptance-tests/test_large_messages.cpp (+3/-0)
tests/acceptance-tests/test_nested_mir.cpp (+2/-0)
tests/acceptance-tests/test_prompt_session_client_api.cpp (+2/-0)
tests/acceptance-tests/test_protobuf.cpp (+6/-3)
tests/acceptance-tests/test_server_disconnect.cpp (+60/-0)
tests/acceptance-tests/test_server_shutdown.cpp (+2/-0)
tests/acceptance-tests/test_stale_frames.cpp (+0/-2)
tests/acceptance-tests/test_test_framework.cpp (+2/-0)
tests/integration-tests/client/test_client_render.cpp (+10/-0)
tests/integration-tests/shell/test_session_lifecycle_event.cpp (+37/-48)
tests/mir_test_framework/display_server_test_fixture.cpp (+5/-0)
tests/mir_test_framework/testing_process_manager.cpp (+12/-0)
tests/mir_test_framework/using_stub_client_platform.cpp (+8/-0)
tests/unit-tests/client/test_client_mir_surface.cpp (+11/-0)
To merge this branch: bzr merge lp://qastaging/~mir-team/mir/fix-1353867-sigterm-dispatch-mirteam
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Needs Fixing
Daniel van Vugt Approve
Chris Halse Rogers Pending
Alan Griffiths Pending
Review via email: mp+231152@code.qastaging.launchpad.net

This proposal supersedes a proposal from 2014-08-12.

Commit message

client: Fix SIGTERM dispatch in our default lifecycle event handler, which
was resulting in clients looping infinitely in exception handlers
(LP: #1353867)

Our default lifecycle event handler used to call raise() to send a TERM
signal to the app when a connection break was detected. However, in a
multi-threaded program raise() sends the signal to the current thread only.
If the signal is blocked in that thread, then the signal is lost.

Depending on thread in which we detected that the connection was lost, the
TERM signal would be dispatched (e.g. in the context of a driver callback
asking for the next buffer) or not (e.g. in the context of our main RPC
threads which block all signals). This MP fixes the issue by explicitly
sending the signal to the process instead of the current thread. This ensures
that it will be dispatched to some thread that doesn't block it.

Description of the change

It turns out that we send a lifecycle connection lost event to ourselves when we release a client connection. With the signal dispatch issue fixed, this always leads to a SIGTERM being sent to the process when we call mir_connection_release() (with the default handler). This breaks most of the tests, since clients are expected to exit properly, not be terminated by a signal.

I assumed that sending a lifecycle event on release is a behavior we want, i.e. it's not accidental. If it's indeed accidental, we can remove it, along with the workaround in this MP.

The workaround proposed in this MP is to clear any lifecycle handlers just prior to disconnecting. This has the benefit that the default handler is active for the lifetime of the client and can therefore catch premature disconnections. It's not ideal, but it's reasonable for now (IMO).

As part of the workaround I refactored some of the test code to use newer infrastructure (like BasicClientServerFixture). However these refactorings are beneficial on their own and can remain even if we decide not to use the proposed workaround.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote : Posted in a previous version of this proposal
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Manual testing OK. Verified bug 1353867 is fixed.

I suggest SIGPIPE or SIGHUP would be more appropriate than SIGTERM though. SIGTERM means the user has explicitly asked for the app to be killed, and that's not true in this case. It's more appropriate to use SIGPIPE or SIGHUP indicating the user did not explicitly kill the app, but we still want a clean termination (no core file).

review: Approve
Revision history for this message
Chris Halse Rogers (raof) wrote : Posted in a previous version of this proposal

Looks sensible to me

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Essentially Approved still. Just please consider using a more appropriate signal: SIGPIPE or SIGHUP.

review: Needs Information
Revision history for this message
Alexandros Frantzis (afrantzis) wrote : Posted in a previous version of this proposal

> Essentially Approved still. Just please consider using a more appropriate signal: SIGPIPE or SIGHUP.

I am not opposed to changing the signal, after some discussion about which one is best. A change to the signal needs to be followed by changes to signal handlers of our example clients so they can shut down properly. I'd rather we did all this in a separate MP and leave the current MP focused on just fixing the dispatch mechanism.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote : Posted in a previous version of this proposal

> > Essentially Approved still. Just please consider using a more appropriate
> signal: SIGPIPE or SIGHUP.
>
> I am not opposed to changing the signal, after some discussion about which one
> is best. A change to the signal needs to be followed by changes to signal
> handlers of our example clients so they can shut down properly. I'd rather we
> did all this in a separate MP and leave the current MP focused on just fixing
> the dispatch mechanism.

OK

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote : Posted in a previous version of this proposal
review: Needs Fixing (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote : Posted in a previous version of this proposal

Happy to revisit the choice of signal later. One minor annoyance is stopping this from landing (per Jenkins):

Text conflict in tests/acceptance-tests/test_nested_mir.cpp
1 conflicts encountered.

review: Approve
Revision history for this message
Daniel van Vugt (vanvugt) :
review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Already approved by three people per above comments. This resubmission is only to fix a one line conflict, so no need to wait for people to review it again.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Blocked by bug 1358698 again.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches