Mir

Merge lp://qastaging/~afrantzis/mir/client-lifecycle-terminate-properly into lp://qastaging/mir

Proposed by Alexandros Frantzis
Status: Work in progress
Proposed branch: lp://qastaging/~afrantzis/mir/client-lifecycle-terminate-properly
Merge into: lp://qastaging/mir
Diff against target: 653 lines (+169/-111)
16 files modified
include/test/mir_test_framework/basic_client_server_fixture.h (+2/-0)
include/test/mir_test_framework/display_server_test_fixture.h (+2/-0)
include/test/mir_test_framework/testing_process_manager.h (+3/-2)
src/client/mir_connection.cpp (+6/-1)
tests/acceptance-tests/test_client_input.cpp (+2/-0)
tests/acceptance-tests/test_client_screencast.cpp (+7/-43)
tests/acceptance-tests/test_client_surface_swap_buffers.cpp (+3/-15)
tests/acceptance-tests/test_prompt_session_client_api.cpp (+2/-0)
tests/acceptance-tests/test_server_disconnect.cpp (+60/-0)
tests/acceptance-tests/test_server_shutdown.cpp (+2/-0)
tests/acceptance-tests/test_stale_frames.cpp (+0/-2)
tests/integration-tests/shell/test_session_lifecycle_event.cpp (+37/-48)
tests/mir_test_framework/display_server_test_fixture.cpp (+5/-0)
tests/mir_test_framework/testing_process_manager.cpp (+22/-0)
tests/mir_test_framework/using_stub_client_platform.cpp (+5/-0)
tests/unit-tests/client/test_client_mir_surface.cpp (+11/-0)
To merge this branch: bzr merge lp://qastaging/~afrantzis/mir/client-lifecycle-terminate-properly
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Needs Fixing
Mir development team Pending
Review via email: mp+229234@code.qastaging.launchpad.net

Commit message

client: Fix SIGTERM dispatch in our default lifecycle event handler

Our default lifecycle event handler used to call raise() to send a TERM signal to the app when a connection break was detected. However, in a multi-threaded program raise() sends the signal to the current thread only. If the signal is blocked in that thread, then the signal is lost.

Depending on thread in which we detected that the connection was lost, the TERM signal would be dispatched (e.g. in the context of a driver callback asking for the next buffer) or not (e.g. in the context of our main RPC threads which block all signals). This MP fixes the issue by explicitly sending the signal to the process instead of the current thread. This ensures that it will be dispatched to some thread that doesn't block it.

Description of the change

client: Fix SIGTERM dispatch in our default lifecycle event handler

Our default lifecycle event handler used to call raise() to send a TERM signal to the app when a connection break was detected. However, in a multi-threaded program raise() sends the signal to the current thread only. If the signal is blocked in that thread, then the signal is lost.

Depending on thread in which we detected that the connection was lost, the TERM signal would be dispatched (e.g. in the context of a driver callback asking for the next buffer) or not (e.g. in the context of our main RPC threads which block all signals). This MP fixes the issue by explicitly sending the signal to the process instead of the current thread. This ensures that it will be dispatched to some thread that doesn't block it.

It turns out that we send a lifecycle connection lost event to ourselves when we release a client connection. With the signal dispatch issue fixed, this always leads to a SIGTERM being sent to the process when we call mir_connection_release() (with the default handler). This breaks most of the tests, since clients are expected to exit properly, not be terminated by a signal.

I assumed that sending a lifecycle event on release is a behavior we want, i.e. it's not accidental. If it's indeed accidental, we can remove it, along with the workaround in this MP.

The workaround proposed in this MP is for our stub client platform (used in almost all the tests that use full clients) to clear any lifecycle handlers just prior to disconnecting. This has the benefit that the default handler is active for the lifetime of the client and can therefore catch premature disconnections. It's not ideal, but it's reasonable for now (IMO).

As part of the workaround I refactored some of the test code to use newer infrastructure (like BasicClientServerFixture). However these refactorings are beneficial on their own and can remain even if we decide not to use the proposed workaround.

Finally, this change fixes some instances of the "client spinning when the server dies" issue, which is especially pronounced if we apply https://code.launchpad.net/~afrantzis/mir/fix-1347053/+merge/229221, since we won't abort() in driver code any more.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)

Unmerged revisions

1811. By Alexandros Frantzis

client: Fix SIGTERM dispatch in our default lifecycle event handler

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.

Subscribers

People subscribed via source and target branches