Merge lp://qastaging/~raof/mir/no-ipc-on-compositor-threads into lp://qastaging/mir
Status: | Merged |
---|---|
Approved by: | Alan Griffiths |
Approved revision: | no longer in the source branch. |
Merged at revision: | 4228 |
Proposed branch: | lp://qastaging/~raof/mir/no-ipc-on-compositor-threads |
Merge into: | lp://qastaging/mir |
Prerequisite: | lp://qastaging/~raof/mir/better-buffer-plumbing |
Diff against target: |
1038 lines (+399/-182) 18 files modified
include/server/mir/frontend/buffer_stream.h (+0/-5)
src/server/compositor/dropping_schedule.cpp (+0/-10)
src/server/compositor/dropping_schedule.h (+0/-2)
src/server/compositor/queueing_schedule.cpp (+0/-7)
src/server/compositor/queueing_schedule.h (+0/-2)
src/server/compositor/schedule.h (+0/-3)
src/server/compositor/stream.cpp (+1/-33)
src/server/compositor/stream.h (+0/-5)
src/server/frontend/default_ipc_factory.cpp (+186/-1)
src/server/frontend/default_ipc_factory.h (+4/-0)
src/server/frontend/published_socket_connector.cpp (+50/-10)
src/server/frontend/published_socket_connector.h (+1/-1)
src/server/frontend/session_mediator.cpp (+47/-21)
src/server/frontend/session_mediator.h (+6/-1)
tests/include/mir/test/doubles/stub_buffer_stream.h (+0/-2)
tests/unit-tests/compositor/test_multi_monitor_arbiter.cpp (+0/-6)
tests/unit-tests/compositor/test_queueing_schedule.cpp (+0/-19)
tests/unit-tests/frontend/test_session_mediator.cpp (+104/-54) |
To merge this branch: | bzr merge lp://qastaging/~raof/mir/no-ipc-on-compositor-threads |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Alan Griffiths | Approve | ||
Mir CI Bot | continuous-integration | Approve | |
Commit message
Move buffer-release IPC to a dedicated IPC thread.
Fixes: LP: #1395421
Description of the change

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4216
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4216
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4217
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Chris Halse Rogers (raof) wrote : | # |
Ok. It's not clear to me how it was deadlocking, and I still can't reproduce this locally, so I'm going to hit rebuild on that. It *seems* that CI pretty reliably hits this, but let's check that it's not a fluke pass.

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4217
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Chris Halse Rogers (raof) wrote : | # |
OK. So looks like that's good.

Alan Griffiths (alan-griffiths) wrote : | # |
Thanks for this - it's been on my to-tidy list far too long.
I'll withhold final approval until we've fixed the pre-requisite.

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4218
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Hmm, with more testing this may not be working right

Alan Griffiths (alan-griffiths) wrote : | # |
> Hmm, with more testing this may not be working right
Sorry not to be clear. Hitting EOD.
I tried building miral against this, running miral-app hosted by mir_demo_server and running all the clients.
When I came to close things down things were hung.
Didn't get time to be more specific.

Chris Halse Rogers (raof) wrote : | # |
Moar testing time! Thanks.

Chris Halse Rogers (raof) wrote : | # |
Hm.
To be clear, this is miral-shell hanging if the underlying mir_demo_server goes away (such as by SIGINT)?
If I quit miral-shell first everything seems to go as expected.

Chris Halse Rogers (raof) wrote : | # |
Oh, huh. Looks like it might be any EGL-using client...

Alan Griffiths (alan-griffiths) wrote : | # |
> Hm.
>
> To be clear, this is miral-shell hanging if the underlying mir_demo_server
> goes away (such as by SIGINT)?
>
> If I quit miral-shell first everything seems to go as expected.
No. I was just closing individual clients with Alt-F4 (or Alt-Shift-F4 for the few that don't listen).
I'll try to reproduce and narrow down exactly what went on. And check that it was actually this MP (and not something we already landed).

Chris Halse Rogers (raof) wrote : | # |
Hm. New hypothesis: this has always been broken, and doesn't require nesting.
It seems that if you start up any fullscreen client (tested with eglplasma and _target) before the throbber has finished then the throbber client hangs in swapbuffers and prevents miral-shell shutdown.
This also happens with the archive mir and miral.

Alan Griffiths (alan-griffiths) wrote : | # |
OK, I can't reproduce what I saw yesterday.
Maybe I hit an intermittent, maybe I had screwed up my test. Either way, let's land this and monitor.

Alan Griffiths (alan-griffiths) wrote : | # |
> Hm. New hypothesis: this has always been broken, and doesn't require nesting.
>
> It seems that if you start up any fullscreen client (tested with eglplasma and
> _target) before the throbber has finished then the throbber client hangs in
> swapbuffers and prevents miral-shell shutdown.
"throbber"?
> This also happens with the archive mir and miral.
File a bug. We should fix it.

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/

Alan Griffiths (alan-griffiths) wrote : | # |
I don't think the failure is related.

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/

Chris Halse Rogers (raof) wrote : | # |
On Thu, Jul 20, 2017 at 6:44 PM, Alan Griffiths <email address hidden>
wrote:
>> Hm. New hypothesis: this has always been broken, and doesn't
>> require nesting.
>>
>> It seems that if you start up any fullscreen client (tested with
>> eglplasma and
>> _target) before the throbber has finished then the throbber client
>> hangs in
>> swapbuffers and prevents miral-shell shutdown.
>
> "throbber"?
The splashscreen, spinning Ubuntu logo thingy.

Chris Halse Rogers (raof) wrote : | # |
Hm. *Those* failures are due to real-time tests exceeding their thresholds. This branch plausibly increases the latency of those tests, as the buffer responses now require a context-switch rather than occurring on the compositor thread.
It might also just have been due to a transient load on the CI machine, so I'll give it another try...

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/

Chris Halse Rogers (raof) wrote : | # |
(Spinner bug is bug #1705973)

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4218
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/

Alan Griffiths (alan-griffiths) wrote : | # |
While this could change timing, I don't see why it should affect NestedServer.
Logged failure as lp:1706050 & re-approved

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Now a failure in NestedServer.
...
12:28:26 terminate called after throwing an instance of 'boost:
12:28:26 what(): stop_server() failed to stop server
12:28:26 ==18875==
12:28:26 ==18875== Process terminating with default action of signal 6 (SIGABRT)
12:28:26 ==18875== at 0x458AEA9: raise (raise.c:54)
12:28:26 ==18875== by 0x458C406: abort (abort.c:89)
12:28:26 ==18875== by 0x4420D34: __gnu_cxx:
12:28:26 ==18875== by 0x441E832: ??? (in /usr/lib/
12:28:26 ==18875== by 0x441D648: ??? (in /usr/lib/
12:28:26 ==18875== by 0x441DE10: __gxx_personali
12:28:26 ==18875== by 0x453A3DE: ??? (in /lib/i386-
12:28:26 ==18875== by 0x453A856: _Unwind_Resume (in /lib/i386-
12:28:26 ==18875== by 0x8727D93: ~unique_lock (mutex:450)
12:28:26 ==18875== by 0x8727D93: mir_test_
...
I think that's (at least) a failure too many to be a coincidence.
Is this MP introducing a shutdown race that can deadlock?

Chris Halse Rogers (raof) wrote : | # |
I cannot for the life of me get this to fail locally.
I've run “make test” on a valgrind and debflags build in a loop for hours, and the tests continue to reliably pass.
Aaargh.
Here are some changes which also don't fail locally, but might help?

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4220
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4223
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4225
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Just looking at the last commit:
1. Do we really want four shared pointers in ThreadExecutor? Surely one pointer holding a "shared state" implementation object would be better?
2. The (potential) problem with detaching threads is ensuring they exit before the process does. I don't see this being a problem here except possibly with valgrind...
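
For illustration, the first point (one pointer to a shared-state object instead of several shared pointers) could look something like the sketch below. This is not the ThreadExecutor from the branch: SketchExecutor, SharedState and spawn() are made-up names, and the detached worker is only an assumption about how such an executor might be arranged.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch, not the ThreadExecutor from this branch.
class SketchExecutor
{
public:
    SketchExecutor() : state{std::make_shared<SharedState>()}
    {
        // The worker holds its own reference to the single state object, so
        // the state outlives the executor for as long as the thread needs it.
        std::thread{[s = state] { run(*s); }}.detach();
    }

    ~SketchExecutor()
    {
        {
            std::lock_guard<std::mutex> lock{state->mutex};
            state->shutdown = true;
        }
        state->work_changed.notify_all();
    }

    void spawn(std::function<void()> work)
    {
        {
            std::lock_guard<std::mutex> lock{state->mutex};
            state->queue.push_back(std::move(work));
        }
        state->work_changed.notify_all();
    }

private:
    struct SharedState
    {
        std::mutex mutex;
        std::condition_variable work_changed;
        std::deque<std::function<void()>> queue;
        bool shutdown{false};
    };

    static void run(SharedState& s)
    {
        std::unique_lock<std::mutex> lock{s.mutex};
        for (;;)
        {
            s.work_changed.wait(lock, [&] { return s.shutdown || !s.queue.empty(); });
            if (s.queue.empty()) return;   // shutdown requested and queue drained
            auto work = std::move(s.queue.front());
            s.queue.pop_front();
            lock.unlock();
            work();                        // run the task without holding the lock
            lock.lock();
        }
    }

    std::shared_ptr<SharedState> const state;
};
```

In this arrangement the worker drains any queued work after the executor is destroyed and then exits on its own, which is also where the second point bites: nothing here waits for that thread before the process ends.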

Chris Halse Rogers (raof) wrote : | # |
It shouldn't matter if the threads don't end before the process does; valgrind doesn't count reachable objects as leaked, and if the thread hasn't finished then the objects are still reachable.
This should now *actually* work.
I'm somewhat surprised we haven't seen problems caused by the boost::...::socket outliving the io_service in the past. As far as I can tell, there's no mechanism to enforce that - we share SocketMessengers around pretty widely...

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4227
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
It still feels untidy to exit while a detached thread might own resources, but that's probably less evil than the existing code.

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Autolanding.
More details in the following jenkins job:
https:/
Executed test runs:
FAILURE: https:/
None: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Hmm, this does look suspicious:
10:37:22 9: ==28377== 792 (24 direct, 768 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 41
10:37:22 9: ==28377== at 0x4C2E19F: operator new(unsigned long) (in /usr/lib/
10:37:22 9: ==28377== by 0x51B754A: _S_make_
10:37:22 9: ==28377== by 0x51B754A: thread<(lambda at /<<BUILDDIR>

Chris Halse Rogers (raof) wrote : | # |
Damnit, valgrind!

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4228
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/

Chris Halse Rogers (raof) wrote : | # |
Oh, dear.
This is now running into poor interactions with fork(); specifically:
* After a fork() in a multithreaded program, the child can safely
call only async-signal-safe functions (see signal-safety(7)) until
such time as it calls execve(2).
When running “make test”, we've got a single process executing all the tests; since the ThreadExecutor is now static, we now have a thread waiting around before we get to fork().
This is why ptest doesn't fail; each test gets its own process...

Alan Griffiths (alan-griffiths) wrote : | # |
> Oh, dear.
>
> This is now running into poor interactions with fork(); specifically:
> * After a fork() in a multithreaded program, the child can safely
> call only async-signal-safe functions (see signal-safety(7)) until
> such time as it calls execve(2).
>
> When running “make test”, we've got a single process executing all the tests;
> since the ThreadExecutor is now static, we now have a thread waiting around
> before we get to fork().
Ack. Objects with static duration can be problematic. :(
> This is why ptest doesn't fail; each test gets its own process...
I'm not convinced by that final reasoning: with ptest each test fixture gets its own process, but that still runs multiple tests (each of which may fork).

Chris Halse Rogers (raof) wrote : | # |
On 2 August 2017 6:03:59 pm AEST, Alan Griffiths <email address hidden> wrote:
>> This is why ptest doesn't fail; each test gets its own process...
>
>I'm not convinced by that final reasoning: with ptest each test fixture
>gets its own process, but that still runs multiple tests (each of which
>may fork).
More correctly: our tests that fork do so *first*, and do all of their multi-threading in a child.
Then along comes this MP, and causes a thread to hang around - waiting on a condition variable - from a previous test. Now our top-level fork is multi-threaded.
A hack - delete and recreate the executor across fork boundaries using pthread_atfork() - resolves this. It should also work to quiesce the executor pre-fork and then resume it at the other end.
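
As a rough sketch of the pthread_atfork() idea: tear the worker thread down in the prepare handler so the process is single-threaded at the moment of fork(), then recreate it on both sides. Worker, install(), shutdown() and child_sees_new_worker() are hypothetical names for illustration, not code from the branch, and the sketch assumes no other threads exist when fork() is called.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

namespace sketch
{
// Hypothetical stand-in for the executor: one thread idling on a condition
// variable until it is told to stop.
class Worker
{
public:
    Worker() : thread{[this] { wait_for_stop(); }} {}
    ~Worker()
    {
        {
            std::lock_guard<std::mutex> lock{mutex};
            stopping = true;
        }
        cv.notify_all();
        thread.join();
    }

private:
    void wait_for_stop()
    {
        std::unique_lock<std::mutex> lock{mutex};
        cv.wait(lock, [this] { return stopping; });
    }

    std::mutex mutex;
    std::condition_variable cv;
    bool stopping{false};
    std::thread thread;   // declared last so the other members exist before it starts
};

std::unique_ptr<Worker> worker;
int recreations = 0;

void before_fork() { worker.reset(); }   // joins the thread
void after_fork()
{
    worker = std::make_unique<Worker>();
    ++recreations;
}

void install()
{
    worker = std::make_unique<Worker>();
    // prepare runs in the parent before fork(); the other two handlers run
    // after fork(), in the parent and in the child respectively.
    int const rc = pthread_atfork(&before_fork, &after_fork, &after_fork);
    assert(rc == 0);
    (void)rc;
}

void shutdown() { worker.reset(); }

// Forks once and reports whether the child found a freshly created worker.
bool child_sees_new_worker()
{
    pid_t const child = fork();
    if (child == 0)
    {
        bool const ok = worker != nullptr;
        worker.reset();
        _exit(ok ? 0 : 1);
    }
    if (child < 0) return false;
    int status = 0;
    if (waitpid(child, &status, 0) != child) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
}
```

Handlers registered with pthread_atfork() cannot be removed again, so in this sketch every later fork() recreates the worker even after shutdown() has been called.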

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4229
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4230
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Just artful/
That's also happening in two other branches. (I've reproduced locally yet.)

Alan Griffiths (alan-griffiths) wrote : | # |
> Just artful/
>
> That's also happening in two other branches. (I've reproduced locally yet.)
I've *not* reproduced locally yet.

Chris Halse Rogers (raof) wrote : | # |
Yeah. *My* artful builds work fine!

Alan Griffiths (alan-griffiths) wrote : | # |
Hmm, the errors seem to be coming from DefaultPersiste

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4231
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4232
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4233
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4234
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4231
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
+mir::Executor& system_executor()
+{
+ static std::once_flag setup;
+ static ThreadExecutor executor;
Hmm, as well as the fight you started with fork(), how well does this scale with multiple Mir server instances in one process? (I can see it doesn't break the test suite, but hopefully that isn't "just luck".)

Alan Griffiths (alan-griffiths) wrote : | # |
make_socket_
"cotained"?
~~~~
What is the intention of "system_

Chris Halse Rogers (raof) wrote : | # |
system_executor() is named as such in honour of the C++ TS. I guess it shouldn't really be named that :)
This shouldn't be a problem for multiple servers in a single process - they'll all use the same IPC thread for buffer-return messages, but each buffer-return task should be extremely quick - almost just a syscall - and there shouldn't be all that many of them - they'll generally max out at 60/s/client.
If you run *lots* of really busy servers in the same process that might start getting awkward.
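
To put rough numbers on that, here is a back-of-the-envelope sketch. The only figure taken from the comment above is the 60/s/client ceiling; the function names and any server or client counts plugged in are assumptions for illustration.

```cpp
#include <cassert>

// Ceiling quoted above: about 60 buffer returns per second per client.
constexpr int max_returns_per_client_per_second = 60;

// Worst-case number of buffer-return tasks the shared IPC thread sees per second.
constexpr long tasks_per_second(int servers, int clients_per_server)
{
    return static_cast<long>(servers) * clients_per_server * max_returns_per_client_per_second;
}

// Average time the single thread can spend on each task before it falls
// behind. Only meaningful when there is at least one server and one client.
constexpr double budget_microseconds(int servers, int clients_per_server)
{
    return 1'000'000.0 / tasks_per_second(servers, clients_per_server);
}
```

With, say, 4 servers of 25 busy clients each, that is 6000 tasks per second, leaving roughly 167 microseconds per task on the one thread.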

Alan Griffiths (alan-griffiths) wrote : | # |
Nit:
+ std::shared_
+ std::shared_
Did you have CLion set to the wrong "&" placement style?

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4233
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
I think this is OK but in the "good old days" I would have wanted another set of eyes. (I did ask Alberto, but he's not found time - yet.)

Alan Griffiths (alan-griffiths) wrote : | # |
Running miral-desktop (trunk) on this leads to orphaned titlebars and a hung server.
I need to run some more tests to isolate the problem.

Alan Griffiths (alan-griffiths) wrote : | # |
$ sudo mir_demo_server --vt 4 --launch mir_demo_
Wait 30 seconds(ish)...
the server hangs

Alan Griffiths (alan-griffiths) wrote : | # |
> $ sudo mir_demo_server --vt 4 --launch mir_demo_
>
> Wait 30 seconds(ish)...
>
> the server hangs
Actually, the server isn't entirely hung - but killing the child process doesn't remove it from the scene.

Alan Griffiths (alan-griffiths) wrote : | # |
> > $ sudo mir_demo_server --vt 4 --launch mir_demo_
> >
> > Wait 30 seconds(ish)...
> >
> > the server hangs
>
> Actually, the server isn't entirely hung - but killing the child process
> doesn't remove it from the scene.
Specifically, the server has exhausted file handles: /proc/.../fd/ has lots like this:
lrwx------ 1 alan alan 64 Aug 15 12:03 999 -> /dev/shm/#128090392 (deleted)
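
The same check can be scripted. A minimal, Linux-only sketch using C++17 std::filesystem follows; count_open_fds() and count_fds_pointing_at() are made-up helper names, not part of Mir.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <system_error>

// Counts the entries under /proc/<pid>/fd. While iterating, the directory
// handle itself occupies one descriptor, so for the current process the
// result includes it.
std::size_t count_open_fds(std::string const& pid = "self")
{
    std::size_t count = 0;
    for (auto const& entry : std::filesystem::directory_iterator{"/proc/" + pid + "/fd"})
    {
        (void)entry;
        ++count;
    }
    return count;
}

// Counts descriptors whose link target starts with `prefix`,
// for example "/dev/shm/" as in the listing above.
std::size_t count_fds_pointing_at(std::string const& prefix, std::string const& pid = "self")
{
    std::size_t count = 0;
    for (auto const& entry : std::filesystem::directory_iterator{"/proc/" + pid + "/fd"})
    {
        std::error_code ec;
        auto const target = std::filesystem::read_symlink(entry.path(), ec);
        // A descriptor may be closed between listing and reading; skip it.
        if (!ec && target.string().rfind(prefix, 0) == 0) ++count;
    }
    return count;
}
```

Sampling count_fds_pointing_at("/dev/shm/", pid) for the server's pid every few seconds would show whether the number keeps climbing while the client runs.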

Alan Griffiths (alan-griffiths) wrote : | # |
OK, I've made some progress:
multi_stream is hitting the problem because it repeatedly creates new BufferStreams and destroys the old ones (and RenderSurfaces).
When BufferStreams are created and destroyed like this, the buffer_cache in SessionMediator grows intermittently but inexorably.
Up until the point when file handles are exhausted, closing the connection releases the cache and the buffers. (Once file handles are exhausted the server struggles to release the connection.)
I've not yet got my head around where the buffers created for mir_buffer_

Chris Halse Rogers (raof) wrote : | # |
Thanks for that!
That made it obvious that we were never actually freeing the buffers allocated on a BufferStream when it is destroyed.
I thought that (a) the client side did that, and (b) the server side *also* did that.
But it turns out that stream-
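
The general shape of that kind of fix, dropping a stream's buffers from the cache when the stream is destroyed, can be sketched as below. This is a toy model under assumed names (BufferCache, StreamId, BufferId, stream_destroyed()); it is not the SessionMediator code or the actual change made in this branch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <unordered_set>

namespace sketch
{
using StreamId = int;
using BufferId = int;

struct Buffer {};   // stands in for a real buffer, which would hold a file descriptor

class BufferCache
{
public:
    void add(StreamId stream, BufferId id)
    {
        buffers[id] = std::make_shared<Buffer>();
        owned_by[stream].insert(id);
    }

    // Without this step the entries for a destroyed stream stay in `buffers`
    // until the whole cache goes away.
    void stream_destroyed(StreamId stream)
    {
        auto const owned = owned_by.find(stream);
        if (owned == owned_by.end()) return;
        for (auto const id : owned->second) buffers.erase(id);
        owned_by.erase(owned);
    }

    std::size_t size() const { return buffers.size(); }

private:
    std::unordered_map<BufferId, std::shared_ptr<Buffer>> buffers;
    std::unordered_map<StreamId, std::unordered_set<BufferId>> owned_by;
};
}
```

A client that creates and destroys streams in a loop, as multi_stream does, then leaves the cache size flat instead of growing with every cycle.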

Mir CI Bot (mir-ci-bot) wrote : | # |
PASSED: Continuous integration, rev:4236
https:/
Executed test runs:
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
Click here to trigger a rebuild:
https:/

Alan Griffiths (alan-griffiths) wrote : | # |
Looking good!

Mir CI Bot (mir-ci-bot) wrote : | # |
FAILED: Continuous integration, rev:4212
https:/
Executed test runs:
FAILURE: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
SUCCESS: https:/
deb: https:/
FAILURE: https:/
Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/3474/rebuild