Merge lp://qastaging/~coreygoldberg/selenium-simple-test/concurrent-tests1 into lp://qastaging/selenium-simple-test

Proposed by Corey Goldberg
Status: Merged
Merged at revision: 425
Proposed branch: lp://qastaging/~coreygoldberg/selenium-simple-test/concurrent-tests1
Merge into: lp://qastaging/selenium-simple-test
Diff against target: 787 lines (+577/-33)
11 files modified
docs/changelog.rst (+1/-0)
docs/index.rst (+15/-17)
requirements.txt (+1/-0)
src/sst/cases.py (+1/-1)
src/sst/command.py (+11/-10)
src/sst/concurrency.py (+114/-0)
src/sst/runtests.py (+15/-4)
src/sst/scripts/run.py (+1/-0)
src/sst/scripts/test.py (+1/-0)
src/sst/tests/test_concurrency.py (+416/-0)
tox-acceptance.ini (+1/-1)
To merge this branch: bzr merge lp://qastaging/~coreygoldberg/selenium-simple-test/concurrent-tests1
Reviewer Review Type Date Requested Status
Corey Goldberg (community) Needs Resubmitting
Vincent Ladeuil (community) Needs Fixing
Review via email: mp+166615@code.qastaging.launchpad.net

Description of the change

Concurrency
----

Wired up concurrency using concepts and code from bzrlib.

Added a new module, sst.concurrency,
with tests in src/sst/tests/test_concurrency.py.

The `--concurrency=N` command line arg will fork N workers for running test cases. Results are streamed and aggregated via subunit.

--concurrency is for Unix only (relies on os.fork())

The new forking mechanism is *not* used when N=1 (the default); it only kicks in when the concurrency level is set to an integer greater than 1 (the number of processes to fork).

notes:
You can parallelize a test run across a configurable number of worker
processes. While this can speed up CPU-bound test runs, it is mainly
useful for IO-bound tests that spend most of their time waiting for
data to arrive from someplace else and can benefit from parallelization.

----

includes new dependency: 'python-subunit'

----

updated docs with concurrency parameter.

Revision history for this message
Vincent Ladeuil (vila) wrote :

30	+ parser.add_option('--concurrency', dest='use_concurrency',
31	+                   default=False, action='store_true',
32	+                   help='concurrency enabled (proc per cpu):')

I think we want an integer rather than a boolean here; there are far too
many cases where using as many processes as processors doesn't match the
needs (sometimes you want to use more processes than processors, sometimes
you want to use fewer).
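The change the reviewer is asking for amounts to something like the following (a sketch, not the actual SST option definition; SST used optparse at the time, and the `dest` name and help text here are assumptions):

```python
import optparse

parser = optparse.OptionParser()
# An integer level instead of a boolean flag: the user chooses how many
# workers to fork, rather than being locked to one process per CPU.
parser.add_option('--concurrency', dest='concurrency', type='int',
                  default=1, metavar='N',
                  help='number of worker processes to fork (default: 1)')

opts, _ = parser.parse_args(['--concurrency', '4'])
```

With `default=1` the single-process code path stays the default, matching the branch's behavior of only forking when N > 1.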

41 +#
42 +# Copyright (c) 2013 Canonical Ltd.
43 +#
44 +# This file is part of: SST (selenium-simple-test)
45 +# https://launchpad.net/selenium-simple-test

That's not the usual header, is it? Any reason to use a different one?

126 +def iter_suite_tests(suite):

This flattens the test suite for no good reason; sst.filters.filter_suite
should be re-usable here.

154	+class SubUnitSSTProtocolClient(TestProtocolClient):
155	+    def addSuccess(self, test, details=None):
156	+        # The subunit client always includes the details in the subunit
157	+        # stream, but we don't want to include it in ours.

Why don't we want to include it ? I can think of at least one case where it
can help: comparing a passing test with a failing one. As such, as a dev, I
would be pretty upset to have my carefully collected data thrown away ;)

170 +import Queue
173 +import threading

Doesn't pep8 warn you about these unused imports?

260 +import sst.concurrency
261 +import sst.result
262 +import sst.tests

from sst import (... ?

295	+        self.addCleanup(self.restore_stdout)
296	+
297	+    def restore_stdout(self):
298	+        sys.stdout = sys.__stdout__

It's weird to see a restore function without anything changing sys.stdout...

384 +class ConcurrencyTestCase(TestCase):

There is a lot of duplication there that makes it hard to review the
differences between the tests.

That seems to be a good start to test the happy paths though; I'm not sure
it's worth checking all the kinds of test outputs (unless they were test
parameters, which would make the parametrized tests clearer).

486 + def test_concurrent_one_case_per_threaded_worker(self):

That one is definitely too big; I can't see what it is about and even wonder
what it really tests. You may want to split it into different tests?

Is it another attempt to use threads to run the tests?

So overall, that seems like a good first step to implement concurrency via
fork, but we really want tests for the error paths, including (not
exhaustive): one test failing, one test erroring, one test hanging, one
process dying, and also interrupting the whole test run, ensuring we get
some meaningful result in that case (these are the most common use cases I
can think of).

Revision history for this message
Corey Goldberg (coreygoldberg) wrote :

> I think we want an integer rather than a boolean here

fixed. --concurrency=4

> That's not the usual header is it ?

it is (though a truncated version without the license text). We use the Apache license, but bzr was GPLv2, so I wasn't sure what to license this file as. I assume Apache, because the rest of the project is? Kind of an odd situation, but doable because all the code is (c) Canonical.

> unused imports

fixed.

> It's weird to see a restore function

removed, was included by accident.

> That one is definitely too big,

Some of these tests will be removed once this branch is ready to land. The entire file 'test_concurrency_ideas.py' is stuff I was working on that I moved out of the real 'test_concurrent.py' tests.

----

still working on the rest :)

Revision history for this message
Corey Goldberg (coreygoldberg) wrote :

rather:

  --concurrency=N

Revision history for this message
Corey Goldberg (coreygoldberg) wrote :

Ready for re-review, including basic failure tests.

Revision history for this message
Vincent Ladeuil (vila) wrote :

<vila> cgoldberg: 494 + ### XXX why isn't my skipped test showing up :/
<cgoldberg> vila, yea.. crap.. that needs to be addressed.. i'll look again. i think it's something with multitestresult
<cgoldberg> meant to ask about that
<vila> cgoldberg: also, using discover and creating all tests to select only one sounds like a big hammer, why not create only the test you need ?
<vila> cgoldberg: and as mentioned previously, the tests would be simpler and give us a better access to internal if you write them as direct consumers of a single test run emitting the subunit stream
<vila> cgoldberg: you may even directly understand why skip is having issues or at least better see which part of the code is responsible which is the point of having focused tests
<vila> cgoldberg: that being said, those tests are far clearer
<vila> cgoldberg: and src/sst/tests/test_result.py has a get_case function you can use as an example to create tests (or suites) in a more direct way
<cgoldberg> vila, skips work with ConcurrencyTestSuite alone (directly using a TextTestResult or XML Result). but not when I add in a MultiTestResult
<vila> cgoldberg: good, so you have the needed bits to write a test reproducing the issue, TDD ftw :)
<cgoldberg> vila, yea
<vila> cgoldberg: finally, I think we also want tests for one test hanging, one
<vila> process dying and also interrupting the whole test run and ensure we get
<vila> some meaningful result in that case, as mentioned in a previous review
<cgoldberg> vila, interrupted how? just kill the pid?
<vila> cgoldberg: yup, make the test sleep long enough so you're sure to kill it

review: Needs Fixing
462. By Corey Goldberg

refactored unit test cases

Revision history for this message
Corey Goldberg (coreygoldberg) wrote :

removed hammer.

'skips' do not get populated in the 'MultiTestResult' object. This is not because of this branch; this branch just includes a test that exposes it. I commented out the test and left a note with the testtools bug number (see source).

Added tests for killing the test case pid.

review: Needs Resubmitting
Revision history for this message
Vincent Ladeuil (vila) wrote :
Download full text (7.0 KiB)

Lots of good things in revno 462:

- http://pad.lv/1189593 has been properly isolated and you even provided a
  failing test upstream.

- the tests are more focused thanks to defining only the inner tests they
  care about instead of selecting from a common (and of course more verbose)
  definition of all possible cases.

Thanks for that work, definitely on the right track :)

Let's finish!

 139 # self.assertEqual(len(result.skipped), 1)

We know this will fail until the testtools bug is fixed, but we can (and
should) do better with:

  self.expectFailure('bug NNN: testtools should report skipped tests',
                     self.assertEqual, len(result.skipped), 1)

This way, we ensure the test passes as long as the bug is not fixed, but also
that it will fail when the bug is fixed, at which point we will be able to
come back to:

  self.assertEqual(len(result.skipped), 1)

and be done with the whole issue.
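The same pass-while-broken, fail-when-fixed behavior exists in stdlib unittest as the expectedFailure decorator; here is a minimal sketch (the empty list standing in for the buggy MultiTestResult's skip count is a placeholder, not SST's real result object):

```python
import unittest


class SkipCountingKnownBug(unittest.TestCase):

    @unittest.expectedFailure
    def test_skips_are_counted(self):
        # Placeholder for the buggy behavior: the result's skipped list
        # stays empty, so this assertion fails for now.  When the
        # underlying bug is fixed, the assertion succeeds and unittest
        # flags an unexpected success, telling us to drop the decorator.
        buggy_skipped = []
        self.assertEqual(len(buggy_skipped), 1)


result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(SkipCountingKnownBug).run(result)
# The run is still green: the failure was expected.
```

testtools' expectFailure method shown above does the same thing per-assertion rather than per-test, and additionally records the reason string.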

This whole work is really valuable and allows you to share the knowledge you
acquired in a straightforward way: no one can argue with that, the bug is in
testtools and we'll be warned automatically when it's fixed. That's TDD at
its best \o/

Regarding the focus of these tests, I thought we agreed on IRC when I said:

Jun 10 16:07:21 <vila> cgoldberg: also, using discover and creating all tests to select only one sounds like a big hammer, why not create only the test you need ?

Jun 10 16:08:32 <vila> cgoldberg: and as mentioned previously, the tests would be simpler and give us a better access to internal if you write them as direct consumers of a single test run emitting the subunit stream

So let me re-iterate that they should focus on the least possible amount of
code in the same way you greatly did for the skip issue. There are still two
places where we're not there yet:

- test creation, currently done with loader.TestLoader().discover('t', pattern=
- test run, currently done with self.run_tests_concurrently(suite)

As such, these tests involve too much code we don't really care about.

They act as integration tests where we want unit tests: they exercise code
paths we don't care about, which makes the tests harder to read and work
with.

For test creation we don't need to create files on disk and then involve
discovery; we can just:

=== modified file 'src/sst/tests/test_concurrency.py'
--- src/sst/tests/test_concurrency.py 2013-06-10 21:26:29 +0000
+++ src/sst/tests/test_concurrency.py 2013-06-14 12:16:38 +0000
@@ -55,21 +54,20 @@
         return txt_result

     def test_forked_all_pass(self):
-        tests.write_tree_from_desc('''dir: t
-file: t/__init__.py
-file: t/test_pass.py
-import unittest
-class BothPass(unittest.TestCase):
-    def test_pass_1(self):
-        self.assertTrue(True)
-    def test_pass_2(self):
-        self.assertTrue(True)
-''')
-        suite = loader.TestLoader().discover('t', pattern='test_pass.py')
+        class BothPass(unittest.TestCase):
+
+            def test_pass_1(self):
+                self.assertTrue(True)
+
+            def test_pass_2(self):
+                self.assertTrue(True)
+
+        suite = unittest.TestSuite()
+        suite.addTests([BothPass('test_pass_1'), Bo...

Read more...
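Spelled out as a standalone script, the in-memory pattern the reviewer suggests looks like this (a sketch; the truncated hunk above presumably adds the second BothPass test as well):

```python
import unittest


class BothPass(unittest.TestCase):

    def test_pass_1(self):
        self.assertTrue(True)

    def test_pass_2(self):
        self.assertTrue(True)


# Build the suite directly in memory: no files on disk, no discover().
suite = unittest.TestSuite()
suite.addTests([BothPass('test_pass_1'), BothPass('test_pass_2')])

result = unittest.TestResult()
suite.run(result)
```

This keeps each test focused on the code under test instead of also exercising the loader and filesystem layout.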

Revision history for this message
Vincent Ladeuil (vila) wrote :

So in summary, I think we need three things for this to land:
- get rid of the discover() calls as much as possible,
- expose the subunit stream from a subprocess running a test
- enhance the doc string

463. By Corey Goldberg

remerged trunk

464. By Corey Goldberg

subunit stream unit tests

465. By Corey Goldberg

merged with trunk

466. By Corey Goldberg

concurrency unit test fixes

Revision history for this message
Corey Goldberg (coreygoldberg) wrote :

Added tests for subunit streams.

Added expectedFailure.
