Merge lp://qastaging/~benji/launchpad/bug-622765 into lp://qastaging/launchpad

Proposed by Benji York
Status: Merged
Approved by: Benji York
Approved revision: no longer in the source branch.
Merged at revision: 11477
Proposed branch: lp://qastaging/~benji/launchpad/bug-622765
Merge into: lp://qastaging/launchpad
Diff against target: 250 lines (+73/-24)
4 files modified
lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py (+2/-2)
lib/lp/services/apachelogparser/base.py (+7/-3)
lib/lp/services/apachelogparser/script.py (+9/-1)
lib/lp/services/apachelogparser/tests/test_apachelogparser.py (+55/-18)
To merge this branch: bzr merge lp://qastaging/~benji/launchpad/bug-622765
Reviewer Review Type Date Requested Status
Henning Eggers (community) code Approve
Review via email: mp+34077@code.qastaging.launchpad.net

Description of the change

The config option "logparser_max_parsed_lines" was respected only for individual files, not across multiple files (e.g., if a max of 1000 lines were to be parsed and there were 10 files to parse 10,000 lines would be parsed instead of just 1000).

This branch fixes the problem by keeping track of the number of lines parsed across calls to parse_file.

The related tests can be run with

    bin/test -c test_apachelogparser

No lint was reported by make lint.

To post a comment you must log in.
Revision history for this message
Henning Eggers (henninge) wrote :
Download full text (11.6 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Benji,
thank you for this nice fix. It is basically good to land after you have
considered a few comments of mine.

 review approve code

Cheers,
Henning

Am 30.08.2010 17:03, schrieb Benji York:
>
> === modified file 'lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py'
> --- lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2010-08-20 20:31:18 +0000
> +++ lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2010-08-30 15:03:44 +0000
> @@ -102,7 +102,7 @@
> '69.233.136.42 - - [13/Jun/2008:14:55:22 +0100] "GET '
> '/15018215/ul_logo_64x64.png HTTP/1.1" 200 2261 '
> '"https://launchpad.net/~ubuntulite/+archive" "Mozilla"')
> - downloads, parsed_bytes = parse_file(
> + downloads, parsed_bytes, ignored = parse_file(

I am not sure "ignored" is the best choice here. I wondered at first if it
means "ignored_bytes" or "ignored_lines". Would it hurt to call it what it is
("parsed_lines") and then simply ignore it?

There are multiple uses of "ignored" in here, as you know. ;-)

> fd, start_position=0, logger=self.logger,
> get_download_key=get_library_file_id)
> self.assertEqual(
> @@ -121,7 +121,7 @@
> fd = StringIO(
> '69.233.136.42 - - [13/Jun/2008:14:55:22 +0100] "GET / HTTP/1.1" '
> '200 2261 "https://launchpad.net/~ubuntulite/+archive" "Mozilla"')
> - downloads, parsed_bytes = parse_file(
> + downloads, parsed_bytes, ignored = parse_file(
> fd, start_position=0, logger=self.logger,
> get_download_key=get_library_file_id)
> self.assertEqual(
>
> === modified file 'lib/lp/services/apachelogparser/base.py'
> --- lib/lp/services/apachelogparser/base.py 2010-08-27 18:41:01 +0000
> +++ lib/lp/services/apachelogparser/base.py 2010-08-30 15:03:44 +0000

Very nice. Thank you.

[...]

> === modified file 'lib/lp/services/apachelogparser/script.py'
> --- lib/lp/services/apachelogparser/script.py 2010-08-27 18:41:01 +0000
> +++ lib/lp/services/apachelogparser/script.py 2010-08-30 15:03:44 +0000
> @@ -8,6 +8,7 @@
>
> from zope.component import getUtility
>
> +from canonical.config import config
> from lp.app.errors import NotFoundError
> from lp.services.apachelogparser.base import (
> create_or_update_parsedlog_entry,
> @@ -65,8 +66,15 @@
>
> self.setUpUtilities()
> country_set = getUtility(ICountrySet)
> + parsed_lines = 0
> + max_parsed_lines = getattr(
> + config.launchpad, 'logparser_max_parsed_lines', None)
> for fd, position in files_to_parse:
> - downloads, parsed_bytes = parse_file(
> + # If we've used up our budget of lines to process, stop.
> + if (max_parsed_lines is not None
> + and parsed_lines >= max_parsed_lines):

This line must be indented and the "and" should probably go on the previous
line but multi-line "if" conditions are always bad to read because the
dangling lines align with the body of the statement. Storing the boolean in a
variable fi...

review: Approve (code)
Revision history for this message
Benji York (benji) wrote :
Download full text (4.5 KiB)

On Mon, Aug 30, 2010 at 11:54 AM, Henning Eggers
<email address hidden> wrote:
> Review: Approve code
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Benji,
> thank you for this nice fix. It is basically good to land after you have
> considered a few comments of mine.
>
>  review approve code
>
> Cheers,
> Henning
>
> Am 30.08.2010 17:03, schrieb Benji York:
>>
>> === modified file 'lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py'
>> --- lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2010-08-20 20:31:18 +0000
>> +++ lib/canonical/launchpad/scripts/tests/test_librarian_apache_log_parser.py 2010-08-30 15:03:44 +0000
>> @@ -102,7 +102,7 @@
>>              '69.233.136.42 - - [13/Jun/2008:14:55:22 +0100] "GET '
>>              '/15018215/ul_logo_64x64.png HTTP/1.1" 200 2261 '
>>              '"https://launchpad.net/~ubuntulite/+archive" "Mozilla"')
>> -        downloads, parsed_bytes = parse_file(
>> +        downloads, parsed_bytes, ignored = parse_file(
>
> I am not sure "ignored" is the best choice here. I wondered at first if it
> means "ignored_bytes" or "ignored_lines". Would it hurt to call it what it is
> ("parsed_lines") and then simply ignore it?

Done.

>> === modified file 'lib/lp/services/apachelogparser/script.py'
>> --- lib/lp/services/apachelogparser/script.py 2010-08-27 18:41:01 +0000
>> +++ lib/lp/services/apachelogparser/script.py 2010-08-30 15:03:44 +0000
>> @@ -8,6 +8,7 @@
>>
>>  from zope.component import getUtility
>>
>> +from canonical.config import config
>>  from lp.app.errors import NotFoundError
>>  from lp.services.apachelogparser.base import (
>>      create_or_update_parsedlog_entry,
>> @@ -65,8 +66,15 @@
>>
>>          self.setUpUtilities()
>>          country_set = getUtility(ICountrySet)
>> +        parsed_lines = 0
>> +        max_parsed_lines = getattr(
>> +            config.launchpad, 'logparser_max_parsed_lines', None)
>>          for fd, position in files_to_parse:
>> -            downloads, parsed_bytes = parse_file(
>> +            # If we've used up our budget of lines to process, stop.
>> +            if (max_parsed_lines is not None
>> +            and parsed_lines >= max_parsed_lines):
>
> This line must be indented and the "and" should probably go on the previous
> line but multi-line "if" conditions are always bad to read because the
> dangling lines align with the body of the statement. Storing the boolean in a
> variable first and checking that in the statement often works better.

Done.

>> +        # If we have already parsed some lines, then the number of lines
>> +        # parsed will be passed in (parsed_lines argument) and parse_file will
>> +        # take that number into account when determining if the maximum number
>> +        # of lines to parse has been reached.
>
> Hm, this is a functional test and not documentation. This description should
> probably be found in the docstring of parse_file, should it not? If found
> there, it need not be explained in such detail here.

My intent was to explain the functioning of the test. I'm sure I would
appreciate a nice explanation of what's going on if I was reading the
test....

Read more...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.