Merge lp://qastaging/~dpm/launchpad/translations-exporter into lp://qastaging/launchpad

Proposed by David Planella
Status: Work in progress
Proposed branch: lp://qastaging/~dpm/launchpad/translations-exporter
Merge into: lp://qastaging/launchpad
Diff against target: 335 lines (+315/-0)
3 files modified
cronscripts/translations-export-stats.py (+25/-0)
lib/lp/services/config/schema-lazr.conf (+6/-0)
lib/lp/translations/scripts/export_stats.py (+284/-0)
To merge this branch: bzr merge lp://qastaging/~dpm/launchpad/translations-exporter
Reviewer                  Review Type  Date Requested  Status
Abel Deuring (community)  code                         Approve
Review via email: mp+124373@code.qastaging.launchpad.net

Description of the change

This is a result of the request on RT #55759 to move the Ubuntu translations exporter script [1] into the Launchpad tree.

As it's my first ever Launchpad merge proposal, I'm submitting this as work in progress to get some initial feedback on whether my overall approach and the code look good. I'll start working on the tests once that's been reviewed and I'm certain I'm doing the right thing :).

In short, this code is intended to be run daily to provide a tarball with an export of Ubuntu translations stats. The data is fetched by querying the Launchpad database. It will then be used to produce the translations coverage report at release time [2], to provide a list of priority templates that helps the community focus on the most important translations to complete, and to produce graphs of their progress.

There are some TODO comments in the code for a couple of areas I was not sure how to handle. The most important one is that the tarballs generated with the database dump are uploaded to Librarian, but as they are not listed anywhere in Launchpad, I have no way of knowing their URL in advance when I want to fetch them. I thought perhaps a URL alias such as the ones we use for https://translations.launchpad.net/ubuntu/hardy/+latest-full-language-pack could be useful (e.g. .../+latest-stats-export). I could not really figure out how to do it. Any pointers? Or other approaches?

Thanks.

[1] https://launchpad.net/lp-get-ul10nstats/
[2] http://people.canonical.com/~dpm/stats/ubuntu-12.04-translation-stats.html

Abel Deuring (adeuring) wrote :

Hi David,

firstly, a few general remarks:

- The big obstacle: "thou shalt not increase the LOC count for (the branch
  you are landing code in) unless" (quoted from
  https://dev.launchpad.net/PolicyAndProcess/MaintenanceCosts?highlight=%28loc%29)

  See the wiki page for more details -- you can either try to convince
  somebody in charge that this is a good change, or you can "compress" the
  existing LP code base first.

  Personally, I think this script would be worth adding, but I am not the
  project lead...

> This is a result of the request on RT #55759 to move the Ubuntu translations
> exporter script [1] into the Launchpad tree.
>
> As it's my first ever Launchpad merge proposal, I'm submitting this as work
> in progress to get some initial feedback if overall what I'm trying to do
> and the code looks good. I'll start working on the tests once that's been
> reviewed and I'm certain I'm doing the right thing :).
>
> In short, this code is thought to be run daily to provide a tarball with an
> export of Ubuntu translations stats. The data is fetched by querying the
> Launchpad database. The data will then be used to produce the translations
> coverage report at release time [2] and also to provide a list of
> priority templates to help the community focus on the most important
> translations to complete, as well as producing graphs of their progress.
>
> There are some TODO comments in the code for a couple of areas I was not
> sure how to go about. The most important one is the fact that the tarballs
> that are generated with the database dump are uploaded to Librarian, but as
> they are not listed anywhere in Launchpad, I have no way of knowing their
> URL in advance when I'll want to fetch them. I thought perhaps a URL alias
> such as the ones we use for
> https://translations.launchpad.net/ubuntu/hardy/+latest-full-language-pack
> could be useful (e.g. .../+latest-stats-export). I could not really figure
> out how to do it. Any pointers? Or other approaches?

Such a URL would mean that the Librarian data would have to be fed through
the app server in order to reach the client.

But the core problem is that a Librarian file that is not referenced from
anywhere will be deleted by a cron job. So we need a column in some LP
DB table that points to the LibraryFileAlias. I think a new column
DistroSeries.translation_statistics would do the job.

Adding this column requires a separate branch and merge proposal,
see https://dev.launchpad.net/PolicyAndProcess/DatabaseSchemaChangesProcess

This may look a bit scary, but just adding a column is not difficult.
I suspect, though, that the column will need an index so that the garbo
job that deletes unreferenced LFA records can run efficiently. Please
ask stub if my suspicion is right. Anyway, the index should be added
in a separate branch.

database/schema/patch-2209-28-1.sql and database/schema/patch-2209-28-2.sql
are an example of such a DB schema change.

And once the DB column exists, you can add a new property to the Python
class DistroSeries and export it to the API.
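
To make the point about unreferenced files concrete, here is a minimal
sketch in Python using an in-memory SQLite database. Everything in it is
a simplified assumption for illustration: the two toy tables, the index
name and the find_unreferenced_aliases helper are hypothetical, and this
is neither the real Launchpad schema nor the actual garbo job. It only
shows that a file alias survives such a check when some row points to it.

```python
import sqlite3


def make_db():
    """Build a toy schema; this is NOT the real Launchpad schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE libraryfilealias (
            id INTEGER PRIMARY KEY,
            filename TEXT NOT NULL
        );
        CREATE TABLE distroseries (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            -- Hypothetical column, as proposed in this review.
            translation_statistics INTEGER
                REFERENCES libraryfilealias(id)
        );
        -- An index lets the lookup below avoid a full table scan.
        CREATE INDEX distroseries__translation_statistics__idx
            ON distroseries (translation_statistics);
        """
    )
    return conn


def find_unreferenced_aliases(conn):
    """Return ids of file aliases that no distroseries row points to."""
    rows = conn.execute(
        """
        SELECT lfa.id
        FROM libraryfilealias AS lfa
        WHERE NOT EXISTS (
            SELECT 1 FROM distroseries AS ds
            WHERE ds.translation_statistics = lfa.id
        )
        ORDER BY lfa.id
        """
    )
    return [row[0] for row in rows]


conn = make_db()
conn.execute("INSERT INTO libraryfilealias VALUES (1, 'stats-old.tar.gz')")
conn.execute("INSERT INTO libraryfilealias VALUES (2, 'stats-new.tar.gz')")
# Only alias 2 is referenced, so alias 1 is a deletion candidate.
conn.execute("INSERT INTO distroseries VALUES (1, 'quantal', 2)")
unreferenced = find_unreferenced_aliases(conn)
print(unreferenced)
```

In this toy model the older tarball is reported as unreferenced as soon
as the series row points to the newer one, which is the behaviour a
reference column is meant to give.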


David Planella (dpm) wrote :

On 14/09/12 13:38, Abel Deuring wrote:
> Hi David,
>
> firstly, a few general remarks:
>
> - The big obstacle: "thou shalt not increase the LOC count for (the branch
> you are landing code in) unless" (quoted from
> https://dev.launchpad.net/PolicyAndProcess/MaintenanceCosts?highlight=%28loc%29)
>
> See the wiki page for more details -- you can either try to convince
> somebody in charge that this is a good change, or you can "compress" the
> existing LP code base first.
>
> Personally, I think this script would be worth adding, but I am not the
> project lead...
>

Thanks for the heads up. As per the conversation just now on
#launchpad-dev, a waiver has been granted to move this code into LP:

<lifeless> I'll grant a waiver
 this code already exists in the wrong place
 its not adding debt to move it into LP
<lifeless> its reducing debt by getting it into the right place.

http://irclogs.ubuntu.com/2012/09/14/%23launchpad-dev.html#t11:54

I'll apply fixes for all the other points mentioned and will talk to
stub to double-check the database changes you're describing, as
suggested.

Thanks!

Cheers,
David.


Robert Collins (lifeless) wrote :

On Fri, Sep 14, 2012 at 11:38 PM, Abel Deuring
<email address hidden> wrote:
> Hi David,
>
> firstly, a few general remarks:
>
> - The big obstacle: "thou shalt not increase the LOC count for (the branch
> you are landing code in) unless" (quoted from
> https://dev.launchpad.net/PolicyAndProcess/MaintenanceCosts?highlight=%28loc%29)
>
> See the wiki page for more details -- you can either try to convince
> somebody in charge that this is a good change, or you can "compress" the
> existing LP code base first.
>
> Personally, I think this script would be worth adding, but I am not the
> project lead...

I'm granting a waiver; the code already exists, so moving into tree
reduces maintenance overhead.

Richard Harding (rharding) wrote :

I'm going to mark this as in progress, since it looks like it might need to wait for a db patch to get through the system before moving forward. Once the db side has been worked out, this can go forward per Abel.

Abel Deuring (adeuring) wrote :

On 14.09.2012 13:38, Abel Deuring wrote:

> But the core problem is that a Librarian file that is not referenced from
> anywhere will be deleted by a cron job. So we need a column in some LP
> DB table that points to the LibraryFileAlias. I think a new column
> DistroSeries.translation_statistics would do the job.

Another suggestion: It might make sense not only to add a column like
DistroSeries.translation_statistics, but also another column
DistroSeries.translation_statistics_updated, a datetime column
containing, well, the date and time of the most recent update of the
other column.
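
As a rough illustration of what the second column would allow, the
sketch below checks whether the latest export is recent enough to use.
The SeriesStats class, the is_export_fresh helper and the one-day
default (taken from the "run daily" plan in the description) are
assumptions made up for this example, not existing Launchpad code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class SeriesStats:
    """Toy stand-in for the two proposed DistroSeries columns."""

    translation_statistics: Optional[int]  # hypothetical file alias id
    translation_statistics_updated: Optional[datetime]


def is_export_fresh(stats, now, max_age=timedelta(days=1)):
    """True if an export exists and was updated within max_age of now."""
    if stats.translation_statistics is None:
        return False
    if stats.translation_statistics_updated is None:
        return False
    return now - stats.translation_statistics_updated <= max_age


utc = timezone.utc
now = datetime(2012, 9, 15, 12, 0, tzinfo=utc)
fresh = SeriesStats(42, datetime(2012, 9, 15, 3, 0, tzinfo=utc))
stale = SeriesStats(42, datetime(2012, 9, 13, 3, 0, tzinfo=utc))
missing = SeriesStats(None, None)
print(is_export_fresh(fresh, now), is_export_fresh(stale, now))
```

A consumer of the export could use such a check to notice that the
daily job has stopped running, which the file reference alone would
not reveal.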

Abel Deuring (adeuring) wrote :

As stub wrote on IRC:

(10:08:00) stub: Assuming it hasn't atrophied, this can just land and run with the --output= option right now.
(10:08:36) stub: And I think it is a minor change to get the librarian url reported rather than the file alias id

so, r=me

review: Approve (code)
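
For readers unfamiliar with the --output= mode stub mentions, the
following is a minimal sketch of a script that writes a stats tarball
to a given path instead of uploading it. It uses only the Python
standard library; parse_args, write_stats_tarball and the file name
inside the tarball are hypothetical and do not reflect how the branch
or LaunchpadCronScript actually implements the option.

```python
import argparse
import io
import os
import tarfile
import tempfile


def parse_args(argv):
    """Parse a single --output option (hypothetical command line)."""
    parser = argparse.ArgumentParser(description="Toy stats exporter")
    parser.add_argument("--output", required=True,
                        help="path of the tarball to write")
    return parser.parse_args(argv)


def write_stats_tarball(output_path, files):
    """Write a gzipped tarball from a mapping of member name to bytes."""
    with tarfile.open(output_path, "w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


# Demonstration: write into a temporary directory and list the members.
with tempfile.TemporaryDirectory() as tmp:
    args = parse_args(["--output", os.path.join(tmp, "stats.tar.gz")])
    write_stats_tarball(
        args.output, {"ubuntu/templates.csv": b"template,total\n"})
    with tarfile.open(args.output, "r:gz") as tar:
        names = tar.getnames()
print(names)
```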

Unmerged revisions

15945. By David Planella

Converted Ubuntu translations export script to a LaunchpadCronScript

15944. By David Planella

Added exporter script, without modifications

15943. By David Planella

Switched to use Launchpad config

15942. By David Planella

Added wrapper for translations export script
