Launchpad itself

Database policy reevaluated frequently, slowing everything down during database maintenance

Bug #1885859 reported by William Grant on 2020-07-01

This bug affects 1 person

Affects           Status       Importance  Assigned to   Milestone
Launchpad itself  In Progress  Low         Colin Watson

Bug Description

lp.services.webapp.adapter.StoreSelector invokes lp.services.database.policy.BaseDatabasePolicy.getStore each time it's used (e.g. during an IStore(Product) call). In normal operation this is cheap, since the replication lag check is in LaunchpadDatabasePolicy.install and only run once per request, so getStore already knows if it's looking for a master or slave.

But if the requested database is down (e.g. it's a slave-capable request but the slave is offline), each getStore call will try to connect and only then fall back to the other one, often taking a few milliseconds if it fails at the pgbouncer level. If a request context is available, a failed connection should probably blacklist the flavour for the remainder of the request.
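
A rough sketch of the per-request blacklisting idea, not the actual Launchpad code: RequestScopedPolicy, its connect callable and the flavour strings are all hypothetical, and the built-in ConnectionError stands in for whatever error a failed connection really raises. One policy object is assumed to live for a single request.

```python
class RequestScopedPolicy:
    """Hypothetical policy that remembers failed flavours for one request."""

    def __init__(self, connect, preferred="slave", fallback="master"):
        # connect is an assumed callable: flavour -> store, raising
        # ConnectionError when that database cannot be reached.
        self._connect = connect
        self._order = (preferred, fallback)
        self._failed = set()  # flavours blacklisted for this request

    def getStore(self):
        for flavour in self._order:
            if flavour in self._failed:
                # Already known to be down: skip without reconnecting.
                continue
            try:
                return self._connect(flavour)
            except ConnectionError:
                # Pay the failed-connection cost only once per request.
                self._failed.add(flavour)
        raise ConnectionError("no database flavour is reachable")
```

With something like this, only the first getStore call in a request pays for the failed connection attempt; later calls go straight to the other flavour.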

This could potentially also cause weird behaviour if a DB is flaky, as repeated requests for the same object could return two different instances, one from each store.

There's possibly also another bug here: LaunchpadDatabasePolicy.getReplicationLag will actually check lag against the master if the slave is down, due to the same getStore fallback behaviour.
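
If that is indeed what happens, one possible shape for a fix is to have the lag check use a connection path that never falls back. This is only a sketch: get_replication_lag, connect_strict and read_lag are hypothetical names, not Launchpad APIs, and ConnectionError again stands in for the real connection failure.

```python
def get_replication_lag(connect_strict, read_lag):
    """Return the slave's replication lag, or None if the slave is down.

    connect_strict(flavour) is an assumed connector that raises
    ConnectionError instead of silently falling back to the other flavour.
    read_lag(store) is an assumed helper that queries lag on that store.
    """
    try:
        store = connect_strict("slave")
    except ConnectionError:
        # Report "unknown" rather than measuring lag against the master.
        return None
    return read_lag(store)
```

Returning None here is a design choice for the sketch; callers would need to decide how to treat an unknown lag.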

Colin Watson (cjwatson) on 2020-07-01

Changed in launchpad:
assignee:	nobody → Colin Watson (cjwatson)
status:	Triaged → In Progress
