Merge lp://qastaging/~vishvananda/nova/no-db-messaging into lp://qastaging/~hudson-openstack/nova/trunk

Proposed by Vish Ishaya
Status: Work in progress
Proposed branch: lp://qastaging/~vishvananda/nova/no-db-messaging
Merge into: lp://qastaging/~hudson-openstack/nova/trunk
Prerequisite: lp://qastaging/~termie/nova/rpc_multicall
Diff against target: 658 lines (+220/-143)
9 files modified
nova/context.py (+3/-3)
nova/db/sqlalchemy/api.py (+1/-1)
nova/db/sqlalchemy/models.py (+18/-5)
nova/scheduler/manager.py (+7/-5)
nova/tests/scheduler/test_scheduler.py (+32/-27)
nova/tests/test_volume.py (+34/-18)
nova/utils.py (+14/-1)
nova/volume/api.py (+58/-23)
nova/volume/manager.py (+53/-60)
To merge this branch: bzr merge lp://qastaging/~vishvananda/nova/no-db-messaging
Reviewer Review Type Date Requested Status
Matt Dietz (community) Needs Information
Rick Harris (community) Needs Fixing
Dan Prince (community) Abstain
Review via email: mp+61687@code.qastaging.launchpad.net

Description of the change

This is an initial proposal for feedback. This branch is an attempt to start on this blueprint:
https://blueprints.launchpad.net/nova/+spec/no-db-messaging
which will allow for the implementation of this blueprint:
https://blueprints.launchpad.net/nova/+spec/separate-code-for-services
And will ultimately make it easy to replace our various services with external projects.

This prototype changes volume_create and volume_delete to pass data through the queue instead of writing information to the database and reading it on the other end. It attempts to make minimal changes. It includes:
 * a small change to model code to allow for conversion into dicts
 * scheduler uses muticall instead of cast
 * volume.api.create_volume and delete volume use multicall to communicate with volume.manager
 * volume.manager returns updates instead of modifying the database directly

Please note that this is an initial proposal. It is based on some changes made by termie to allow the driver to return multiple times for a single call. It works to create volumes, but I haven't modified the tests to pass yet.

To Do:
 * change the driver code so it doesn't store its extra data about volumes in the global db.
 * fix the tests
 * modify attach and detach to use the same methodology

I'm open to feedback about this approach. I tried a few other versions and this seems like the simplest change set to get what we want. If this looks good, I will modify the other volume commands to work the same way and propose a similar set of changes for compute.

To post a comment you must log in.
Revision history for this message
Dan Prince (dan-prince) wrote :

Hey Vish,

I'm not able to boot any instances via the OS API w/ this branch. Looks to be related to the set_admin_password functionality which now polls for the compute host to be assigned:

(nova.api.openstack): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/api.py", line 501, in _set_admin_password
(nova.api.openstack): TRACE: host = self._find_host(context, instance_id)
(nova.api.openstack): TRACE: File "/usr/lib/pymodules/python2.6/nova/compute/api.py", line 497, in _find_host
(nova.api.openstack): TRACE: % instance_id)
(nova.api.openstack): TRACE: Error: Unable to find host for Instance 1
(nova.api.openstack): TRACE:

review: Needs Fixing
Revision history for this message
Vish Ishaya (vishvananda) wrote :

Thanks dan,

Turns out it was an error where if you cast to a generator (multicall uses generators), then it wouldn't actually execute the generator on the other end. I think casting should work now.

Vish

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

The prerequisite lp:~termie/nova/rpc_multicall has not yet been merged into lp:nova.

Revision history for this message
Dan Prince (dan-prince) wrote :

Hey Vish,

So you latest fix resolves the RPC issues. Thanks!

I'm still hitting an issue with things like instance metadata (aka a model attributes that aren't columns).

The following changes to NovaBase.update resolves it for me.

            if key == 'name' and key in columns:
                setattr(self, key, value)
            else:
                setattr(self, key, value)

review: Needs Fixing
Revision history for this message
Dan Prince (dan-prince) wrote :

Oops. This one should work:

      if key == 'name' and key in columns:
            setattr(self, key, value)
       elif key != 'name':
            setattr(self, key, value)

Revision history for this message
Mark Washenberger (markwash) wrote :

I'm having a little trouble grasping the multicall approach as shown here so I might be a bit confused.

It looks like the way this works is, when we schedule a volume to be created, we create an eventlet-based volume listener which will update the database as it hears back from the volume manager.

Maybe I'm mistaken about eventlet, but this implies that if for some reason the process terminates before the multicall has finished its last return, then the database won't be updated. Then, when the process is restarted, there will be no calling context remaining to handle the updates.

Would it be a more robust approach to create a permanently running VolumeListener that handles all volume update events?

Thanks!

1113. By Vish Ishaya

merged rpc_multicall

1114. By Vish Ishaya

keep the database on the receiving end as well

1115. By Vish Ishaya

fix tests

1116. By Vish Ishaya

use strtime for passing datetimes back and forth through the queue

1117. By Vish Ishaya

lost some changes from rpc branch, bring them in manually

1118. By Vish Ishaya

merged trunk and removed conflicts

1119. By Vish Ishaya

make sure to handle VolumeIsBusy

1120. By Vish Ishaya

return not yield in scheduler shortcut

1121. By Vish Ishaya

fix snapshot test

Revision history for this message
Dan Prince (dan-prince) wrote :

Sorry for holding this up. I'm no multicall expert but I should probably remove my 'needs fixing' at least so more people check it out. The latest fix resolves my issue with model attributes that aren't columns.

review: Abstain
Revision history for this message
Rick Harris (rconradharris) wrote :

Ran into a test failure:

======================================================================
FAIL: test_create_and_delete_volume (nova.tests.integrated.test_volumes.VolumesTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rick/openstack/nova/no-db-messaging/nova/tests/integrated/test_volumes.py", line 122, in test_create_and_delete_volume
    self.assertEquals(1, len(create_actions))
AssertionError: 1 != 2
-------------------- >> begin captured logging << --------------------

Other than that, I think this looks good.

review: Needs Fixing
Revision history for this message
Dan Prince (dan-prince) wrote :

Couple of conflicts now w/ a trunk merge too:

Text conflict in nova/volume/api.py
Text conflict in nova/volume/manager.py
2 conflicts encountered.

Revision history for this message
Matt Dietz (cerberus) wrote :

I really like this approach, but I have the same reservations that Mark does.

We could go with the Listener approach, but I will say that I've seen issues with bottlenecking in our current architecture trying to do something very similar.

It seems like this would make updating a running nova installation prohibitively difficult. If there was a way to recall the calling context/msg_id even after a worker bounce, then I'd feel a lot better.
We may need some kind of state dump mechanism for the workers. Perhaps we could pickle some pertinent data upon a worker receiving a HUP/KILL/whatever.

review: Needs Information
1122. By Vish Ishaya

remove merge error calling failing test

1123. By Vish Ishaya

merged trunk

Unmerged revisions

1123. By Vish Ishaya

merged trunk

1122. By Vish Ishaya

remove merge error calling failing test

1121. By Vish Ishaya

fix snapshot test

1120. By Vish Ishaya

return not yield in scheduler shortcut

1119. By Vish Ishaya

make sure to handle VolumeIsBusy

1118. By Vish Ishaya

merged trunk and removed conflicts

1117. By Vish Ishaya

lost some changes from rpc branch, bring them in manually

1116. By Vish Ishaya

use strtime for passing datetimes back and forth through the queue

1115. By Vish Ishaya

fix tests

1114. By Vish Ishaya

keep the database on the receiving end as well

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
The diff is not available at this time. You can reload the page or download it.