
openstack team mailing list archive

Re: [Quantum] Scalable agents

 

On 07/18/2012 04:23 AM, Dan Wendlandt wrote:


On Mon, Jul 16, 2012 at 3:30 AM, Gary Kotton <gkotton@xxxxxxxxxx> wrote:

    Hi,
    The patch https://review.openstack.org/#/c/9591/ contains the
    initial support for the scalable agents (this is currently
    implemented on the Linux bridge). At the moment this does not
    support a network or port update, for example when the user sets
    'admin_status_up' to 0, which means that either the network or the
    port should stop handling traffic.
    The network/port update is challenging in a number of respects.
    First and foremost the quantum plugin is not aware of the agent on
    which the port may have been allocated (this is where the VM has
    been deployed). In addition to this there may be a number of
    agents running.
    There are a number of options to perform the port update. They are
    listed below:
    1. Make use of the openstack-common notifier support. This would
    have the plugin notify "all" of the agents. I have yet to look at
    the code but guess that it is similar to the next item.
    2. Make use of the RPC mechanism to have the plugin notify the
    agents. At the moment the plugin has the topic of all of the
    agents (this is used for a health check to ensure that the
    configuration on the agent is in sync with that of the plugin). It
    is described in detail in
    https://docs.google.com/document/d/1MbcBA2Os4b98ybdgAw2qe_68R1NG6KMh8zdZKgOlpvg/edit?pli=1

    If I understand correctly then both of the above would require
    that the agents are also RPC consumers. In both of the above,
    when there is an update to either a network or port there will
    be a lot of traffic broadcast on the network.


Hi Gary,

Yes, I think either way, to eliminate the polling, we need to have some mechanism to inform the agents that they need to update state. My goal would be to build a standard mechanism for this that to the degree possible leverages existing APIs and data formats, so that we can avoid having multiple formats for the same data and avoid any RPC-call sprawl.

I agree with you wholeheartedly here. In my opinion this is what I have started with the RPC inclusion and initial support. At the moment this lacks the "update from the service" side (which essentially is what this mail is about :))


I agree that we don't want to broadcast all data to everyone. At the same time, I'd like to avoid having to make the core plugin code running within quantum-server be aware of all of the different agents. What I think would be ideal is that we have a fine-grained notification mechanism for when objects (networks, subnets, ports) are updated, and that agents could choose to register for updates on particular objects.

This is along the same lines that I was thinking. In the current implementation the agent connects to the service to sync with the configuration. I was thinking of having the agent publicize what information it would like to receive, for example:
    - quantum agent - needs port and network updates (port creation and deletion are treated in the current implementation)
    - dhcp-agent - port creation, deletion and updates
    - firewall agent - ...

When the service performs an operation and one of the agents supports the operation type then that agent should be updated. These are not "real time" operations and for the first phase we can use the broadcast mechanism. I do think that we should optimize for very large scale environments.

For example, a DHCP agent handling all DHCP for a deployment might register for create/update/delete operations on subnets + ports, whereas a plugin agent might only register for updates from the ports that it sees locally on the hypervisor. Conceptually, you could think of there being a 'topic' per port in this case, though we may need to implement it differently in practice.
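As a rough illustration of the "topic per port" idea, here is a minimal in-memory sketch. It is not Quantum code: the `UpdateRegistry` class, the topic string format, and the agent and port names are all hypothetical, and a real deployment would sit on top of the RPC/message bus rather than a dictionary.

```python
from collections import defaultdict


class UpdateRegistry:
    """Hypothetical stand-in for a message bus with fine-grained topics."""

    def __init__(self):
        # Maps a topic string to the set of agent IDs registered for it.
        self._subscribers = defaultdict(set)

    @staticmethod
    def topic(resource, operation, object_id="*"):
        # "*" means "every object of this resource type" (an assumption of
        # this sketch, not an existing convention).
        return f"{resource}.{operation}.{object_id}"

    def register(self, agent_id, resource, operations, object_id="*"):
        for operation in operations:
            self._subscribers[self.topic(resource, operation, object_id)].add(agent_id)

    def recipients(self, resource, operation, object_id):
        # An agent is notified if it registered for this exact object
        # or for all objects of the resource type.
        exact = self._subscribers.get(self.topic(resource, operation, object_id), set())
        wildcard = self._subscribers.get(self.topic(resource, operation), set())
        return sorted(exact | wildcard)


registry = UpdateRegistry()
# A DHCP agent serving the whole deployment registers for everything
# on subnets and ports.
registry.register("dhcp-agent", "subnet", ["create", "update", "delete"])
registry.register("dhcp-agent", "port", ["create", "update", "delete"])
# A plugin agent registers only for a port it sees locally.
registry.register("plugin-agent-host1", "port", ["update"], object_id="port-1")

local_update = registry.recipients("port", "update", "port-1")
remote_update = registry.recipients("port", "update", "port-2")
```

With this registration, an update to `port-1` reaches both agents, while an update to `port-2` reaches only the DHCP agent, so the server never needs to know what kind of agent is listening.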

The agent ID is currently stored in the database (this is for the configuration sync mechanism). I think that adding an extra column indicating the capabilities enables the service to notify the agents. The issue is how refined the updates can be - we want to ensure that we have a scalable architecture.
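One way to picture the extra capabilities column is the sketch below. It uses an in-memory SQLite database purely for illustration; the `agents` table layout, the JSON encoding of the column, and the capability strings are assumptions, not the schema used by the patch.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical table: the agent ID plus a JSON list of capabilities.
    conn.execute(
        "CREATE TABLE agents ("
        "agent_id TEXT PRIMARY KEY, "
        "capabilities TEXT NOT NULL)"
    )


def add_agent(conn, agent_id, capabilities):
    conn.execute(
        "INSERT INTO agents (agent_id, capabilities) VALUES (?, ?)",
        (agent_id, json.dumps(sorted(capabilities))),
    )


def agents_for(conn, capability):
    """Return the IDs of agents that declared the given capability."""
    rows = conn.execute(
        "SELECT agent_id, capabilities FROM agents ORDER BY agent_id"
    )
    return [agent_id for agent_id, caps in rows if capability in json.loads(caps)]


conn = sqlite3.connect(":memory:")
create_schema(conn)
add_agent(conn, "quantum-agent-host1", ["port.update", "network.update"])
add_agent(conn, "dhcp-agent", ["port.create", "port.delete", "port.update"])

port_update_targets = agents_for(conn, "port.update")
network_update_targets = agents_for(conn, "network.update")
```

A single column like this only captures the operation type. Narrowing updates down to individual ports would need additional data, which is the refinement question raised above.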


In general, I think it is ideal if these external agents can use standard mechanisms and formats as much as possible. For example, after learning that port X was created, the DHCP agent can actually use a standard webservice GET to learn about the configuration of the port (or if people feel that such information should be included in the notification itself, this notification data uses the same format as the webservice API).

I am not sure that I agree here. If the service is notifying the agent then why not pass the information in the message itself (IP + MAC etc.)? There is no need for the GET operation.
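The two positions are compatible if the notification body is produced by the same serializer as the webservice response. The sketch below assumes that approach; the function names and the port fields (`id`, `mac_address`, `ip_address`) are hypothetical and chosen only to mirror the "IP + MAC" example.

```python
import json


def port_to_api_dict(port):
    # Single place that defines how a port is represented externally.
    return {
        "id": port["id"],
        "mac_address": port["mac_address"],
        "ip_address": port["ip_address"],
    }


def get_port_response(port):
    """Body a webservice GET on the port would return (illustrative)."""
    return json.dumps({"port": port_to_api_dict(port)}, sort_keys=True)


def port_notification(event, port):
    """Notification that embeds the same representation, so no GET is needed."""
    return json.dumps({"event": event, "port": port_to_api_dict(port)}, sort_keys=True)


stored_port = {
    "id": "port-1",
    "mac_address": "fa:16:3e:00:00:01",
    "ip_address": "10.0.0.5",
    "internal_row_id": 42,  # internal detail that is never exposed
}

from_get = json.loads(get_port_response(stored_port))
from_notification = json.loads(port_notification("port.create", stored_port))
```

An agent can then parse the `port` object with the same code whether it arrived in a notification or from a GET.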


So in sum, I'm hoping that we can take an approach to this problem that builds a base framework that will continue to work as we add richer functionality to quantum networks, recognizing that in most cases, agents will need to follow the pattern of triggering off of changes to API objects. I'm not sure whether this is in line with your thinking or not, so I'd be curious to hear your thoughts. Thanks,

Dan


    Another alternative is to piggy back onto the health check
    message. This will contain the IDs of the networks/ports that
    were updated prior to the last check. When an agent receives
    these, if they are using the network or port then they will
    request the details from the plugin. This will certainly have less
    traffic on the network.
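The piggy-back alternative could look roughly like the sketch below. It is an interpretation, not the patch: the `Plugin` and `Agent` classes, the sequence counter used to decide which changes an agent has not yet seen, and the method names are all hypothetical. Only the 'admin_status_up' field name is taken from this message.

```python
class Plugin:
    """Hypothetical plugin side: records changes and answers health checks."""

    def __init__(self):
        self._ports = {}
        self._change_log = []  # list of (sequence, port_id)
        self._sequence = 0

    def update_port(self, port_id, **fields):
        self._ports.setdefault(port_id, {"id": port_id}).update(fields)
        self._sequence += 1
        self._change_log.append((self._sequence, port_id))

    def health_check(self, last_seen):
        # The reply carries only IDs, not the port details.
        changed = sorted({pid for seq, pid in self._change_log if seq > last_seen})
        return {"sequence": self._sequence, "updated_ports": changed}

    def get_port(self, port_id):
        return dict(self._ports[port_id])


class Agent:
    """Hypothetical agent side: fetches details only for ports it uses."""

    def __init__(self, plugin, local_ports):
        self.plugin = plugin
        self.local_ports = set(local_ports)
        self.last_seen = 0
        self.applied = {}

    def poll(self):
        reply = self.plugin.health_check(self.last_seen)
        self.last_seen = reply["sequence"]
        fetched = []
        for port_id in reply["updated_ports"]:
            if port_id in self.local_ports:
                self.applied[port_id] = self.plugin.get_port(port_id)
                fetched.append(port_id)
        return fetched


plugin = Plugin()
agent_a = Agent(plugin, ["port-1"])
agent_b = Agent(plugin, ["port-2"])

plugin.update_port("port-1", admin_status_up=False)
fetched_a = agent_a.poll()      # agent_a uses port-1, so it requests the details
fetched_b = agent_b.poll()      # agent_b sees the ID but ignores it
fetched_again = agent_a.poll()  # nothing new since the last check
```

The cost of this design is latency: a change is picked up only at the next health check, in exchange for not broadcasting port details to every agent.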

    If anyone has any ideas then it would be great to hear them.
    Hopefully we can discuss this in tonight's meeting.
    Thanks
    Gary






--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dan Wendlandt
Nicira, Inc: www.nicira.com
twitter: danwendlandt
~~~~~~~~~~~~~~~~~~~~~~~~~~~


