
Re: Nova's network architecture

 

Yes, sorry about that; we're still scrambling here.  Explanations follow.

Current
-------

Currently, there are three strategies for networking, implemented by different managers:
FlatManager -- ip addresses are grabbed from a network and injected into the image on launch.  All instances are attached to the same manually configured bridge.
FlatDHCPManager -- ip addresses are grabbed from a network, and a single bridge is created for all instances.  A dhcp server is started to pass out addresses.
VlanManager -- each project gets its own vlan, bridge, and network.  A dhcp server is started for each vlan, and all instances are bridged into that vlan.
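
For concreteness, here is a minimal sketch of that strategy split (illustrative only -- the real classes live in nova.network.manager and have many more methods than this):

    class NetworkManager(object):
        """Interface shared by all three strategies."""
        def allocate_fixed_ip(self, context, instance_id):
            raise NotImplementedError

    class FlatManager(NetworkManager):
        def allocate_fixed_ip(self, context, instance_id):
            # grab an address from the flat network; it is injected into
            # the image at launch and every instance shares one bridge
            pass

    class FlatDHCPManager(NetworkManager):
        def allocate_fixed_ip(self, context, instance_id):
            # grab an address; the dhcp server on the single bridge hands
            # it to the instance at boot
            pass

    class VlanManager(NetworkManager):
        def allocate_fixed_ip(self, context, instance_id):
            # grab an address from the project's own network; the instance
            # is bridged into the project's vlan
            pass

    # The deployer picks one with the network_manager flag, e.g.
    # --network_manager=nova.network.manager.VlanManager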

The implementation of creating bridges, vlans, dhcp servers, and firewall rules is done by the driver linux_net.  This layer of abstraction exists so that we can at some point support configuring hardware switches etc. using the same managers.
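
As a rough illustration of that split (the function names here are stand-ins, not the exact linux_net signatures), the manager decides what the topology should look like and the driver does the host-level work:

    # Sketch only: the real driver is the module nova.network.linux_net.
    class LinuxNetDriver(object):
        def ensure_vlan(self, vlan_id, interface):
            # create the vlan interface on the host if it doesn't exist
            pass

        def ensure_bridge(self, bridge, interface):
            # create the bridge and attach the given interface to it
            pass

    class VlanManager(object):  # see the manager sketch above
        driver = LinuxNetDriver()

        def setup_compute_network(self, context, network):
            # the manager knows *what* to build; the driver knows *how* to
            # build it on a linux host.  A future driver could program a
            # hardware switch with the same calls.
            self.driver.ensure_vlan(network['vlan'], network['interface'])
            self.driver.ensure_bridge(network['bridge'], network['vlan_interface'])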

My goal with the Manager refactor was to move all of the code relating to each component into the Manager classes.  For example, all code relating to networks/ips would be done by NetworkManager.  That said, code relating to business objects needs to be run by three separate workers, and not all of it has been encapsulated in the manager.  For example, the initial creation of the record for instances/volumes is done by nova-api by hitting the db directly instead of using a method on the manager.

Here is some horrible ascii art of what each component does during run_instance.  The manager's methods are called directly for local setup, and they are called via rpc to nova-network, which runs the manager wrapped in a service that exposes its public methods over rpc.  The *'d items aren't specifically related to networking, but they help make it clear when things occur.
+----------------+       +----------------+       +----------------+     
|    nova-api    |       |  nova-network  |       |  nova-compute  |     
+----------------+       +----------------+       +----------------+
 (direct manager)            (service)             (direct manager)
*create instance in db
find correct network host
(get_network)
call (if no host) -----> sets up network
                         (set_network_host)
<----------------------- returns host
allocates fixed ip
(allocate_fixed_ip)
cast-------------------> sets up fixed ip
V                        (setup_fixed_ip)
cast-----------(goes through scheduler)---------> (ComputeManager.run_instance)
|                                                 sets up compute network
|                                                 (setup_compute_network)
|                                                 * launches instance
|                                                 sets up security groups
V                                                 (done by compute driver)
*return instance data
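
The call/cast distinction above is just synchronous vs. fire-and-forget rpc over the queue.  Very roughly, and with stand-in helpers rather than the real nova.rpc api, the current flow on the api side looks like this:

    def rpc_call(topic, method, args):
        """Synchronous request over the queue; blocks for the worker's answer."""
        print('call %s.%s(%r)' % (topic, method, args))
        return 'network-host-1'  # stubbed answer for the sketch

    def rpc_cast(topic, method, args):
        """Asynchronous request over the queue; returns immediately."""
        print('cast %s.%s(%r)' % (topic, method, args))

    def api_run_instance(network_manager, context, instance):
        # nova-api uses the manager directly for the db-side work
        network = network_manager.get_network(context, instance['project_id'])
        if not network.get('host'):
            # call: we need the network host before we can allocate
            network['host'] = rpc_call('network', 'set_network_host',
                                       {'network_id': network['id']})
        fixed_ip = network_manager.allocate_fixed_ip(context, instance['id'])
        # cast: nova-network brings the address up in the background
        rpc_cast('network.%s' % network['host'], 'setup_fixed_ip',
                 {'address': fixed_ip})
        # cast: the scheduler picks a compute host, which launches the instance
        rpc_cast('scheduler', 'run_instance', {'instance_id': instance['id']})
        return fixed_ip  # the api can answer with an ip right away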

This all works and allows us to return an ip very quickly from an api call, but it does have some issues.  Most importantly, in some network models the api server may not have enough information to know how to allocate an ip.  For example, in flat mode we may want to have a cluster of compute nodes associated with a particular network host, in which case get_network needs to return a network based on the compute host rather than on the project, as it does in vlan mode.

Future (IMO)
------------

I think we should move the allocating of the ip further down the stack like so:

+----------------+       +----------------+       +----------------+     
|    nova-api    |       |  nova-network  |       |  nova-compute  |     
+----------------+       +----------------+       +----------------+
 (direct manager)            (service)             (direct manager)
*create instance in db
check for available network resources
(check_free_ips?)
cast-----------(goes through scheduler)---------> (ComputeManager.run_instance)
|                                                 find correct network host
|                                                 (get_network)
|                        sets up network <------- call (if no host)
|                        (set_network_host)
|                        -----------------------> returns host
|                                                 sets up compute network
|                                                 (setup_compute_network)
|                        allocates fixed ip <---- call (because we're not in a rush)
|                        (allocate_fixed_ip)
|                        (setup_fixed_ip)
|                        -----------------------> returns fixed_ip
|                                                 * launches instance
|                                                 sets up security groups
V                                                 (done by compute driver)
*return instance data (with no ip address)
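
In that world the per-instance network work moves to the compute (or scheduler) side, and the allocation becomes a plain call to nova-network, since nobody is blocked waiting on the api.  A hypothetical sketch, reusing the rpc_call stub from the earlier example:

    def compute_run_instance(network_manager, context, instance):
        network = network_manager.get_network(context, instance['project_id'])
        if not network.get('host'):
            network['host'] = rpc_call('network', 'set_network_host',
                                       {'network_id': network['id']})
        network_manager.setup_compute_network(context, network)
        # call, not cast: allocation and setup happen together on nova-network,
        # and we can afford to wait for the answer here
        fixed_ip = rpc_call('network.%s' % network['host'], 'allocate_fixed_ip',
                            {'instance_id': instance['id']})
        # ...then launch the instance with fixed_ip via the hypervisor driver
        return fixed_ip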

It is also possible to do some of this in the scheduler instead of compute if we want to keep compute really dumb in terms of communicating over the queue.

Floating Ips
------------
The floating ip allocation is simpler and rarer, so we actually just do a simple call through the queue, and all of the setup is done by nova-network.  It is important to note that floating ips are associated with fixed ips, not instances directly, and more than one floating ip can be associated with each fixed ip.  They are implemented with iptables forwarding rules.
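
The forwarding boils down to a NAT rule pair per association.  Roughly (illustrative commands only; the exact chains and options nova-network emits may differ):

    import subprocess

    def associate_floating_ip(floating_ip, fixed_ip):
        # inbound: traffic to the public address is rewritten to the fixed ip
        subprocess.check_call(['iptables', '-t', 'nat', '-A', 'PREROUTING',
                               '-d', floating_ip, '-j', 'DNAT',
                               '--to-destination', fixed_ip])
        # outbound: traffic from the instance is rewritten to the public address
        subprocess.check_call(['iptables', '-t', 'nat', '-A', 'POSTROUTING',
                               '-s', fixed_ip, '-j', 'SNAT',
                               '--to-source', floating_ip])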

Security Groups
---------------
Soren did the security group code, so I'll let him do an overview.  They are implemented at the host level.  I have a couple of patches I need to submit to speed them up and to allow intra-project communication in vlan mode.


I hope this helps.  Let me know if there are further questions or if I have left anything out.

Vish


On Oct 15, 2010, at 12:31 AM, Ewan Mellor wrote:

> Vish, will you be able to get around to writing this up soon?  It would be
> really useful for me to understand it before the summit.
> 
> Thanks,
> 
> Ewan.
> 
> On Sat, Oct 02, 2010 at 07:21:23PM +0100, vishvananda wrote:
> 
>> I will make a detailed description of this and where it is going.  I have a couple of branches cleaning up some of this, and some ideas for how it should be.  Briefly, the allocate and setup were split out so we could give a quick response from the api including the ip without having to wait for the setup, but it should really be returning 'pending' and doing all of the network setup at once with a call to nova-network from the compute host or the scheduler.  More after Nebula 1.0 (Monday)
>> 
>> Vish
>> On Oct 2, 2010, at 6:56 AM, Ewan Mellor wrote:
>> 
>>> Can someone please explain the intended architecture for Nova's networking
>>> code?
>>> 
>>> Firstly, there is a daemon called nova-network, which I assumed would be
>>> responsible for managing networking in the entirety, but we also have
>>> nova-api handling networking configuration directly (calling
>>> network_manager.allocate_fixed_ip in response to a run_instance request)
>>> and nova-compute also doing some network setup (calling
>>> network_manager.setup_compute_network).
>>> 
>>> I don't understand the dividing line of responsibility for these three
>>> (particularly why we need the API layer to allocate IP addresses, and how
>>> it's even possible to do this before the scheduler has had a chance to place
>>> the VM).
>>> 
>>> Secondly, looking at the code in network.manager, I see a base class called
>>> NetworkManager, with two subclasses FlatManager and VlanManager, but
>>> NetworkManager also has a driver, of which we seem to only have one example:
>>> linux_net.
>>> 
>>> I don't understand the division between these two orthogonal customization
>>> routes.  What belongs to the manager subclass, and what belongs to the driver?
>>> 
>>> Thirdly, I don't understand the configuration of the network manager.  We've
>>> got a flag called "rs_network_manager" in the Rackspace API layer.  How is this
>>> different to flag called "network_manager" elsewhere?  What does it mean
>>> if these settings are different?  rs_network_manager=FlatManager,
>>> network_manager=VlanManager is the default at the moment, and I don't
>>> understand how that can have a sensible meaning.
>>> 
>>> The reason I'm looking at all this at the moment is there are definitely
>>> going to be changes required to make this work with XenAPI, both for the
>>> existing bridging mechanism that we use, and for the upcoming Open vSwitch.
>>> I'm trying to figure out how these distinctions should be managed.
>>> 
>>> Thanks in advance,
>>> 
>>> Ewan.
>>> 
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~nova
>>> Post to     : nova@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~nova
>>> More help   : https://help.launchpad.net/ListHelp
>> 

