openstack team mailing list archive

Re: Periodic clean-up of fixed_ip addresses in multi-host DHCP mode

 

Hey Phil. I think you have a case of old-code-itis.

This was modified to query properly in multi-host mode before the Essex release:

def fixed_ip_disassociate_all_by_timeout(context, host, time):
    session = get_session()
    # NOTE(vish): only update fixed ips that "belong" to this
    #             host; i.e. the network host or the instance
    #             host matches. Two queries necessary because
    #             join with update doesn't work.
    host_filter = or_(and_(models.Instance.host == host,
                           models.Network.multi_host == True),
                      models.Network.host == host)
    result = session.query(models.FixedIp.id).\
                     filter(models.FixedIp.deleted == False).\
                     filter(models.FixedIp.allocated == False).\
                     filter(models.FixedIp.updated_at < time).\
                     join((models.Network,
                           models.Network.id == models.FixedIp.network_id)).\
                     join((models.Instance,
                           models.Instance.id == models.FixedIp.instance_id)).\
                     filter(host_filter).\
                     all()
    fixed_ip_ids = [fip[0] for fip in result]
    if not fixed_ip_ids:
        return 0
    result = model_query(context, models.FixedIp, session=session).\
                     filter(models.FixedIp.id.in_(fixed_ip_ids)).\
                     update({'instance_id': None,
                             'leased': False,
                             'updated_at': utils.utcnow()},
                             synchronize_session='fetch')
    return result
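
As a rough illustration of what the host filter above selects, here is a small in-memory sketch in plain Python with no database involved. The `Network`, `Instance`, and `FixedIp` dataclasses and the `belongs_to_host` helper are hypothetical stand-ins, not nova's models; they only mirror the `or_(and_(...), ...)` condition.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for the nova models; only the fields the filter reads.
@dataclass
class Network:
    host: Optional[str]
    multi_host: bool


@dataclass
class Instance:
    host: str


@dataclass
class FixedIp:
    network: Network
    instance: Instance


def belongs_to_host(fip: FixedIp, host: str) -> bool:
    """Mirror of host_filter: the instance host matches on a multi-host
    network, or the network host matches."""
    return ((fip.instance.host == host and fip.network.multi_host)
            or fip.network.host == host)


# Multi-host network: network.host is unset, so only the instance host matters.
multi = FixedIp(Network(host=None, multi_host=True), Instance(host="compute1"))
# Single-host network: the network host decides.
single = FixedIp(Network(host="net1", multi_host=False), Instance(host="compute1"))

print(belongs_to_host(multi, "compute1"))   # True
print(belongs_to_host(multi, "net1"))       # False
print(belongs_to_host(single, "net1"))      # True
print(belongs_to_host(single, "compute1"))  # False
```

A filter on the network host alone would never match the `multi` case, since its network host is unset.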


On Apr 27, 2012, at 11:30 AM, Day, Phil wrote:

> Hi Folks,
>  
> In multi-host mode the “host” field of a network never seems to get set (as only IPs are allocated, not networks).
>  
> However, the periodic recovery task in NetworkManager uses the host field to filter which addresses it should consider cleaning up (to catch the case where the message from dnsmasq is either never sent or not delivered for some reason):
>  
>         if self.timeout_fixed_ips:
>             now = utils.utcnow()
>             timeout = FLAGS.fixed_ip_disassociate_timeout
>             time = now - datetime.timedelta(seconds=timeout)
>             num = self.db.fixed_ip_disassociate_all_by_timeout(context,
>                                                                self.host,
>                                                                time)
>             if num:
>                 LOG.debug(_('Dissassociated %s stale fixed ip(s)'), num)
>  
>  
> Where “db.fixed_ip_disassociate_all_by_timeout” is:
>  
> def fixed_ip_disassociate_all_by_timeout(_context, host, time):
>     session = get_session()
>     inner_q = session.query(models.Network.id).\
>                       filter_by(host=host).\
>                       subquery()
>     result = session.query(models.FixedIp).\
>                      filter(models.FixedIp.network_id.in_(inner_q)).\
>                      filter(models.FixedIp.updated_at < time).\
>                      filter(models.FixedIp.instance_id != None).\
>                      filter_by(allocated=False).\
>                      update({'instance_id': None,
>                              'leased': False,
>                              'updated_at': utils.utcnow()},
>                              synchronize_session='fetch')
>     return result
>  
>  
> So what this seems to me to do is:
> - Find all of the fixed_ips which:
>     - are on networks assigned to this host
>     - were last updated more than “timeout” seconds ago
>     - are associated with an instance
>     - are not allocated
>  
> Because in multi-host mode the network host field is always NULL, this query does nothing apart from giving the DB a good workout every 10 seconds, so there could be a slow leak of IP addresses.
>  
> Has anyone else spotted this, and if so, do you have a good strategy for dealing with it?
>  
> It seems that running this on every network_manager every 10 seconds is excessive, so what about still running it on all network_managers but using a long random sleep between runs in multi-host mode?
>  
> Thoughts?
>  
> Cheers,
> Phil
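
The "long random sleep between runs" idea in the quoted message could look roughly like the sketch below. The function name `next_cleanup_delay` and the `base` and `jitter` parameters are hypothetical, not nova flags; the point is only that each network manager picks its own delay, so the hosts do not all query the database at the same moment.

```python
import random


def next_cleanup_delay(base=600.0, jitter=0.5, rng=random):
    """Return a delay in seconds drawn uniformly from
    [base * (1 - jitter), base * (1 + jitter)].

    base and jitter are illustrative values, not nova settings.
    """
    if base <= 0 or not 0 <= jitter <= 1:
        raise ValueError("base must be > 0 and jitter within [0, 1]")
    return rng.uniform(base * (1 - jitter), base * (1 + jitter))


# Each manager would sleep for its own delay between cleanup passes.
delays = [next_cleanup_delay() for _ in range(5)]
print(all(300.0 <= d <= 900.0 for d in delays))  # True
```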

