yahoo-eng-team team mailing list archive

[Bug 1438113] [NEW] Use plain HTTP listeners in the conductor

 

Public bug reported:

The conductor consumes messages from a single queue, which limits performance for several reasons:
- a per-queue lock
- some brokers also restrict part of the message handling to a single CPU thread per queue
- multiple broker instances need to synchronise the queue content, which adds delays due to TCP request/response times

The single-queue limit is far more restrictive than the limit imposed by
a single MySQL server, and the ratio is even worse once you consider
slave reads.

This can be worked around by explicitly or implicitly distributing the
RPC calls across multiple queues.
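
As a rough illustration of what explicit and implicit distribution could look like, here is a minimal sketch. The queue naming scheme (`conductor.0`, `conductor.1`, ...) and the helper names are assumptions made for this example only; they are not taken from nova or oslo.messaging.

```python
import itertools
import zlib


def queue_names(base_topic, count):
    """Build hypothetical per-shard queue names: base.0 .. base.(count-1)."""
    return ["%s.%d" % (base_topic, i) for i in range(count)]


def pick_by_key(queues, key):
    """Explicit distribution: a given key always maps to the same queue."""
    return queues[zlib.crc32(key.encode("utf-8")) % len(queues)]


def round_robin(queues):
    """Implicit distribution: cycle through the queues call by call."""
    return itertools.cycle(queues)


queues = queue_names("conductor", 4)
rr = round_robin(queues)
first_four = [next(rr) for _ in range(4)]
```

Hashing on a key (for example the calling host name) keeps related calls on one queue, while round-robin simply spreads the load evenly. Either way, no single queue carries all the traffic.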

The message broker provides additional message durability properties that are not needed for a plain rpc_call,
so we spend resources on something we do not actually need.

Many tools exist for TCP/HTTP load balancing, including hardware-assisted options, providing virtually unlimited scalability.
At the TCP level it is also possible to exclude the load balancer node(s) from the response traffic.

Why HTTP?
Basically any protocol that can do the request/response `thing` with data of arbitrary type and size, over a keep-alive connection and with an SSL option, could be used.
HTTP is a simple and well-known protocol, and many load balancing tools already exist for it.
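
To make the request/response idea concrete, here is a minimal sketch using only the Python standard library. It is not a proposed implementation for the conductor: the `/rpc` path, the JSON message shape, and the echo-style dispatch are all assumptions chosen for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class RpcHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 lets a client keep one connection open for many calls.
    protocol_version = "HTTP/1.1"

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        # Hypothetical dispatch: echo the method name and arguments back.
        reply = {"result": [request["method"], request["args"]]}
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def call(conn, method, args):
    """Send one request over an existing connection and return the result."""
    payload = json.dumps({"method": method, "args": args})
    conn.request("POST", "/rpc", body=payload,
                 headers={"Content-Type": "application/json"})
    return json.loads(conn.getresponse().read())["result"]


# Port 0 asks the OS for any free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), RpcHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
first = call(conn, "ping", [1])
second = call(conn, "ping", [2])  # sent over the same TCP connection
conn.close()
server.shutdown()
server.server_close()
```

Because the listener is plain HTTP, any number of such processes could sit behind an ordinary TCP or HTTP load balancer. TLS could be added by wrapping the server socket, which is left out here for brevity.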

Why not have the agents make a regular API call?
Regular API calls need to do a policy check, which is not required in this case; every authenticated user can be considered an admin.

The conductor clients need to use at least a single shared key configured on every nova host.
This gives security similar to what OpenStack has with the brokers: basically all nova nodes have credentials for one rabbitmq virtual host,
configured in /etc/nova/nova.conf. If any of those credentials is stolen, it provides access to the whole virtual host.
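
One possible way to use such a shared key is to sign each request body with an HMAC and verify it on the conductor side. The report does not specify a scheme, so the sketch below is only an assumption; the key value and function names are placeholders.

```python
import hashlib
import hmac

# Placeholder only; a real key would be read from configuration.
SHARED_KEY = b"example-shared-key"


def sign(key, body):
    """Return a hex HMAC-SHA256 signature of the request body."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify(key, body, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(key, body), signature)


body = b'{"method": "ping", "args": []}'
signature = sign(SHARED_KEY, body)
accepted = verify(SHARED_KEY, body, signature)
rejected = verify(SHARED_KEY, body + b" ", signature)
```

The client would send the signature in a request header, and the listener would drop any request whose signature does not verify. As with the broker credentials, anyone holding the key is fully trusted.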

NOTE: HTTPS can be used with certificate or Kerberos based
authentication as well.


I think that for `rpc_calls` served by the agents, AMQP is still the better option; this bug is only about the case where the conductor itself serves the rpc_call(s).

NOTE: The 1 million msg/sec rabbitmq benchmark was done with 186 queues, in
a way that does not hit the single-queue limitations.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1438113

Title:
  Use plain HTTP listeners in the conductor

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1438113/+subscriptions

