Re: OpenStack + RDMA + Infiniband
On Mon, Oct 3, 2011 at 4:21 PM, Caitlin Bestler
<Caitlin.Bestler@xxxxxxxxxxx> wrote:
> Narayan Desai wrote:
>
>> I suspect that the original poster was looking for instance access
>> (mediated in some way) to IB gear. When we were trying to figure out
>> how to best use our IB gear inside of openstack, we decided that it
>> was too risky to try exposing IB at the verbs layer to instances
>> directly, since the security model doesn't appear to have a good way
>> to prevent administrative commands from being issued from untrusted
>> instances.
>>
>> We decided to use IB as fast plumbing for data movement (using IPoIB)
>> and have ended up with pretty nice I/O performance to the volume
>> service, etc. We haven't managed to use it for much more than that at
>> this point.
>
> There's no reason to expect use of IPoIB to end up providing better
> TCP/IP service for large bulk data transfer than you would get from a
> quality Ethernet NIC. But if you have an existing IB infrastructure it
> is certainly worth considering. You should experiment to see whether
> you get better performance under load from IPoIB in connected mode as
> opposed to trying SDP.
I suppose that is true, if your link speeds are the same. We're
getting (without much effort) 3 GB/s over IPoIB (connected mode, etc).
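If anyone wants to repeat that sort of measurement, a crude bulk-transfer
timer over a plain TCP socket bound to the IPoIB addresses is enough to
compare, say, datagram vs. connected mode under load. A rough sketch in
Python (the address, port, and transfer size below are made up, not from
our setup; run "server" on one node and "client" on another):

    import socket, sys, time

    ADDR = "10.20.0.5"         # hypothetical address on the IPoIB subnet (ib0)
    PORT = 5001
    CHUNK = b"\0" * (1 << 20)  # push 1 MiB per send
    TOTAL = 4 << 30            # move 4 GiB total

    if sys.argv[1] == "server":
        srv = socket.socket()
        srv.bind((ADDR, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        received = 0
        while True:
            buf = conn.recv(1 << 20)
            if not buf:
                break
            received += len(buf)
        print("received %.1f GiB" % (received / float(1 << 30)))
    else:
        sock = socket.create_connection((ADDR, PORT))
        start, sent = time.time(), 0
        while sent < TOTAL:
            sock.sendall(CHUNK)
            sent += len(CHUNK)
        sock.close()
        print("%.2f GB/s" % (sent / (time.time() - start) / 1e9))

To flip between modes you'd write "datagram" or "connected" into
/sys/class/net/ib0/mode on both ends (and raise the MTU for connected
mode) between runs, if my memory of the IPoIB knobs is right.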
> Either IPoIB or SDP should be accessible via a standard sockets
> interface, meaning they could be plugged in without modifying the
> Python code or Python libraries.
Yeah, that is exactly what we did. We used addresses on the IPoIB
layer 3 network to get all of our I/O traffic going over that instead
of ethernet.
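To make the point concrete: the only thing that changes on the client
side is the address you point at. A minimal sketch (the address and port
are hypothetical, just standing in for a service bound on the IPoIB
network):

    import socket

    # Nothing IB-specific here: the IPoIB driver exposes ib0 as an
    # ordinary IP interface, so the standard sockets API (and anything
    # in Python built on top of it) works unchanged.
    IPOIB_ADDR = "10.20.0.5"   # hypothetical address on the IPoIB layer 3 network
    PORT = 3260                # e.g. an iSCSI target exported over IPoIB

    sock = socket.create_connection((IPOIB_ADDR, PORT), timeout=5)
    print("connected to %s:%d" % sock.getpeername())
    sock.close()

As I understand it, SDP gets you the same kind of transparency by
interposing libsdp (via LD_PRELOAD) under the existing socket calls,
which is presumably what Caitlin means by a standard sockets interface.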
> The response to congestion by an IB network is different from the
> response of a TCP network, and the response of a TCP network simulated
> over IPoIB is something else entirely. So you'd want to do your
> evaluation with realistic traffic patterns.
Yeah, in our case, the system was specced like an HPC cluster, so the
management network is pretty anemic compared with QDR.
-nld