Re: Greatest deployment?

Matt,

> LXC is not a good alternative for several obvious reasons.  So think on all
> of that.
Could you expand on why you believe LXC is not a good alternative? As an
HPC provider we're currently weighing up options to get the most we can out
of our OpenStack deployment performance-wise. In particular, we have quite a
bit of IB (InfiniBand), a fairly large Lustre deployment and some GPUs, and are
seriously considering going down the LXC route to try to avoid wasting all
of that by putting a hypervisor on top.

 - Michael Chapman

On Fri, May 25, 2012 at 1:34 AM, Matt Joyce <matt.joyce@xxxxxxxxxxxxxxxx> wrote:

> We did some considerable HPC testing when I worked over at NASA Ames with
> the Nebula project.  So I think we may have been the first to try out
> openstack in an HPC capacity.
>
> If you can find Piyush Mehrotra from the NAS division at Ames, ( I'll
> leave it to you to look him up ) he has comprehensive OpenStack tests from
> the Bexar days.  He'd probably be willing to share some of that data if
> there was interest ( assuming he hasn't already ).
>
> Several points of interest I think worth mentioning are:
>
> I think fundamentally many of the folks who are used to doing HPC work
> dislike working with hypervisors in general.  The memory management and
> general I/O latency is something they find to be a bit intolerable.
> OpenNebula and OpenStack rely on the same set of open-source
> hypervisors.  In fact, I believe OpenStack supports more.  What they both
> fundamentally do is operate as an orchestration layer on top of the
> hypervisor layer of the stack.  So in terms of performance you should not
> see much difference between the two at all.  That said, this ignores the
> possibility of scheduler customisation and the like.
>
> We ultimately, much like Amazon HPC, ended up handing over VMs to customers
> that consumed all the resources on a system, thus largely negating the
> benefit of VMs.  One primary reason for this is that pinning the 10 GbE
> NICs, or InfiniBand if you have it, to a single VM allows for direct
> pass-through and no hypervisor latency.  We were seeing a maximum
> throughput on our 10 GbE links of about 8-9 Gbit/s with virtio / jumbo
> frames via KVM, while bare hardware was slightly above 10.  Several vendors
> in the area I have spoken with are engaged in efforts to tie physical-layer
> provisioning into OpenStack orchestration to bypass the hypervisor
> entirely.  LXC is not a good alternative for several obvious reasons.  So
> think on all of that.
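>
> To make that concrete, here is a minimal sketch of that kind of NIC
> pass-through using the libvirt Python bindings; the guest name and PCI
> address below are placeholders rather than anything from our deployment:
>
>     # Minimal sketch: hand a physical NIC (a 10 GbE or IB port) straight
>     # through to a guest with PCI pass-through.  The guest name and the
>     # PCI address are placeholders; the host needs IOMMU/VT-d enabled,
>     # and managed='yes' lets libvirt detach/reattach the host driver.
>     import libvirt
>
>     HOSTDEV_XML = """
>     <hostdev mode='subsystem' type='pci' managed='yes'>
>       <source>
>         <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
>       </source>
>     </hostdev>
>     """
>
>     conn = libvirt.open("qemu:///system")
>     dom = conn.lookupByName("hpc-guest-01")   # placeholder guest name
>     # Attach to both the running guest and its persistent definition.
>     dom.attachDeviceFlags(
>         HOSTDEV_XML,
>         libvirt.VIR_DOMAIN_AFFECT_LIVE | libvirt.VIR_DOMAIN_AFFECT_CONFIG,
>     )
>     conn.close()
>
> The same hostdev mechanism is what you would use to pin a GPU to a guest.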
>
> GPUs are highly specialised.  Depending on your workloads you may not
> benefit from them.  Again you have the hardware pinning issue in VMs.
>
> As far as disk I/O is concerned, large datasets need large disk volumes.
> Large, non-immutable disk volumes.  So Swift / LAFS go right out the
> window.  nova-volume had some limitations (or it did at the time): the euca
> tools couldn't handle 1 TB volumes and the API maxed out around 2 TB.  So we
> had users RAIDing their volumes and asking how to target them to nodes to
> increase I/O.  This was suboptimal.  Lustre or Gluster would be better
> options here.  We chose Gluster because we've used Lustre before, and
> anyone who has knows it's a pain.
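>
> For reference, a rough sketch of that volume pattern against the old
> nova-volume API via the novaclient v1.1 bindings; the credentials, instance
> UUID, sizes and device names are all placeholders:
>
>     # Sketch: create several nova-volumes and attach them all to one
>     # instance, which can then stripe them together inside the guest.
>     # Assumes the legacy novaclient v1.1 client; all values are placeholders.
>     from novaclient.v1_1 import client
>
>     nova = client.Client("admin", "secret", "admin",
>                          auth_url="http://keystone.example.com:5000/v2.0")
>
>     server_id = "instance-uuid-here"          # placeholder instance UUID
>     devices = ["/dev/vdb", "/dev/vdc", "/dev/vdd", "/dev/vde"]
>
>     for dev in devices:
>         vol = nova.volumes.create(1000, display_name="scratch-%s" % dev[-1])
>         # In practice, wait for each volume to become available first.
>         nova.volumes.create_server_volume(server_id, vol.id, dev)
>
> Inside the guest the block devices can then be striped together with mdadm,
> which is exactly the kind of manual work a parallel filesystem avoids.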
>
> As for node targeting, users cared about specific families of CPUs.  Many
> people optimised by CPU and wanted to target Westmeres or Nehalems.  We had
> no means to do that at the time.
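>
> (As an aside, this sort of targeting can be expressed with host aggregates
> and flavor extra_specs; a rough sketch with python-novaclient follows, where
> the credentials, host name and flavor are made-up placeholders and the
> scheduler is assumed to have the aggregate extra-specs filter enabled.)
>
>     # Rough sketch: steer a flavor onto Westmere nodes via a host
>     # aggregate and matching extra_specs.  All names and credentials are
>     # placeholders; the legacy novaclient auth signature is assumed.
>     from novaclient import client
>
>     nova = client.Client("2", "admin", "secret", "admin",
>                          auth_url="http://keystone.example.com:5000/v2.0")
>
>     agg = nova.aggregates.create("westmere-nodes", None)
>     nova.aggregates.set_metadata(agg, {"cpu_family": "westmere"})
>     nova.aggregates.add_host(agg, "compute-01")   # placeholder hostname
>
>     flavor = nova.flavors.create("hpc.westmere", ram=65536, vcpus=12,
>                                  disk=200)
>     flavor.set_keys(
>         {"aggregate_instance_extra_specs:cpu_family": "westmere"})
>
> Instances booted with that flavor then land only on hosts in the aggregate,
> which is roughly the node targeting users were asking for.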
>
> Scheduling full instances is somewhat easier, so long as all the nodes in
> your zone are used for full instances only.
>
> Matt Joyce
> Now at Cloudscaling
>
>
>
>
> On Thu, May 24, 2012 at 5:49 AM, John Paul Walters <jwalters@xxxxxxx> wrote:
>
>> Hi,
>>
>> On May 24, 2012, at 5:45 AM, Thierry Carrez wrote:
>>
>> >
>> >
>> >> OpenNebula also has the advantage, for me, that it is designed to
>> >> provide scientific clouds and is used by a few research centres and even
>> >> supercomputing centres. How about OpenStack? Has anyone tried to deploy
>> >> it in a supercomputing environment? Is any large cluster, GPU cluster or
>> >> other scientific group using OpenStack? Is anyone using OpenStack in a
>> >> scientific environment, or is OpenStack's purpose to create commercial
>> >> clouds only (business - large and small companies)?
>> >
>> > OpenStack is being used in a number of research clouds, including NeCTAR
>> > (Australia's national research cloud). There is huge interest around
>> > bridging the gap there, with companies like Nimbis or Bull being
>> > involved.
>> >
>> > Hopefully people with more information than I have will comment on this
>> > thread.
>> >
>> >
>> We're developing GPU, bare metal, and large SMP (think SGI UV) support
>> for OpenStack, and we're targeting HPC/scientific computing workloads.  It's
>> a work in progress, but we have people using our code and we're talking to
>> folks about getting our code onto nodes within FutureGrid.  We have GPU
>> support for LXC right now, and we're working on adding support for other
>> hypervisors as well.  We're also working on getting the code into shape for
>> merging upstream, some of which (the bare metal work) has already been
>> done.  We had an HPC session at the most recent Design Summit, and it was
>> well-attended with lots of great input.  If there are specific features
>> that you're looking for, we'd love to hear about them.
>>
>> By the way, all of our code is available at
>> https://github.com/usc-isi/nova, so if you'd like to try it out before
>> it gets merged upstream, go for it.
>>
>> best,
>> JP
>>
>>
>
>
>


-- 
Michael Chapman
*Cloud Computing Services*
ANU Supercomputer Facility
Room 318, Leonard Huxley Building (#56), Mills Road
The Australian National University
Canberra ACT 0200 Australia
Tel: *+61 2 6125 7106*
Web: http://nci.org.au
