openstack team mailing list archive
Message #05902
Re: HPC with Openstack?
Hi Cole:
The link you posted refers to our work at ISI. We're currently running LXC as the hypervisor on our SGI UV. Apart from performance, one issue with KVM is that it currently has a hard-coded limit on the number of vCPUs in a single instance, so we can't run, say, a 256-vCPU instance.
Some of the LXC-related issues we've run into:
- The CPU affinity issue on LXC that you mention. Running LXC with OpenStack, you don't get proper "space sharing" out of the box: each instance actually sees all of the available CPUs. It's possible to restrict this, but that functionality doesn't seem to be exposed through libvirt, so it would have to be implemented in nova.
- LXC doesn't currently support volume attachment through libvirt. We were able to implement a workaround by invoking "lxc-attach" inside of OpenStack instead (e.g., see <https://github.com/usc-isi/nova/blob/hpc-testing/nova/virt/libvirt/connection.py#L482>). But to be able to use lxc-attach, we had to upgrade the Linux kernel in RHEL 6.1 from 2.6.32 to 2.6.38. This kernel isn't supported by SGI, which means that we aren't able to load the SGI NUMA-related kernel modules.
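To make the "space sharing" point in the first item concrete, here is a minimal Python sketch of the bookkeeping that would be needed: giving each instance its own disjoint set of host CPUs and rendering that set in cpuset list syntax. This is only an illustration, not the code running at ISI and not anything that exists in nova; the names `CpuAllocator` and `format_cpuset` are hypothetical, and the first-fit allocation policy is an assumption that ignores NUMA topology.

```python
from typing import Dict, List


def format_cpuset(cpus: List[int]) -> str:
    """Render CPU ids in cpuset list syntax, collapsing consecutive runs."""
    if not cpus:
        return ""
    ordered = sorted(set(cpus))
    ranges = []
    start = prev = ordered[0]
    for cpu in ordered[1:]:
        if cpu == prev + 1:
            prev = cpu
            continue
        ranges.append((start, prev))
        start = prev = cpu
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


class CpuAllocator:
    """Hypothetical helper: hands out disjoint host CPU sets to instances."""

    def __init__(self, total_cpus: int) -> None:
        self.free: List[int] = list(range(total_cpus))
        self.assigned: Dict[str, List[int]] = {}

    def allocate(self, instance_id: str, vcpus: int) -> str:
        """Reserve `vcpus` CPUs (first fit) and return them as a cpuset string."""
        if vcpus <= 0:
            raise ValueError("vcpus must be positive")
        if vcpus > len(self.free):
            raise RuntimeError("not enough free CPUs")
        chosen, self.free = self.free[:vcpus], self.free[vcpus:]
        self.assigned[instance_id] = chosen
        return format_cpuset(chosen)

    def release(self, instance_id: str) -> None:
        """Return an instance's CPUs to the free pool."""
        self.free = sorted(self.free + self.assigned.pop(instance_id))
```

The string returned by `allocate` is in the format accepted by the cgroup `cpuset.cpus` file, so one possible way to apply it (again an assumption, not something nova does) would be to write it into the container's cpuset cgroup when the instance is spawned.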
Take care,
Lorin
--
Lorin Hochstein, Computer Scientist
USC Information Sciences Institute
703.812.3710
http://www.east.isi.edu/~lorin
On Dec 3, 2011, at 5:08 PM, Cole wrote:
> First and foremost: http://wiki.openstack.org/HeterogeneousSgiUltraVioletSupport
>
> With NUMA and lightweight container technology (LXC / OpenVZ) you can achieve very close to real hardware performance for certain HPC applications. The problem with technologies like LXC is that there isn't much logic to handle CPU affinity the way other hypervisors do (and those hypervisors generally wouldn't be ideal for HPC).
>
> On the interconnect side, there are plenty of Open-MX (http://open-mx.gforge.inria.fr/) HPC applications running on everything from single-channel 1 gig to bonded 10 gig.
>
> This is an area I'm personally interested in and have done some testing and will be doing more. If you are going to try HPC with ethernet, Arista makes the lowest latency switches in the business.
>
> Cole
> Nebula
>
> On Sat, Dec 3, 2011 at 11:11 AM, Tim Bell <Tim.Bell@xxxxxxx> wrote:
> At CERN, we are also faced with similar thoughts as we look to the cloud on how to match the VM creation performance (typically O(minutes)) with the required batch job system rates for a single program (O(sub-second)).
>
> Data locality (aiming to run the job close to its source data) makes this more difficult, as does fair share, which aligns job priorities so that the agreed quotas are met between competing requests for limited, shared resources. The classic IaaS model of 'have credit card, will compute' does not apply for some private cloud use cases/users.
>
> We would be interested to discuss further with other sites. There is further background from OpenStack Boston at http://vimeo.com/31678577.
>
> Tim
> tim.bell@xxxxxxx
>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openstack
> More help : https://help.launchpad.net/ListHelp