
openstack team mailing list archive

Re: Openstack and Google Compute Engine

 

On Tue, Jul 3, 2012 at 2:01 AM, Simon G. <semyazz@xxxxxxxxx> wrote:

> Secondly, I don't think we should compare GCE to Openstack. I
> understand that right now the cloud (Openstack, Amazon, ...) is just an
> easy-to-use, managed, and scalable datacenter. It allows users to create
> VMs, upload their images, and easily increase their (limited) quotas, but
> don't you think that HPC is the right direction? I've always thought that
> the cloud's final goal is to provide easy-to-use HPC infrastructure, where
> users could do what they can do right now in the clouds (Amazon,
> Openstack), but also what they couldn't do in a typical datacenter. They
> should be able to run an instance, run compute-heavy software, and if they
> need more resources, just add them. If the cloud is unable to provide the
> necessary resources, they should move their app to a bigger cloud and do
> what they need. Openstack should be prepared for such large deployments.
> It should also be prepared for HPC use cases. Or if it's not prepared
> yet, that should be Openstack's goal.
>

HPC in the cloud operates more like a grid computing solution.  With things
like Amazon HPC or HPC under openstack, the idea is to allocate entire
physical systems to a user on the fly.  To date that has typically been
done with m1.full-style instances.  In many ways bare metal provisioning is
a better option here than a hypervisor.  And for many people who work in
an HPC environment, bare metal really is the only solution that makes
sense.
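
The whole-node idea above can be sketched in a few lines.  This is purely
illustrative: the "m1.full" flavor, the vCPU counts, and the host specs here
are made-up examples, not real OpenStack API objects.  The point is that a
flavor behaves like bare metal allocation when it leaves no spare capacity
on a host:

```python
# Sketch: deciding whether a flavor claims an entire physical host.
# The flavor name and host specs are hypothetical examples.

def fills_host(flavor, host):
    """True if the flavor leaves no spare capacity on the host."""
    return (flavor["vcpus"] >= host["cpus"]
            and flavor["ram_mb"] >= host["ram_mb"])

m1_full = {"name": "m1.full", "vcpus": 32, "ram_mb": 131072}
m1_small = {"name": "m1.small", "vcpus": 2, "ram_mb": 4096}
host = {"cpus": 32, "ram_mb": 131072}

print(fills_host(m1_full, host))   # whole-node flavor: no co-tenants possible
print(fills_host(m1_small, host))  # small flavor: host is shared
```

A flavor like that gives the user exclusive hardware while still going
through the normal scheduling path, which is why people reach for it before
true bare metal provisioning exists.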

The reality is that HPC use cases lose a lot of the underlying benefits of
cloud infrastructure.  So they really are something of an edge case at the
moment.  I believe that bare metal provisioning from within openstack could
be a bit of a game changer in HPC, and that it could be useful in a wide
variety of areas.  But ultimately I believe that HPC usage in no way
reflects general computing needs.  And that really sums it up.  Most folks
do not need or want HPC.  Most folks with HPC needs don't want a hypervisor
slowing down their memory access.


> I know that clouds are fulfilling the current need for a scalable
> datacenter, but they should also fulfill future needs. Apps are getting
> faster and faster. More often they do image processing, voice
> recognition, and data mining, and it should be the clouds' goal to
> provide an easy way to build such advanced apps, not just a simple web
> server that can be scaled up by adding a few VMs and a load balancer to
> redirect requests. Infrastructure should be prepared even for deployments
> as large as Google's. It should also be optimized for and support heavy
> computations. In the future it should be as efficient as grids (or almost
> as efficient), because ease of use has already been achieved. If, right
> now, it's easy to deploy a VM into the cloud, the next step should be to
> optimize the infrastructure to increase performance.
>

Apps are actually slower and slower.  The hardware is faster.  The
applications themselves abstract more and more, and thus slow down.  As for
what you do on your instances, that's entirely your own thing, herr user.
Some large-data and some serious compute use cases simply don't lend
themselves to cloud today.  Hypervisors are limiting insofar as they give
up some speed to provide the ability to share resources better.  If you
have no desire to share resources, then virtual machines become something
of an impediment to you.  So I don't see that claim holding true for some
use cases.

There are also other external limiting factors.  People don't just turn on
a dime.  Many of the scientific and industrial applications of computing
power are built around software stacks that have grown over a long period
of time.  Those stacks can't easily be made to adopt the benefits of a new
technology.  Sometimes the reason not to use cloud as a platform is
entirely that you can't modify an existing software suite enough to make
the move worthwhile.  I have seen this before at supercomputing
facilities.


> I've always thought about clouds in that way. Maybe I was wrong. Maybe
> the cloud should do only what it's doing right now and let other
> technologies handle HPC.
>

I think many in the HPC environment would argue this is probably true.  I
don't necessarily agree.  GCE obviously proves a point.  Sharing resources
means that you don't have to run your own supercomputer.  You can simply
rent enough of a compute environment to solve your problem at will.  And
odds are the environment will be pretty up to date.  For many use cases
cloud environments are just dandy.  And HPC offerings from IaaS providers
are getting better all the time.  For low-funded research, citizen science,
and a million other small fries out there, there is certainly value in
lowering the barrier to entry for this technology.

That being said, I think that private HPC will never go away, if only
because of data retention rules and law.  Much research deals with data
that must be either safeguarded or simply classified and placed in an
environment that meets that classification level's needs.  In some cases
those limiting factors can make working with the amazons or googles or
rackspaces of the world an impossibility.

So on one hand, yes, I think HPC in openstack is important, and that it
will only grow more so as time goes on.  But on the flip side, I believe
HPC user requirements do not reflect the needs of general computing users.
From a technical backend perspective, the likelihood is that most
businesses have no real need for HPC.  And bursting to a public offering
probably makes a lot more sense than maintaining their own private pond of
compute resources.

One future environment might see an openstack deployment local to the org
where users test instances and prep them, then send them out for a few days
at a time every few months to crunch some data set on an HPC environment.
In that case the openstack environment would become an HPC instance proving
ground / staging area.
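
That staging-then-burst workflow can be sketched as a simple placement rule.
Everything here is an illustrative assumption (the pool size, the threshold,
the placement labels), not any real OpenStack API: small runs stay on the
private pool, and jobs that exceed local capacity burst out to a public HPC
offering.

```python
# Sketch of a burst decision: keep a job on the local (private) pool
# when it has room, otherwise send it to a public HPC provider.
# The capacity figure and placement names are illustrative assumptions.

LOCAL_CAPACITY_NODES = 16  # hypothetical size of the private openstack pool

def place_job(nodes_needed, nodes_in_use):
    """Return where a job should run given current local utilization."""
    free = LOCAL_CAPACITY_NODES - nodes_in_use
    if nodes_needed <= free:
        return "local"    # staging runs and small jobs stay in-house
    return "public-hpc"   # the periodic big crunch bursts out

print(place_job(4, 2))    # fits in the local pool
print(place_job(40, 2))   # exceeds local capacity, bursts out
```

The interesting property is that the private cloud never has to be sized for
the peak: it only needs to be big enough for the day-to-day proving-ground
work.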

But hey, maybe I am wrong.  =D

-Matt
