
savanna-all team mailing list archive

Cluster scaling discussion


One comment on how you go about this: I believe you've already run into issues with using the start/stop-*.sh scripts as a foundation for this feature.

Long term, I believe that an active cluster need not mean every instance is up and running. The core infrastructure must be up (ambari + jt + nn), along with some percentage of worker instances (tt + dn). For example, if I want to make a 500-instance cluster, I won't need to wait for all 500 instances before I can reasonably start using the cluster. In fact, I may never have 500 instances at any given time; 98% may be acceptable operating procedure. The start/stop-*.sh scripts are not good for that use case either.
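To make the idea concrete, here is a minimal sketch of what an "active" check along these lines might look like. It is not Savanna code: the Instance type, the role strings, the cluster_is_active function, and the 98 default are all hypothetical names and values chosen for illustration. The only rule it encodes is the one described above: every core service must be running, plus at least a given percentage of the requested workers.

```python
from dataclasses import dataclass
from typing import Iterable, Set

# Hypothetical role labels for the core services named in this thread.
CORE_ROLES = frozenset({"ambari", "jt", "nn"})


@dataclass
class Instance:
    """Illustrative stand-in for a provisioned node (not a Savanna type)."""
    name: str
    roles: Set[str]
    running: bool


def cluster_is_active(instances: Iterable[Instance],
                      requested_workers: int,
                      min_worker_percent: int = 98) -> bool:
    """Return True when all core roles are running and enough workers are up.

    A worker is counted here as any running instance carrying the "dn" role.
    Integer arithmetic avoids floating-point surprises at the threshold.
    """
    running_core: Set[str] = set()
    running_workers = 0
    for inst in instances:
        if not inst.running:
            continue
        running_core |= inst.roles & CORE_ROLES
        if "dn" in inst.roles:
            running_workers += 1
    core_ok = running_core == CORE_ROLES
    workers_ok = running_workers * 100 >= min_worker_percent * requested_workers
    return core_ok and workers_ok


def make_cluster(workers_up: int, requested: int = 500, core_up: bool = True):
    """Build a sample cluster: one master node plus `requested` workers."""
    nodes = [Instance("master", {"ambari", "jt", "nn"}, core_up)]
    for i in range(requested):
        nodes.append(Instance(f"worker-{i}", {"dn"}, i < workers_up))
    return nodes
```

With a check like this, a 500-instance request would be reported as usable once the master and 490 workers are up, and it would stay usable if a few workers later drop out, which is the behaviour the start/stop scripts make hard to express.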

However you go about this, keep the 98% cluster use case in mind.