AWS Clusters and templates

Since mid-June 2017, the AMP CLI creates a cluster by calling the AWS CloudFormation API with the URL of a CloudFormation template. Which template should be used to secure the cluster deployment while still allowing a smooth deployment of AMP on it?
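As an illustration of "stack name plus template URL", here is a minimal Python sketch that builds the arguments for a CloudFormation CreateStack request. It is not the AMP CLI's own code; the stack name, bucket URL, parameter names and the `CAPABILITY_IAM` default are assumptions chosen for the example.

```python
from typing import Dict, Mapping, Sequence


def build_create_stack_request(
    stack_name: str,
    template_url: str,
    parameters: Mapping[str, str],
    capabilities: Sequence[str] = ("CAPABILITY_IAM",),  # assumption: the template creates IAM resources
) -> Dict[str, object]:
    """Build keyword arguments for a CloudFormation CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # CloudFormation expects parameters as a list of key/value records.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted(parameters.items())
        ],
        "Capabilities": list(capabilities),
    }


# Hypothetical values, for illustration only.
request = build_create_stack_request(
    stack_name="amp-demo",
    template_url="https://example-bucket.s3.amazonaws.com/amp/aws-swarm.yml",
    parameters={"WorkerSize": "3", "ManagerSize": "3"},
)
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("cloudformation").create_stack(**request)`.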

Sept 2017 update: the default template used by the AMP AWS plugin (and the only supported one) is in the repo and published on S3 at each release.

Criteria

The choice of the CloudFormation template used by the AMP CLI cluster create command for the AWS provider should be based on:

stability of the template (how well is it tested, who's maintaining it?)
simplicity of the template (does it make use of specific resources that would unnecessarily make our cloud deployments less generic than desirable?)
maintainability (how can we update it to make it fit our growing needs?)
consistency with the prerequisites of AMP and of the stacks we want to deploy on it

Prerequisites for AMP

Mandatory:

node labels for service scheduling
sysctl configuration required by the core services
ability to scale the worker nodes

Optional:

set the version of Docker installed on the nodes
open the engine API from workers to managers (for core services that need access to swarm metadata)
dedicated group for nodes where user services will be deployed (they should not have the same rights; in particular, there is no reason to give them access to the engine API of the managers)
deployment of a list of Docker plugins (volume, network)
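To make the two most concrete mandatory prerequisites tangible, the sketch below renders swarm placement constraints from node labels and a sysctl configuration fragment. The label name `amp.type.core` is a hypothetical example; the `vm.max_map_count` value is the minimum documented by Elasticsearch 5.

```python
from typing import Dict, List

# Elasticsearch 5 documents 262144 as the minimum vm.max_map_count.
ES_MIN_MAX_MAP_COUNT = 262144


def placement_constraints(labels: Dict[str, str]) -> List[str]:
    """Turn node labels into swarm placement constraints for a service."""
    return [f"node.labels.{key}=={value}" for key, value in sorted(labels.items())]


def sysctl_conf(settings: Dict[str, int]) -> str:
    """Render kernel settings in the `key = value` form read by sysctl."""
    return "".join(f"{key} = {value}\n" for key, value in sorted(settings.items()))


# Hypothetical label, for illustration only.
constraints = placement_constraints({"amp.type.core": "true"})
conf = sysctl_conf({"vm.max_map_count": ES_MIN_MAX_MAP_COUNT})
```

Each constraint string is in the form accepted by `docker service create --constraint`, and the rendered fragment could be dropped into a file under `/etc/sysctl.d/` on the node.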

Templates

Docker for AWS

official and stable
choice between stable or beta channel
automatic listener creation in ELB based on published ports in the Swarm
the OS is a custom one from Docker (probably LinuxKit)
no access to the host OS: the SSH daemon runs in a container
no way to install plugins
no way to set labels
a single group for all workers
engine API is not accessible from the workers
Elasticsearch (ES) 5 can't run on these nodes unless sysctl settings are applied first

As a consequence, after the CF stack is created and before the core services can be deployed, we have to add labels to the nodes. This is not safe: if an instance is recycled, its labels are lost. Monitoring with Prometheus will also be limited (no input from the Docker engines or the hosts), so teams that want to monitor the cluster will only be able to rely on CloudWatch.
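The label problem described above means something has to re-apply labels whenever an instance is replaced. A minimal sketch of that reconciliation step, assuming hypothetical label names and node names, could compare the desired labels with those currently on a node and emit the matching `docker node update` command line:

```python
from typing import Dict, List


def label_repair_command(node: str, desired: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return the `docker node update` argv that restores missing labels, or [] if none."""
    missing = {k: v for k, v in desired.items() if actual.get(k) != v}
    if not missing:
        return []
    command = ["docker", "node", "update"]
    for key, value in sorted(missing.items()):
        command += ["--label-add", f"{key}={value}"]
    return command + [node]


# Hypothetical labels and node names, for illustration only.
desired = {"amp.type.core": "true", "amp.type.metrics": "true"}
recycled = label_repair_command("worker-2", desired, actual={})
healthy = label_repair_command("worker-1", desired, actual=dict(desired))
```

A freshly recycled node has no labels, so every desired label is added again; a node that already carries them yields an empty command and needs no action. The commands would have to run on a manager, which is exactly the extra moving part this template forces on us.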

Custom-made template

has to be thoroughly tested
one autoscaling group for the managers
one or more autoscaling groups for the workers
engine API available from the workers
AMP labels are set when the nodes join the swarm
the Docker version can be forced
Debian or Ubuntu (can be extended)

Once the CF stack is created, core services can be deployed directly; only the secrets and configs need to be deployed first. The full metric set will be available in Prometheus.
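The group layout listed above (one autoscaling group for the managers, one or more for the workers) can be sketched as a fragment of the template's `Resources` section. This is an illustrative fragment only, not the template shipped with AMP: the resource names and group roles are assumptions, and launch configuration and networking properties are omitted.

```python
from typing import Dict


def autoscaling_groups(manager_count: int, worker_groups: Dict[str, int]) -> Dict[str, dict]:
    """Sketch the Resources entries: one manager group plus one group per worker role."""

    def group(size: int) -> dict:
        return {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            # CloudFormation takes these sizes as strings; a real template
            # would expose them as parameters so the workers can be scaled.
            "Properties": {
                "MinSize": str(size),
                "MaxSize": str(size),
                "DesiredCapacity": str(size),
            },
        }

    resources = {"ManagerAsg": group(manager_count)}
    for role, size in sorted(worker_groups.items()):
        resources[f"{role.capitalize()}WorkerAsg"] = group(size)
    return resources


# Hypothetical roles and sizes: a core worker group and a separate user worker group.
resources = autoscaling_groups(3, {"core": 2, "user": 3})
```

Keeping the user workers in their own group is what makes it possible to give them different rights from the nodes that run core services.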
