AWS Clusters and templates

Nicolas Degory edited this page Sep 8, 2017 · 3 revisions

Since mid-June 2017, the AMP CLI creates a cluster by calling the AWS CloudFormation API with the URL of a CloudFormation template. Which template should be used to secure the cluster deployment while still allowing a smooth deployment of AMP on it?

Sept 2017 update: the default template used by the AMP AWS plugin (and the only supported one) is in the repo and is published on S3 at each release.
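As a rough illustration of the flow described above, the sketch below assembles the arguments of a CloudFormation CreateStack call from a template URL. It is written in Python for brevity and is not the AMP CLI's actual code. The stack name, the template URL, and the `ManagerSize` parameter key are placeholders invented for the example, and the HTTPS check and the `CAPABILITY_IAM` capability are assumptions about what the template needs.

```python
def build_create_stack_request(stack_name, template_url, parameters=None):
    """Assemble keyword arguments for a CloudFormation CreateStack call.

    Nothing is sent to AWS here; the function only builds the request.
    """
    # Assumption for this sketch: the template is served over HTTPS (e.g. from S3).
    if not template_url.startswith("https://"):
        raise ValueError("expected an HTTPS template URL")
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # CloudFormation takes parameters as a list of key/value records.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in sorted((parameters or {}).items())
        ],
        # Assumption: the template creates IAM resources and needs this capability.
        "Capabilities": ["CAPABILITY_IAM"],
        "OnFailure": "ROLLBACK",
    }


# Hypothetical values, for illustration only.
request = build_create_stack_request(
    "amp-test",
    "https://example.com/amp-template.yml",
    {"ManagerSize": 3},
)
```

With boto3 installed and AWS credentials configured, the resulting dictionary could be passed as `boto3.client("cloudformation").create_stack(**request)`.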

Criteria

The choice of the CloudFormation template used by the AMP CLI cluster create command for the AWS provider should be based on:

  • stability of the template (how well is it tested, who's maintaining it?)
  • simplicity of the template (does it make use of specific resources that would unnecessarily make our cloud deployments less generic than desirable?)
  • maintainability (how can we update it to make it fit our growing needs?)
  • consistency with the prerequisites of AMP and of the stacks we want to deploy on it

Prerequisites for AMP

Mandatory:

  • node labels for service scheduling
  • sysctl configuration required by the core services
  • ability to scale the worker nodes

Optional:

  • set the version of Docker installed on the nodes
  • open the engine API from workers to managers (for core services that need access to swarm metadata)
  • dedicated group for nodes where user services will be deployed (they should not have the same rights; in particular, there is no reason to give them access to the engine API of the managers)
  • deployment of a list of Docker plugins (volume, network)
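One kernel-level prerequisite of this kind is the `vm.max_map_count` setting, which Elasticsearch 5 requires to be at least 262144 before it will start in production mode. The sketch below shows one way a node could be checked against such minimums. The function names are made up for the example, and the list of required settings is an assumption, not AMP's actual list.

```python
# Minimum enforced by the Elasticsearch 5 bootstrap check.
ES_MIN_MAX_MAP_COUNT = 262144


def parse_sysctl(output):
    """Parse `sysctl -a`-style lines ("key = value") into a dict of strings."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_prerequisites(settings, required=None):
    """Return {key: (current, minimum)} for integer settings below their minimum."""
    # Assumption: only this one setting is checked by default.
    required = required or {"vm.max_map_count": ES_MIN_MAX_MAP_COUNT}
    problems = {}
    for key, minimum in required.items():
        current = int(settings.get(key, 0))
        if current < minimum:
            problems[key] = (current, minimum)
    return problems
```

On a node, the input would typically come from running `sysctl -a`; an empty result from `missing_prerequisites` means the node satisfies the listed minimums.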

Templates

Docker for AWS

  • official and stable
  • choice between stable or beta channel
  • automatic listener creation in ELB based on published ports in the Swarm
  • the OS is a custom one from Docker (probably LinuxKit)
  • no access to the OS (the SSH daemon runs in a container)
  • no way to install plugins
  • no way to set labels
  • a single group for all workers
  • engine API is not accessible from the workers
  • Elasticsearch 5 can't run on these nodes unless sysctl settings are changed first

As a consequence, after the CF stack is created and before the core services can be deployed, we have to add labels to the nodes. This is not safe: if an instance is recycled, its labels are lost. Monitoring with Prometheus will be limited (no input from the Docker engines or the hosts). Teams that want to monitor the cluster will only be able to rely on CloudWatch.
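The manual labelling step can be sketched as a small helper that builds the `docker node update --label-add` commands to run on a manager. The node names and label keys in the example are hypothetical, not AMP's real labels, and the helper only produces command strings; it does not run them.

```python
import shlex


def label_commands(node_labels):
    """Build one `docker node update` command per node that has labels.

    node_labels maps a node name to a dict of label key/value pairs.
    """
    commands = []
    for node, labels in sorted(node_labels.items()):
        if not labels:
            continue  # nothing to set on this node
        args = ["docker", "node", "update"]
        for key, value in sorted(labels.items()):
            args += ["--label-add", f"{key}={value}"]
        args.append(node)
        # Quote each argument so the command is safe to paste into a shell.
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands


# Hypothetical node names and labels, for illustration only.
for command in label_commands({"worker-1": {"amp.type.user": "true"}}):
    print(command)
```

Because the labels live in the swarm's node records and not on the instance, the same commands have to be replayed whenever the autoscaling group replaces an instance, which is the fragility noted above.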

Custom-made template

  • has to be thoroughly tested
  • one autoscaling group for the managers
  • one or more autoscaling groups for the workers
  • engine API available from the workers
  • AMP labels are set when the nodes join the swarm
  • the Docker version can be forced
  • Debian or Ubuntu (can be extended)

Once the CF stack is created, the core services can be deployed directly; they only need the secrets and configs to be deployed first. The full metric set will be available in Prometheus.