core: introduce "backends" to replace "remote state" (superset) and fix UX #11286

Merged
merged 18 commits into from
Jan 26, 2017


mitchellh
Contributor

😃 👋

This PR introduces "backends" as a concept to replace "remote state" (and support more features now and in the future). In the process, it ended up smoothing over a lot of rough edges and introducing a LOT more features necessary to support backends.

This PR introduces the following:

  • Backends to replace remote state as a more general concept
  • File-based configuration of backends (and therefore remote state)
  • No longer stores a "cache" of state locally if it is remote
  • Specified (tested) and safer behavior around plan files, remote state, and various combinations of state flags for apply.
  • Unified init command as a single source of setup for backends, module downloads, and file initialization.
  • Removal of the terraform remote subcommands since they're either unnecessary now (such as the always-confusing-pinnacle-of-great-UX remote config) or replaced (such as terraform state pull)

Very few prior tests were changed, though some changes were necessary. Those should be looked over more closely and questions asked if necessary. Many, many more tests were added, especially around the loading of backends which was a very under-tested part of Terraform prior to this (in that case: loading the terraform.Context).

Backwards Compatibility 👵

This PR removes the terraform remote commands as a method of configuring remote state.

Previous environments/states configured with remote state are fully compatible with this PR. A warning is shown that the user is using legacy remote state, but the environment will load and function as usual. Legacy remote state is deprecated, however, and support will be removed in a future version.

Otherwise, everything should behave as normal!

"Backend" ❓

"Backend" is the replacement term for "remote state". It includes the remote state functionality but also extends it with new functionality such as remote operations and, in the future, state locking, environments, and more.

The previous architecture of Terraform was like this:

┌───────────────────────────────────────────┐
│                                           │
│                    CLI                    │
│                                           │
└───────────────────────────────────────────┘
┌───────────────────────────────────────────┐
│                                           │
│        Core ("terraform" package)         │
│                                           │
└───────────────────────────────────────────┘

Backends make the architecture more like this:

┌────────┐┌────────┐
│        ││        │
│  CLI   ││  ...   │
│        ││        │
└────────┘└────────┘
┌───────────────────────────────────────────┐
│                                           │
│                 Backend                   │
│                                           │
└───────────────────────────────────────────┘
┌───────────────────────────────────────────┐
│                                           │
│        Core ("terraform" package)         │
│                                           │
└───────────────────────────────────────────┘

The backend is a single interface that the layers on top ("frontends") hook into. Backends are then expected to eventually interface with the core.

For local operations, this is exactly what you expect: converting the backend API into the Terraform core API and just shuttling data through. For future backends, we plan on introducing remote operations so you can run terraform apply against a remote server.
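To make the layering concrete, here is a minimal sketch in Go. It is not the code in this PR: `Operation`, `Backend`, `LocalBackend`, and `coreRun` are hypothetical names chosen for illustration, and the real interfaces in backend/ are richer. It only shows the shape of the idea: a frontend talks to one interface, and the local implementation converts those calls into core calls and shuttles data through.

```go
package main

import "fmt"

// Operation describes a request coming from a frontend such as the CLI.
// All names in this sketch are hypothetical, not Terraform's real types.
type Operation struct {
	Type string // "plan" or "apply"
	Dir  string // directory holding the configuration
}

// Backend is the single interface that frontends call into.
type Backend interface {
	Run(op Operation) (string, error)
}

// coreRun stands in for the Terraform core API.
func coreRun(action, dir string) string {
	return fmt.Sprintf("core: %s in %s", action, dir)
}

// LocalBackend translates backend calls into core calls and passes the
// result straight back to the caller.
type LocalBackend struct{}

func (LocalBackend) Run(op Operation) (string, error) {
	switch op.Type {
	case "plan", "apply":
		return coreRun(op.Type, op.Dir), nil
	default:
		return "", fmt.Errorf("unsupported operation %q", op.Type)
	}
}

func main() {
	// The frontend only knows about the Backend interface.
	var b Backend = LocalBackend{}
	out, err := b.Run(Operation{Type: "plan", Dir: "."})
	fmt.Println(out, err)
}
```

A backend that performs remote operations would satisfy the same interface but forward the operation to a server instead of calling the core in-process.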

Fixing Remote State 🤕

Going further, we wanted to use this opportunity to fix a lot of the issues around remote state. There were a few primary issues that we wanted to resolve:

  1. Initial configuration of remote state is not simple, intuitive, or enjoyable. The terraform remote config command is extremely unclear and the -var approach of passing in configurations is not scalable, not easy to share with team members, and not safe since it goes into your CLI history.

  2. Remote state caches the state locally in .terraform/terraform.tfstate. This introduces a lot of syncing complexity and also means sensitive state may remain on the local disk. One of the benefits of remote state is not storing potentially sensitive values on disk, but the cache defeated that.

  3. Remote state didn't always work with every terraform subcommand. There was a lot of special case handling within the CLI to handle remote state which caused this since remote state was bolted on later.

With backends, we wanted to fix these issues as we looked to adding more remote behavior to Terraform in the future. We had an opportunity to learn from our mistakes and see how the community uses Terraform and use that knowledge to design something we believe is much better.

Features (External)

terraform init 🔰

terraform init has always existed but is now supercharged to do much more. It is now the single command you run whenever you check out a new or existing Terraform environment. It is safe to run multiple times.

Terraform init now: copies configurations (optional), downloads any modules, and initializes the backend. All of these are optional and can be disabled with flags. The idea though is that any developer working with Terraform starts with terraform init and they're ready to work!

Init will interactively guide the developer through setting up the remote backend if one is configured. For example, the screenshot below shows the setup of Consul:

[Screenshot, 2017-01-18: interactive terraform init setup of the Consul backend]

With init, you can now use HCL files to configure backends, removing secrets from your CLI history and easing the burden of setup where previously you had to type complex CLI flags.

Backend Configuration in TF Config 🗄

The backend configuration now lives in the Terraform config. Example:

terraform {
  backend "consul" {}
}

The configuration is in the new terraform meta-configuration block introduced in Terraform 0.8. The above would configure Terraform to use "consul" to store state.

You can also include configuration within the {}, but it is optional. For example, if a backend requires access credentials, you can leave those out and use the interactive setup to set those. They will only be stored locally and don't need to be committed to version control.

Change Detection and State Migration 🕵️‍♀️

If you change the backend configuration at any point, Terraform detects this and tells you to rerun the init command. This will reconfigure your backend.

If your existing backend had state, Terraform will also ask if you want to copy your state to the newly configured backend. If you're configuring a new backend (for example moving from local to Consul or Consul to S3), then this allows you to easily take your state with you as part of the setup process. No more manual pushing/pulling to initialize state.
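One way such change detection could work is to fingerprint the backend configuration at init time and compare it on later runs. The sketch below assumes that approach; the PR text does not say how detection is implemented, and `BackendConfig`, `configHash`, and `needsInit` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// BackendConfig is a hypothetical, simplified view of a backend block.
type BackendConfig struct {
	Type   string            `json:"type"`
	Config map[string]string `json:"config"`
}

// configHash returns a stable fingerprint of the backend configuration.
// encoding/json writes map keys in sorted order, so equal configs hash equally.
func configHash(c BackendConfig) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// needsInit reports whether the current configuration differs from the one
// whose hash was recorded during the last init.
func needsInit(savedHash string, current BackendConfig) (bool, error) {
	h, err := configHash(current)
	if err != nil {
		return false, err
	}
	return h != savedHash, nil
}

func main() {
	old := BackendConfig{Type: "consul", Config: map[string]string{"path": "tf/state"}}
	saved, _ := configHash(old)

	changed := BackendConfig{Type: "s3", Config: map[string]string{"bucket": "example"}}
	stale, _ := needsInit(saved, changed)
	fmt.Println("rerun init:", stale)
}
```

When `needsInit` reports true, the CLI would stop and ask for init to be rerun, which is also the natural point to offer copying existing state to the new backend.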

No More Cached State 😕

The .terraform/terraform.tfstate file no longer caches state locally.

Put another way: when using remote state, the state is downloaded per-run and only used in-memory. It is never written to disk unless writing to the remote state fails.

The number of users getting value out of the local cache was extremely low. Those users can still use the advanced terraform state pull command to bring remote state local. But by default, we simplify the whole experience by never writing the remote state to local disk.

Cleaner .gitignore 🙈

You now always add .terraform/ to your .gitignore. There is never a good reason not to.

Prior to this PR, you'd have to put the state in .terraform in Git since it contained the remote state configuration. It was messy. Now, the configuration (or a partial configuration, to avoid secrets!) lives in your TF configuration, and the terraform init command automates and ensures the proper setup prior to configuration use.

More Safeguards 👮

Terraform state handling is now safer than ever. Multiple new safeguards have been introduced:

Lineage is checked. When initializing a new backend (local or not), we check if existing state is there with a different "lineage". A "lineage" is a unique ID assigned when the state is created. If two states have a different lineage, they are very likely different infrastructures. Terraform rightly freaks out in this scenario and gives you an error message.

Never allow pushing lower serial numbers. Every time state is updated, a serial number is incremented. Writing to remote state never allows writing a lower serial number if the remote end contains a higher serial number. There is a -force flag to force this behavior, but by default you don't want to do this.
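The lineage and serial rules above can be sketched together as a single guard. This is an illustration, not the PR's implementation: `State` and `checkWrite` are hypothetical, the two checks happen at different points in the real code (backend initialization versus remote writes), and treating an empty lineage as "unknown, allow" and letting `force` bypass only the serial check are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// State holds only the two fields relevant to these checks.
type State struct {
	Lineage string // unique ID assigned when the state is created
	Serial  int64  // incremented every time the state is updated
}

var (
	ErrLineageMismatch = errors.New("state lineage differs: likely a different infrastructure")
	ErrStaleSerial     = errors.New("refusing to write a lower serial over a higher one")
)

// checkWrite decides whether incoming may replace existing.
// force models the -force flag and, in this sketch, skips only the serial check.
func checkWrite(existing, incoming *State, force bool) error {
	if existing == nil {
		return nil // nothing to conflict with
	}
	// Assumption: an empty lineage means "unknown" and is not treated as a mismatch.
	if existing.Lineage != "" && incoming.Lineage != "" && existing.Lineage != incoming.Lineage {
		return ErrLineageMismatch
	}
	if !force && incoming.Serial < existing.Serial {
		return ErrStaleSerial
	}
	return nil
}

func main() {
	remote := &State{Lineage: "abc", Serial: 7}
	fmt.Println(checkWrite(remote, &State{Lineage: "abc", Serial: 8}, false))
	fmt.Println(checkWrite(remote, &State{Lineage: "abc", Serial: 5}, false))
	fmt.Println(checkWrite(remote, &State{Lineage: "xyz", Serial: 9}, false))
}
```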

Apply with a saved plan file doesn't write state. An apply with a saved plan file no longer writes state locally (unless it is configured to). This means it won't overwrite existing state that may be in the folder where you're applying the plan.

Apply with a saved plan disallows many flags, such as -state. This [and other weird combinations] "worked" prior to this PR. By "worked", I mean they did something, but their behavior was almost always surprising and very dangerous (blindly overwriting or reading state). This PR introduces safeguards around all this and only allows the obvious, straightforward usage of plans: no flags are allowed except -state-out to specify a custom place for the final state.

Future 🛰

In the future, there are more features we plan on supporting with this new backend interface. In many ways, the interfaces introduced in this PR were the minimum to support Terraform currently.

In the future, we plan on introducing: state locking, remote operations, better conflict/merge automation, and more. This lays important groundwork for all of that.

Guide for Review 😎

The order in which you should look at things:

  • backend/ contains the new interfaces
  • backend/local is the implementation for local behavior
  • backend/legacy is an implementation for legacy remote state
  • helper/schema was updated to have a framework for Backends
  • command/meta_backend.go contains the meat for loading backends, including legacy remote state. The meta_backend_test.go is a huge file containing tests for hopefully every imaginable case (but probably not, I tried!).

@mitchellh
Contributor Author

In the CI: the AWS provider is failing to compile. This is present in master currently, too. That package was not touched for this PR. If it gets fixed in master I'll rebase and repush.

@cemo

cemo commented Jan 19, 2017

Great work.

ping @brikis98

@brikis98
Contributor

Fantastic! Thanks for the ping @cemo!

@dayglojesus

holy cow

@apparentlymart
Contributor

(I didn't yet dig into the code so I apologise if this is obvious in there... will dig in soon)

The backend config syntax looks like I could configure multiple backends... is that the case? I find myself wondering about e.g. multiple environments from the same config, or a config that can be optionally applied locally to test before I apply it to the "real" state.

Also yay for doing something with lineage. I was feeling bad about adding that and then never finishing the code to make use of it! There is a PR somewheres for me to find and close on that.

@mitchellh
Contributor Author

@apparentlymart You cannot have multiple, it is validated in the config parsing. But a block still felt like the correct structure for that.

Member

@jbardin jbardin left a comment


I'm going to go with LGTM if tests pass after a rebase.

@mitchellh mitchellh force-pushed the f-remote-backend branch 5 times, most recently from 3573ad1 to a8d8d36 Compare January 21, 2017 17:49
@mitchellh
Contributor Author

I figured it out: the tests were failing because, after this change, go test ./... runs out of RAM in Travis. :( I've updated the Makefile to test 4 packages at a time in a loop instead, which seems to fix the issue. As a downside, Travis is slower. :(

@jbardin
Member

jbardin commented Jan 23, 2017

I made a small addition to the Makefile to install the test dependencies, which will cut the individual test times down significantly when splitting up the tests.

@mitchellh
Contributor Author

@jbardin Great! That makes a lot of sense. Thanks!

Backends are a mechanism that allows abstracting the behavior of
Terraform CLI from the actual core. This allows us to slip in special
behavior such as state loading, remote operations, etc.

The local backend implementation is an implementation of
backend.Enhanced that recreates all the behavior of the CLI but through
the backend interface.

This allows using legacy remote state backends with the new backend
interface.

This allows migration of the remote state implementations to a richer
experience including input asking.

This is a complex function that handles all the potential cases that can
happen with legacy remote state, new configurations, etc.
mitchellh and others added 5 commits January 26, 2017 14:33
We were running out of RAM on Travis
Running `go test -i` installs the requirements for a package's tests.
This way when running the tests in batches of 4, we can cut the runtime
in half by only compiling the dependencies once.
@brycefisher

brycefisher commented Feb 3, 2017

Apologies for not being fully up to speed on the release cycle -- are these changes in the compiled release binaries or only in master? Has the documentation been updated to reflect these changes?

EDIT: These changes seem amazing -- thanks @mitchellh!

@mitchellh
Contributor Author

mitchellh commented Feb 3, 2017

Hey @brycefisher, they will be part of 0.9.0. That is "master" currently, but won't be released as a final release for a bit. We expect betas in Feb. And edit here: Thanks :)

@joslynesser

@mitchellh this is an awesome step forward. I can finally remove a lot of functionality I've depended on from outside remote state wrappers with these changes and see a much simpler future for team collaboration. Cheers! 🍺

@jleclanche

Excellent changes @mitchellh - looks like terraform 0.9 is removing a huge pain point :) Thank you.

@jerger

jerger commented Mar 20, 2017

@mitchellh When using the new backend, how can I access the data from my other terraform modules:
data.terraform_remote_state.my_output_from_other_modules ?

@mitchellh
Contributor Author

@jerger The behavior should be the same as before, you can access root outputs, you can't access nested module output (this is documented and the behavior hasn't changed at all)

@jerger

jerger commented Mar 22, 2017

@mitchellh thanx for the fast answer. Do you plan to make configuration shareable between backend & remote_state?

@emoshaya

Hi all,

We've been using Terragrunt for a while now to manage state locking. Unfortunately, Terragrunt is not compatible with 0.9.0. Therefore, we now want to deprecate use of Terragrunt and migrate to 0.9.0, which fully supports state locking. However, we've noticed a few features that are available in Terragrunt but missing in Terraform 0.9.0. Please could you advise if these features will be available in future releases and, if so, are there any timelines?

The configuration I had in place was for the following:

S3 Backend for States and DynamoDB for State Locking

  • In Terragrunt, if the S3 bucket or DynamoDB table is missing, then Terragrunt would create them for you based on the configuration specified. This is missing in Terraform 0.9.0

  • In Terragrunt, if you ran a Terragrunt apply, it will poll the dynamoDB table for the lock and will retry based on the value assigned in "max_lock_retries: (Optional) The maximum number of times to retry acquiring a lock. Terragrunt waits 10 seconds between retries. Default: 360 retries (one hour)."

  • However, in Terraform 0.9.0 it doesn't retry or poll for lock to be released. Instead you get the following error:

Error locking state: Error acquiring the state lock: ConditionalCheckFailedException: The conditional request failed
    status code: 400, request id: LTCHNG5G8U10REB2A6DV39LO6BVV4KQNSO5AEMVJF66Q9ASUAAJG
Lock Info:
  ID:        a4728cc3-303d-4987-af4a-25a54e954907
  Path:      ce-cog-test-tfstates/devops-terraform/cog.test.tfstate
  Operation: OperationTypePlan
  Who:       josh@magpie
  Version:   0.9.1
  Created:   2017-03-22 13:19:29.60094345 +0000 UTC
  Info:

This is potentially a blocker for our CI builds as we have multiple CI builds running simultaneously at any given day and having the build just retrying to acquire the lock is a must to avoid failing builds.

@mitchellh
Contributor Author

Hey @emoshaya-cognito that's a really good point and we were just discussing introducing a -lock-timeout type option. We'll likely do that sooner than later.
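For readers following along, a lock timeout of the kind being discussed could look roughly like the Go sketch below. It is purely illustrative: no such option exists as of this comment, and `Locker`, `ErrLocked`, and `lockWithTimeout` are hypothetical names, not Terraform APIs. The idea is to keep retrying while the lock is held by someone else and give up once the timeout would be exceeded.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrLocked is a hypothetical sentinel meaning "someone else holds the lock".
var ErrLocked = errors.New("state is locked")

// Locker is a hypothetical stand-in for whatever acquires the state lock.
type Locker func() error

// lockWithTimeout retries lock until it succeeds, fails with an error other
// than ErrLocked, or another retry would run past the timeout.
func lockWithTimeout(lock Locker, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := lock()
		if err == nil || !errors.Is(err, ErrLocked) {
			return err // acquired, or failed for an unrelated reason
		}
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("lock not acquired within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Simulated lock that is released after two failed attempts.
	lock := func() error {
		attempts++
		if attempts < 3 {
			return ErrLocked
		}
		return nil
	}
	err := lockWithTimeout(lock, time.Second, 10*time.Millisecond)
	fmt.Println(attempts, err)
}
```

For concurrent CI builds like the ones described above, a retry loop of this shape would let a build wait for another build's lock to clear instead of failing immediately.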

@babatundebusari

babatundebusari commented Jun 14, 2017

@mitchellh

You mentioned

The backend configuration now lives in the Terraform config

but you never mentioned what the name of this config file is and where this file is located

Thanks

@ghost

ghost commented Apr 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 8, 2020