Skip to content

Latest commit

 

History

History
53 lines (46 loc) · 2.48 KB

HISTORY.rst

File metadata and controls

53 lines (46 loc) · 2.48 KB

WAL-E's History

WAL-E's development began Heroku in early 2011, commissioned by the Department of Data with the intent of ensuring that appreciable data loss on Heroku would have to be accompanied by spectacularly obscure mechanisms: it was not acceptable to lose a lot of data in common cases of media failure or operator error. The subsequent replication-related "fork" and "follow" features did not exist. The "Heroku Postgres" product was not yet a concept.

Within one month after the first commit of WAL-E, the very earliest versions were deployed to Heroku production, v0.2.0. At this point, it was somewhat unclear that WAL-E even worked, which was to be determined at leisure following deployment. However, within two weeks, the large April 30th, 2011 AWS crash took place, and this was the first of several large-scale disaster recoveries that took place under similar circumstances at Heroku. This fortuitous timing of events — including WAL-E happening to work on the first try — exhibits a rare circumstance where cutting the corners of engineering practice to deploy slightly earlier prevented enormous heartache for Heroku staff and its customers.

At the time, WAL-E was little more than a shell-script-esque Python program wrapping around s3cmd. It had terrible error messages, blunt error handling — sometimes none, with hazardous results, and lacked parallelism. However, by this time, the critical four operators and one critical concept of WAL-E had gelled and been shown to work well:

  • wal-push
  • wal-fetch
  • backup-push
  • backup-fetch
  • WALE_S3_PREFIX

Also in April of 2011, the v0.5 series was released. Notably, this was the first version possessing a stable version of the storage format in S3, version '005'.

By May of 2011, it became clear that a lot more control over error handling and exceptions would be necessary to make WAL-E an ergonomic tool not hated by all its operators. It was also desirable to use less aggregate virtual memory, as tended to occur when fork()-ing subprocesses. The was the cause of major re-positioning to use boto and gevent in WAL-E. Although started in May of 2011, the bulk of development took place in September and October of 2011, culminating in v0.5.8. This version is the first that is looks recognizably like WAL-E would for quite some time to come.

In January of 2013, WAL-E inducted its first committers beyond the principal author at Heroku, and its project development gained a distinct identity from Heroku.