This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

add helm v2 chart for patroni with petset #57

Merged
merged 9 commits into from
Sep 26, 2016

Conversation

Contributor

@linki linki commented Sep 2, 2016

Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul

Patroni is a template for you to create your own customized, high-availability solution using Python and a distributed configuration store like ZooKeeper, etcd or Consul.

This directory contains a Kubernetes chart to deploy a five node patroni cluster using a petset.

It depends on a running etcd cluster for distributed shared state which can be installed using the etcd chart: https://github.com/kubernetes/charts/tree/master/incubator/etcd

Please find out more about patroni at https://github.com/zalando/patroni

Caveat: Currently this chart can only be deployed in Kubernetes' default namespace to function.

@viglesiasce
Contributor

Awesome stuff @linki!!! You can vendor in your dependent chart (etcd) by putting it in the "charts" directory of your chart. This way the patroni chart will be useful out of the gate without extra setup steps.

The vendoring process will improve so that it is less heavy handed in the future. For example when we get helm/helm#874
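
A minimal sketch of the vendoring step described above, assuming both charts sit in a local checkout. The helper name `vendor_chart` and the example paths are illustrative assumptions, not part of Helm; the only convention taken from the discussion is that a dependent chart goes into the parent chart's `charts` directory.

```bash
# vendor_chart DEP_CHART_DIR PARENT_CHART_DIR
# Copies a dependency chart into the parent chart's charts/ directory.
# Hypothetical helper for illustration; not a Helm command.
vendor_chart() {
  dep_dir=$1
  parent_dir=$2
  name=$(basename "$dep_dir")
  # A chart directory is recognised here by its Chart.yaml file.
  if [ ! -f "$dep_dir/Chart.yaml" ]; then
    echo "not a chart directory: $dep_dir" >&2
    return 1
  fi
  mkdir -p "$parent_dir/charts" || return 1
  # Remove any previously vendored copy so stale files do not linger.
  rm -rf "$parent_dir/charts/$name"
  cp -R "$dep_dir" "$parent_dir/charts/$name"
}
```

With the repository layout mentioned in this thread, a call might look like `vendor_chart incubator/etcd incubator/patroni`.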

@linki
Contributor Author

linki commented Sep 5, 2016

@viglesiasce thanks for pointing out the vendoring feature. I vendored etcd and have patroni use it by default now.

@viglesiasce
Contributor

Thanks @linki!! I'm on holiday today but will give it a look tomorrow.


```bash
$ helm delete <release-name>
$ kubectl delete petset,po,pvc,svc,secret -l release=<release-name>
```
Contributor

@chrislovecnm chrislovecnm Sep 5, 2016

This won't work btw. Let's check this bashfu into here:

$ grace=$(kubectl get po web-0 --template '{{.spec.terminationGracePeriodSeconds}}')
$ kubectl delete petset,po -l app=nginx
$ sleep $grace
$ kubectl delete pvc -l app=nginx

Update nginx with your petset name. Also, the code block should be "console" not "bash".
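
The same sequence can be wrapped in a small function so the label selector and pod name are not hard-coded. This is only a sketch: `teardown_petset` is a hypothetical name, and it assumes that every pod in the set shares the grace period of the pod passed in.

```bash
# teardown_petset SELECTOR POD_NAME
# Deletes the petset and its pods, waits out the termination grace
# period, then deletes the persistent volume claims.
# Hypothetical wrapper around the commands quoted above.
teardown_petset() {
  selector=$1   # label selector, e.g. app=nginx
  pod=$2        # any pod belonging to the set, e.g. web-0
  # Read the grace period before the pod is deleted.
  grace=$(kubectl get po "$pod" --template '{{.spec.terminationGracePeriodSeconds}}') || return 1
  kubectl delete petset,po -l "$selector" || return 1
  # Give the pods time to shut down before removing their volumes.
  sleep "$grace"
  kubectl delete pvc -l "$selector"
}
```

For the nginx example this would be invoked as `teardown_petset app=nginx web-0`.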

Contributor

The original bash worked fine for me. How did it fail for you @chrislovecnm?

Contributor Author

fixed to take terminationGracePeriodSeconds into account

@viglesiasce
Contributor

1 nit on the values file, but the rest looks good from a code perspective. That said, I can't figure out how to connect using psql.

@linki,

Can you provide an example of how to get started so I can verify the functionality?

In a follow up PR I'd like to ask that you add a NOTES.txt file so that after installation users know how to connect to patroni. Here is what I did for the Jenkins chart:
https://github.com/viglesiasce/charts/blob/6d39e6dbe5f8698255bd47eff2d1ccaf2db948f4/incubator/jenkins/templates/NOTES.txt

@viglesiasce
Contributor

@k8s-bot test this

@linki
Contributor Author

linki commented Sep 16, 2016

@viglesiasce Thanks for your suggestions. I added documentation on how to connect to Postgres in the readme. I would love to get your feedback.

The NOTES.txt approach didn't work for me (helm sends it to Kubernetes as a manifest on helm install).

I also reacted to the remaining open issues we discussed.

@chrislovecnm
Contributor

@viglesiasce where are we at with this one?

@viglesiasce
Contributor

Hey @linki

Thanks for the updates. NOTES.txt will only work with alpha.4+. I will retest now and if all things go as planned will merge in. Can you add NOTES.txt as a follow on PR? It will be required to move from incubator->stable.

@chrislovecnm
Contributor

@linki let me rephrase what @viglesiasce asked. Can you please put in an issue to update NOTES.txt. Much appreciated!

@viglesiasce
Contributor

@linki

I was not able to get failover to work. I did the following:

  1. Install chart with all default values
  2. Connect to it as mentioned in the README
  3. Create a DB, create some tables
  4. Kill the master pod

After that point no endpoints were added back to the service although the pod was rebuilt. The following is in the logs:

DETAIL: The exception type is <class 'oauth2client.client.ApplicationDefaultCredentialsError'> and its value is An error was encountered while reading json file: /etc/patroni/gcloud-credentials.json (pointed to by GOOGLE_APPLICATION_CREDENTIALS environment variable): Expecting property name enclosed in double quotes: line 5 column 3 (char 83) and its traceback is   File "/usr/local/lib/python3.4/dist-packages/wal_e/retries.py", line 62, in shim
            return f(*args, **kwargs)
          File "/usr/local/lib/python3.4/dist-packages/wal_e/blobstore/gs/utils.py", line 98, in download
            blob = _uri_to_blob(creds, url)
          File "/usr/local/lib/python3.4/dist-packages/wal_e/blobstore/gs/utils.py", line 28, in _uri_to_blob
            conn = calling_format.connect(creds)
          File "/usr/local/lib/python3.4/dist-packages/wal_e/blobstore/gs/calling_format.py", line 16, in connect
            credentials = get_credentials()
          File "/usr/local/lib/python3.4/dist-packages/gcloud/credentials.py", line 82, in get_credentials
            return client.GoogleCredentials.get_application_default()
          File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 1288, in get_application_default
            return GoogleCredentials._get_implicit_credentials()
          File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 1273, in _get_implicit_credentials
            credentials = checker()
          File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 1248, in _implicit_credentials_from_files
            extra_help, error)
          File "/usr/local/lib/python3.4/dist-packages/oauth2client/client.py", line 1446, in _raise_exception_for_reading_json
            credential_file + extra_help + ': ' + str(error))
          There have been 4628 attempts to fetch wal file gs://some-google-bucket/spilo/exiled-snake-patroni/wal/wal_005/00000002.history.lzo so far.

Is the GCS integration required? Is there a way to turn it off by default?
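
The message in the log is a JSON parse error raised while reading the credentials file; a trailing comma inside an object is one common cause of that wording. A quick way to check such a file before mounting it is sketched here, assuming `python3` is on the PATH; `check_json` is a hypothetical helper name.

```bash
# check_json FILE
# Returns 0 if FILE parses as JSON, non-zero otherwise.
# Assumes python3 is installed; json.tool ships with its standard library
# and prints the parse error (with line and column) to stderr on failure.
check_json() {
  python3 -m json.tool "$1" > /dev/null
}
```

For example, `check_json gcloud-credentials.json && echo ok` prints `ok` only when the file parses.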

@chrislovecnm
Contributor

@viglesiasce my question about GCE is bigger than that. How is Patroni integrating with GCE, and what is it getting from that integration? Do we need to integrate with EC2? How does this run on bare metal and vSphere?

@viglesiasce
Contributor

@chrislovecnm Patroni uses object storage for backup/restore. If it were to be integrated into AWS it would leverage S3 for the same use case.

@chrislovecnm
Contributor

@viglesiasce and vSphere and Bare Metal? Does it have support there?

@viglesiasce
Contributor

Minio and RiakCS can be used on premise to provide S3 compatible object storage.

@linki
Contributor Author

linki commented Sep 22, 2016

@viglesiasce @chrislovecnm thanks for the hint. Created the issue: #83

@linki
Contributor Author

linki commented Sep 22, 2016

@viglesiasce @chrislovecnm thanks for trying it out and the healthy discussion.

Yes, the chart currently requires GCE credentials and the default is invalid. We specifically targeted GCE for the first iteration. Judging by the error message, this is likely your issue.

Patroni is also capable of leveraging S3 for the same purpose. It's also possible to run it without any backup location. In those cases restore times significantly increase and consequently reliability suffers.

Not sure if it supports a mounted volume, but I'll find out.

How about we turn the Google cloud storage integration off by default and then add the possibility to opt-in to GCS and S3 storage in separate PRs?

@viglesiasce
Contributor

@linki that plan sounds great. Let's err on the side of functional but not performant to start. With the follow up PRs you can also leverage just the S3 implementation and use GCS' interop mode.

Thanks again!

@linki
Contributor Author

linki commented Sep 23, 2016

@viglesiasce I changed the chart to not use GCS/S3 by default. You should be able to test it now.

two gotchas:

  • when connected to postgres (via the service url) and the current master fails, the connection gets disrupted (explicit reconnect required)
  • works only in default namespace (we have a fix, but no pushed image yet, zalando/spilo@7006d04)

known bug:

  • very long release names result in a wrong default ETCD_DISCOVERY_DOMAIN env var, not allowing patroni to find etcd.
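
For the first gotcha, a client-side retry loop is one way to handle the explicit reconnect. The sketch below is a generic helper, not something shipped with the chart; the name `retry` and its argument order are assumptions.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# Hypothetical helper for illustration.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}
```

A reconnect after failover could then look like `retry 10 3 psql -h <service-host> -U <user> -c 'SELECT 1'`, where the host and user placeholders stand for whatever the README describes.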

@chrislovecnm
Contributor

when connected to postgres (via the service url) and the current master fails, the connection gets disrupted (explicit reconnect required)

Any idea what is going on there?

Contributor

@viglesiasce viglesiasce left a comment

Worked as expected this time around @linki!!! Thanks so much for the effort, looks great.

Don't forget to submit another PR with your NOTES.txt

@viglesiasce viglesiasce merged commit 7932838 into helm:master Sep 26, 2016
@lasomethingsomething

@viglesiasce Hi! @mikkeloscar created #83. Let us know what else you need from us!

viglesiasce pushed a commit to viglesiasce/charts that referenced this pull request Oct 8, 2016
* feat(patroni): add helm v2 chart for patroni with petset

* chore(patroni): vendor etcd chart

* feat(patroni): have patroni use vendored etcd by default

* fix(patroni): use console instead of bash

* fix(patroni): make termination commands more reliable

* docs(patroni): add documentation to the configurable values

* docs(patroni): describe how to access postgres

* docs(patroni): add some details about how patroni on k8s works

* fix(patroni): do not require GCS by default
@linki linki deleted the feat-add-patroni branch November 7, 2016 15:35

6 participants