Issues with external PostgreSQL on AWS RDS #270
Comments
Hi @miaeyg. The steps that you've followed to download the AWS CA file in PEM format are what you need to fix the error. Since you are using the DAC docker container and specifying the volume mapping as |
Hi @dhoucgitter, tried your suggestion as well and got the same results (Java SSL exceptions in all SAS services) |
Hi @miaeyg, did you run DAC with the "viya,uninstall" tags to remove your previous deployment (if you haven't already), then reinstall with "viya,install"? Also scan the initial lines of the docker run output for the string "Running: ansible-playbook" to inspect the value being set for V4_CFG_TLS_TRUSTED_CA_CERTS, as sketched below. |
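A minimal sketch of that uninstall/reinstall flow, assuming the usual viya4-deployment Docker invocation (the image name, mount paths, and log file name here are illustrative and should be adapted to your environment):

```bash
# Tear down the previous deployment, then redeploy (paths are placeholders).
docker run --rm --group-add root --user "$(id -u):$(id -g)" \
  --volume "$HOME/deployments:/data" \
  --volume "$HOME/deployments/ansible-vars.yaml:/config/config" \
  --volume "$HOME/.kube/config:/config/kubeconfig" \
  viya4-deployment --tags "viya,uninstall"

docker run --rm --group-add root --user "$(id -u):$(id -g)" \
  --volume "$HOME/deployments:/data" \
  --volume "$HOME/deployments/ansible-vars.yaml:/config/config" \
  --volume "$HOME/.kube/config:/config/kubeconfig" \
  viya4-deployment --tags "viya,install" 2>&1 | tee dac-install.log

# Check what the playbook was actually invoked with, including the
# trusted CA certs variable.
grep "Running: ansible-playbook" dac-install.log
grep "V4_CFG_TLS_TRUSTED_CA_CERTS" dac-install.log
```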
Hi @dhoucgitter, each time I do a test I always uninstall and reinstall. One of the generated files is site-config/vdm/security/cacerts/il-central-1-bundle.pem. The contents of the second file are:
The second file is also included in the "kustomization.yaml" like this:
Note this line:
So it looks like the AWS CA PEM file was captured in the DaC docker and included in the correct places. |
In addition to my previous reply, I want to point out that the generated file "site-config/vdm/generators/customer-provided-ca-certificates.yaml" contains what looks like a syntax error, "cacerts1.pem=", in its last line:
The source of this possible syntax error is here: https://github.com/sassoftware/viya4-deployment/blob/4ef910891e6d6e6dca0bccbb7fd013c93a589ae0/roles/vdm/templates/generators/customer-provided-ca-certificates.yaml#L8. By the way, I tried to fix this by removing the "cacerts1.pem=" text and it did not help. |
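For reference, a well-formed kustomize ConfigMapGenerator entry pairs a key with a file path, so a trailing "cacerts1.pem=" with nothing after the "=" does look suspicious. The snippet below is only a generic illustration of that kustomize syntax under assumed names and paths, not necessarily what DAC emits:

```yaml
# Illustrative only; the generator name and file path are assumptions.
apiVersion: builtin
kind: ConfigMapGenerator
metadata:
  name: sas-customer-provided-ca-certificates
behavior: merge
files:
  # key=path: the key becomes the file name inside the ConfigMap and the
  # path points at the PEM bundle relative to the kustomization root.
  - cacerts1.pem=site-config/vdm/security/cacerts/il-central-1-bundle.pem
```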
The DAC generated
format as well so I do not expect that is causing the problem. |
Hi @miaeyg, thanks for noticing that and good catch. The description for the storage_size input variable should indicate that the units are in GB, not MB. I will open a ticket to fix that doc reference. Will also handle adjusting the default size if needed.
|
Thanks.
Note the default PVC size for internal PG is 128GB, so one would expect the default size for the external PG to be the same.
About the SSL problem, I think I know the reason but can only be sure later today. I believe I downloaded the CAs for the wrong region... I have two accounts in different regions and I accidentally downloaded the CAs for the other region. Quite sure this is the root cause. Will update again once I test with the correct region.
|
Hi @dhoucgitter, about the SSL problem - my bad! I was using the wrong AWS CA PEM file: I mistakenly downloaded the PEM for a different region than the one my RDS was running in (I have two accounts in different regions, so I downloaded the PEM appropriate for the second account). Got it to work now, sorry for the mess. I have another related question - the generated RDS instance is marked as "Publicly accessible=yes". What if I do not want to make it public? How can I make it private rather than public using this Terraform project? Is it supported to have it non-public? Also, how can I control which AZ the generated RDS instance is created in using this project? SAS recommends that the RDS instance be in the same AZ as the SAS instances to avoid latency and improve performance. Also, it seems like the RDS instance generated by this project uses the following CA from AWS: " |
Hi @miaeyg, where do you see that your RDS instance is using the rds-ca-2019 certificate authority? |
Hi @dhoucgitter, I have already destroyed my environment, so I do not have a screenshot for you, but I went to RDS, clicked on the DB instance, and in the details screen there was a "Certificate authority" field whose value was "rds-ca-2019". I am sure about it. I created the RDS instance in the "us-east-1" region. Is this the same region you are testing in? Is there an option to control the CA used for the RDS instance within any of the parameters/options of the "postgres_servers" variable in the terraform.tfvars file? Here is my "postgres_servers" section from terraform.tfvars:
|
Found an AWS blog entry that seems relevant to the discussion:
|
Thanks @dhoucgitter! I have another related question: the generated RDS instance is marked as "Publicly accessible=yes". What if I do not want to make it public? How can I make it private rather than public using this Terraform project? Is it supported to have it non-public? Also, how can I control which AZ the generated RDS instance is created in using this project? SAS recommends that the RDS instance be in the same AZ as the SAS instances to avoid latency and improve performance. |
Hi @miaeyg, the CA being assigned to new RDS instances seems to be different in different regions. I have opened an AWS case asking about the discrepancy between RDS instances created in us-west-1, where they are assigned the
The RDS instance AZs are controlled by the AZs of the subnets that the RDS instance is using. The last few sentences of the first paragraph in the Subnet Requirements section help explain the possibilities. |
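While the AWS case is open, the CA attached to an existing instance can be checked and, if desired, changed outside of Terraform with the AWS CLI. A hedged sketch (the instance identifier and target CA name are placeholders; review the maintenance implications before modifying a live instance):

```bash
# Show which certificate authority the instance currently uses.
aws rds describe-db-instances \
  --db-instance-identifier my-viya-default-pg \
  --query 'DBInstances[0].CACertificateIdentifier'

# Move the instance to a newer CA if needed.
aws rds modify-db-instance \
  --db-instance-identifier my-viya-default-pg \
  --ca-certificate-identifier rds-ca-rsa2048-g1 \
  --apply-immediately
```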
Hi @dhoucgitter, thanks. The doc you pointed me to says:
Can I assume that the first database subnet (and hence the corresponding AZ) in the list provided to the terraform.tfvars file will be used for the RDS master node, and the second database subnet in the list will be used for the RDS standby node? If not, then I do not understand how I can force the RDS master to be in the same AZ as the private subnet (where all SAS components reside). |
Hi @miaeyg, as you described, the behavior that I have observed is that RDS deploys the PostgreSQL database instance to the AZ corresponding to the first subnet in the list, whether that is the public subnet list or the database subnet list. I have not experienced or forced a DB failover; however, my expectation is that if a local AZ event caused the running PostgreSQL DB to fail, RDS would start up a new instance in the AZ corresponding to the second subnet in the list, as you describe. |
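Based on that observed behavior, one way to bias placement is to list the database subnet that shares the AZ of the private (SAS) subnet first. A hedged terraform.tfvars sketch for the BYON case (subnet IDs and AZs are placeholders; the subnet_ids map follows the project's existing-subnet documentation, and, as noted later in this thread, RDS does not always honor the ordering):

```hcl
# BYON: put the database subnet that shares the private subnet's AZ first.
subnet_ids = {
  "public"   = ["subnet-0aaa1111aaa1111aa"]   # il-central-1a
  "private"  = ["subnet-0bbb2222bbb2222bb"]   # il-central-1a (SAS node pools)
  "database" = [
    "subnet-0ccc3333ccc3333cc",               # il-central-1a (intended primary AZ)
    "subnet-0ddd4444ddd4444dd"                # il-central-1b (second AZ required by RDS)
  ]
}
```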
Hi @dhoucgitter, thanks. I think this info should be reflected in the project's docs: to keep the RDS instance in the same AZ as the SAS components for performance reasons, the first database subnet should be in the same AZ as the private subnet. Still waiting on the last question about how I can make the RDS instance non-public. Thanks for all your kind help! |
Hi @dhoucgitter, in addition to my previous comment from yesterday, there is one more question that came up: when setting up the "default" and "CDS" PostgreSQL as external PostgreSQL, meaning as AWS RDS database instances, is it possible to set up only one database instance for both and have the two databases (default + CDS) reside within this single database instance? I believe that using the supplied examples in this project will result in two database instances, each with one database; correct me if I am wrong. |
Hi @miaeyg,
You are right, the intent is to set up two separate database instances, each with its own database. There may be additional concerns, but one reason for the separate instance is to provide a dedicated, and thus higher-performance, path for the applications that plan to rely upon the CDS instance, without suffering any latencies that may occur for those apps when using the default DB instance that most other Viya applications may already be using. |
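For illustration, a hedged terraform.tfvars sketch of that two-instance layout (the "cds-postgres" key name is illustrative, and the field names follow the postgres_servers section of CONFIG-VARS; check them against your IAC version):

```hcl
# Two separate RDS instances: one for the platform ("default") and a
# dedicated one for Common Data Store.
postgres_servers = {
  "default" = {
    instance_type = "db.m5.xlarge"
    storage_size  = 128   # GB
  }
  "cds-postgres" = {
    instance_type = "db.m5.xlarge"
    storage_size  = 128   # GB
  }
}
```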
Hi @miaeyg, circling back around on the unanswered question.
Yes, IAC supports creation of a private (publicly_accessible=false) RDS instance. If your .tfvars content specifies a CIDR range for either |
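A hedged sketch of tfvars settings that keep the instance non-public under that logic (variable names are taken from the CONFIG-VARS document; confirm them against your IAC version):

```hcl
# Leave the public-access CIDR lists empty so the RDS instance is created
# as non-public; it is then reachable only from inside the VPC.
default_public_access_cidrs  = []
postgres_public_access_cidrs = []
```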
Thanks @dhoucgitter for answering this leftover question and all my other previous RDS questions! |
Hi @dhoucgitter, I am reopening this issue - maybe you will have an idea. I just deployed SAS Viya on AWS using this project and noticed that the RDS instance was deployed in the AZ corresponding to the second subnet (not the first, as we assumed) in the "database" list. I am using BYON scenario #2. The question is whether and how I can control which AZ RDS deploys to. Here are screenshots of what I noticed. Note the RDS instance was deployed to the il-central-1b AZ: The terraform.tfvars includes this: And you can see that the first subnet in the "database" list is in the il-central-1a AZ: I guess this is bad from a performance perspective. Any idea what happened and how to control this behavior of RDS deployment? Here is your reply about this from some time ago:
|
Hi @miaeyg, what happens if you remove the second subnet and only supply the first one? Is RDS able to successfully place the Postgres instance there? This is only to verify that RDS can create the instance there and that the subnet ID is correct along with the other bits, not a permanent solution. |
Hi @dhoucgitter, your suggestion to supply only one database subnet fails to create an RDS instance, with this error:
It seems like supplying 2 subnets in two different AZs is an AWS requirement, but the AWS RDS instance is not necessarily placed in the first database subnet provided to Terraform, as I showed in my previous comment. This can leave the SAS private subnet in AZ1 and the RDS instance in AZ2, which I understand is bad performance-wise. How can we force AWS to place the RDS instance in the same AZ as the SAS private subnet? |
Hi @dhoucgitter, did you see my previous update from Jun 20th? Do you have any suggestion for this? |
I have an issue and a question about using an external PostgreSQL on AWS (RDS) instead of the SAS internal default PostgreSQL (Crunchy):
1. I think there is an error in the description of the "storage_size" parameter described here: https://github.com/sassoftware/viya4-iac-aws/blob/main/docs/CONFIG-VARS.md#postgresql-server
The description says: "Max storage allowed for the PostgreSQL server in MB", but looking at https://registry.terraform.io/modules/terraform-aws-modules/rds/aws/latest, the parameter "allocated_storage" (which uses the "storage_size" input var) is measured in GB, not MB: "The allocated storage in gigabytes". In addition, the parameter's default value of 50 seems low compared to the Crunchy default value, which is 128GB.
2. I installed SAS with an external PostgreSQL and used the default value for "ssl_enforcement_enabled", which is "true", and now my SAS services do not start. I see in the log of sas-logon-app (as an example) the Java SSL exception below. The question is how do I fix this exception?
java.lang.IllegalStateException: org.postgresql.util.PSQLException: SSL error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at com.sas.commons.db.postgres.PostgresDbInitializer.doCreate(PostgresDbInitializer.java:199)
at com.sas.commons.db.postgres.PostgresDbInitializer.createSchema(PostgresDbInitializer.java:158)
I downloaded the AWS CA in PEM format for my region from https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html#UsingWithRDS.SSL.RegionCertificates and saved in a local directory then when I invoked the DaC project I added this option to the invocation of the viya-deployment Ansible project:
--volume /home/sasinst/deployments/aws-ca:/config/v4_cfg_tls_trusted_ca_certs viya4-deployment
and added this line to ansible-vars.yaml: V4_CFG_TLS_TRUSTED_CA_CERTS: "/home/sasinst/deployments/aws-creds"
but it did not help.
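For reference, a hedged end-to-end sketch of the CA-bundle approach described above (region, host paths, and image name are illustrative; the directory mounted at /config/v4_cfg_tls_trusted_ca_certs, as suggested earlier in the thread, is the path the container actually sees, so any value referenced by the deployment must be valid inside the container rather than on the host):

```bash
# Download the regional RDS CA bundle (region and paths are placeholders).
mkdir -p /home/sasinst/deployments/aws-ca
curl -fsSL -o /home/sasinst/deployments/aws-ca/il-central-1-bundle.pem \
  https://truststore.pki.rds.amazonaws.com/il-central-1/il-central-1-bundle.pem

# Run DAC with the bundle directory mounted at the /config path for the
# trusted CA certs variable, per the volume mapping suggested above.
docker run --rm --group-add root --user "$(id -u):$(id -g)" \
  --volume /home/sasinst/deployments:/data \
  --volume /home/sasinst/deployments/ansible-vars.yaml:/config/config \
  --volume /home/sasinst/.kube/config:/config/kubeconfig \
  --volume /home/sasinst/deployments/aws-ca:/config/v4_cfg_tls_trusted_ca_certs \
  viya4-deployment --tags "viya,install"
```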