Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meeting with Four Points and AWS #3769

Closed
2 of 5 tasks
PaulClark2 opened this issue May 15, 2019 · 4 comments
Closed
2 of 5 tasks

Meeting with Four Points and AWS #3769

PaulClark2 opened this issue May 15, 2019 · 4 comments
Assignees
Milestone

Comments

@PaulClark2
Copy link
Contributor

PaulClark2 commented May 15, 2019

What we are after: As website user, I want data pages to load quickly so that I can see the information I'm interested in seeing and go on with my day.

Why are we after it: We want to be able to present data to users quickly even when they are searching across multiple two-year periods.

Completion criteria:

  • Have meeting
  • Produce list of next steps

Examples:

  • n/a

General technical

  • n/a

Technical steps

  • Provide existing architecture
  • Provide slow queries
  • Contract for support
@PaulClark2
Copy link
Contributor Author

PaulClark2 commented May 20, 2019

  • Set up call with RDS expert (Priyanka)

Next 2 weeks:

  • Set up meeting to review Landing Zone and Security Workstreams (Tyler)
  • Set up meeting to scope out DR solution (Tyler)
    • Landing Zone, Security Workstreams and DR discussion will occur during the same meeting

Next 4-6 weeks:

  • Review immersion day topics/schedule (Priyanka)
  • Set up security workshop (Tyler/Kitty)

Tyler Penn
Amazon Web Services
E-mail: tylepenn@amazon.com
Phone: 571-334-1680

@PaulClark2
Copy link
Contributor Author

  • Our current Postgres DB engine is very much under utilized. This setup can handle much more load than it is getting right now.
  • Lack of Query Performance caused by poorly written SQL statements (AWS consultant does not offer user query tuning - need to seek help from 4-Points on this)
  • No tuning necessary on DB parameters and/or DB resources.
  • It is best to create a another MV combining fec_fitem_sched_a table and ofec_committee_history_mv (avoid making double outer joins to get committee and contributor information)
  • Reduce number of columns return from queries
  1. SQL Alchemy is not the best tool for generating reports due to the fact that it generates dynamic SQL that may not be optimized. This can have exponentially negative effects as the reports get larger. Case in point, the query I rewrote returned in ~100 seconds versus ~1,200 seconds when generated by SQL Alchemy. You may want to consider having purpose-built fact tables for each type of report in which the attributes are projected into the table. This could be a materialized view or regular table in PostgreSQL. And pass the search criteria as bind variables to the query. The queries themselves would be managed and controlled as source code.

  2. You may want to consider leveraging Redshift with multiple instances of the same fact tables with different distribution and sort keys to accommodate different query predicate patterns. Redshift is significantly faster than RDS for this type of workload because it's an MPP columnar database.

  3. When a user makes a request to download a large data set, have that dataset generated in the background into an S3 bucket instead of the website. Once the data has been generated in the S3 bucket, a notification can be sent to the user via SNS with the link to download the dataset. This is a common approach to sharing large data set.

@rjayasekera
Copy link
Contributor

Hi Rohan,

Hope the meeting with Stephen has been good for you. I was wondering if you would like to see a Redshift demo which will give you an idea on comparison between how you are currently doing things in RDS to what it could look like and how does that make it better for you. From there, we could hopefully draw a conclusion as to which technology is better suited for your use case. If you would like to see this, please let me know and I can set this up with the Specialist.

Thanks a ton!

Priyanka Mahankali
AWS Solutions Architect
Federal Civilian Customers
Washington D.C, US
E: mahanka@amazon.com | M: +1 571 241 3215

I'll coordinate with Priyanka to have a RedShift demo for us next couple of weeks

@dorothyyeager
Copy link
Contributor

Next steps have been determined and future meetings and a demo have been set up. See #3769 (comment) and following comments above.

Work is complete, so closing this ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants