Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve slowest queries by adding a new index composited schedue_a for desc sorting #4352

Closed
1 task
dzhang-fec opened this issue May 13, 2020 · 3 comments · Fixed by #4386
Closed
1 task
Assignees
Milestone

Comments

@dzhang-fec
Copy link
Contributor

dzhang-fec commented May 13, 2020

What we're after:
Improve slowest queries by adding a new index composited schedue_a for desc sorting

Completion criteria:

  • After making change, test to confirm
@dzhang-fec
Copy link
Contributor Author

dzhang-fec commented May 14, 2020

summary:
make a new composite index on desc orders to use index on sorting instead of
Sort Method: quicksort Memory: 1867kB"
to
Sort Method: using index idx_sched_a_2019_2020_cmte_id_dt_sub_id_desc
(note: it depends the DB optimizer to pick one, but there is no any hurts to the performance)
result; 2+times faster.
it takes about 12mins to creare a new one.

Top 1 sqls
Problem order by DESC on contb_receipt_dt DESC and sub_id DESC
but index on ASC order, which reduces the sql performance without on each schedule_a hit.

e.g.,

select ... from ( SELECT disclosure.fec_fitem_sched_a.cmte_id AS cmte_id,
disclosure.fec_fitem_sched_a.rpt_yr AS rpt_yr,
disclosure.fec_fitem_sched_a.rpt_tp AS rpt_tp,
disclosure.fec_fitem_sched_a.image_num AS image_num,
disclosure.fec_fitem_sched_a.line_num AS line_num,
disclosure.fec_fitem_sched_a.tran_id AS tran_id,
disclosure.fec_fitem_sched_a.file_num AS file_num,
disclosure.fec_fitem_sched_a.cmte_nm AS cmte_nm,
disclosure.fec_fitem_sched_a.entity_tp AS entity_tp,
disclosure.fec_fitem_sched_a.entity_tp_desc AS entity_tp_desc,
disclosure.fec_fitem_sched_a.contbr_id AS contbr_id,
disclosure.fec_fitem_sched_a.contbr_prefix AS contbr_prefix,
disclosure.fec_fitem_sched_a.contbr_nm AS contbr_nm,
disclosure.fec_fitem_sched_a.cmte_tp AS cmte_tp,
disclosure.fec_fitem_sched_a.org_tp AS org_tp,
disclosure.fec_fitem_sched_a.cmte_dsgn AS cmte_dsgn,
disclosure.fec_fitem_sched_a.contbr_nm_first AS contbr_nm_first,
disclosure.fec_fitem_sched_a.contbr_m_nm AS contbr_m_nm,
disclosure.fec_fitem_sched_a.contbr_nm_last AS contbr_nm_last,
disclosure.fec_fitem_sched_a.contbr_suffix AS contbr_suffix,
disclosure.fec_fitem_sched_a.contbr_st1 AS contbr_st1,
disclosure.fec_fitem_sched_a.contbr_st2 AS contbr_st2,
disclosure.fec_fitem_sched_a.contbr_city AS contbr_city,
disclosure.fec_fitem_sched_a.contbr_st AS contbr_st,
disclosure.fec_fitem_sched_a.contbr_zip AS contbr_zip,
disclosure.fec_fitem_sched_a.contbr_employer AS contbr_employer,
disclosure.fec_fitem_sched_a.contbr_occupation AS contbr_occupation,
disclosure.fec_fitem_sched_a.clean_contbr_id AS clean_contbr_id,
disclosure.fec_fitem_sched_a.receipt_tp AS receipt_tp,
disclosure.fec_fitem_sched_a.receipt_tp_desc AS receipt_tp_desc,
disclosure.fec_fitem_sched_a.receipt_desc AS receipt_desc,
disclosure.fec_fitem_sched_a.memo_cd AS memo_cd,
disclosure.fec_fitem_sched_a.memo_cd_desc AS memo_cd_desc,
disclosure.fec_fitem_sched_a.contb_receipt_dt AS contb_receipt_dt,
disclosure.fec_fitem_sched_a.contb_receipt_amt AS contb_receipt_amt,
disclosure.fec_fitem_sched_a.contb_aggregate_ytd AS contb_aggregate_ytd,
disclosure.fec_fitem_sched_a.cand_id AS cand_id,
disclosure.fec_fitem_sched_a.cand_nm AS cand_nm,
disclosure.fec_fitem_sched_a.cand_nm_first AS cand_nm_first,
disclosure.fec_fitem_sched_a.cand_nm_last AS cand_nm_last,
disclosure.fec_fitem_sched_a.cand_m_nm AS cand_m_nm,
disclosure.fec_fitem_sched_a.cand_prefix AS cand_prefix,
disclosure.fec_fitem_sched_a.cand_suffix AS cand_suffix,
disclosure.fec_fitem_sched_a.cand_office AS cand_office,
disclosure.fec_fitem_sched_a.cand_office_desc AS cand_office_desc,
disclosure.fec_fitem_sched_a.cand_office_st AS cand_office_st,
disclosure.fec_fitem_sched_a.cand_office_st_desc AS cand_office_st_desc,
disclosure.fec_fitem_sched_a.cand_office_district AS cand_office_district,
disclosure.fec_fitem_sched_a.conduit_cmte_id AS conduit_cmte_id,
disclosure.fec_fitem_sched_a.conduit_cmte_nm AS conduit_cmte_nm,
disclosure.fec_fitem_sched_a.conduit_cmte_st1 AS conduit_cmte_st1,
disclosure.fec_fitem_sched_a.conduit_cmte_st2 AS conduit_cmte_st2,
disclosure.fec_fitem_sched_a.conduit_cmte_city AS conduit_cmte_city,
disclosure.fec_fitem_sched_a.conduit_cmte_st AS conduit_cmte_st,
disclosure.fec_fitem_sched_a.conduit_cmte_zip AS conduit_cmte_zip,
disclosure.fec_fitem_sched_a.donor_cmte_nm AS donor_cmte_nm,
disclosure.fec_fitem_sched_a.national_cmte_nonfed_acct AS national_cmte_nonfed_acct,
disclosure.fec_fitem_sched_a.election_tp AS election_tp,
disclosure.fec_fitem_sched_a.election_tp_desc AS election_tp_desc,
disclosure.fec_fitem_sched_a.fec_election_tp_desc AS fec_election_tp_desc,
disclosure.fec_fitem_sched_a.fec_election_yr AS fec_election_yr,
disclosure.fec_fitem_sched_a.action_cd AS action_cd,
disclosure.fec_fitem_sched_a.action_cd_desc AS action_cd_desc,
disclosure.fec_fitem_sched_a.schedule_type_desc AS schedule_type_desc,
disclosure.fec_fitem_sched_a.pg_date AS pg_date,
disclosure.fec_fitem_sched_a.orig_sub_id AS orig_sub_id,
disclosure.fec_fitem_sched_a.back_ref_tran_id AS back_ref_tran_id,
disclosure.fec_fitem_sched_a.back_ref_sched_nm AS back_ref_sched_nm,
disclosure.fec_fitem_sched_a.filing_form AS filing_form,
disclosure.fec_fitem_sched_a.link_id AS link_id,
disclosure.fec_fitem_sched_a.contributor_name_text AS contributor_name_text,
disclosure.fec_fitem_sched_a.contributor_employer_text AS contributor_employer_text,
disclosure.fec_fitem_sched_a.contributor_occupation_text AS contributor_occupation_text,
disclosure.fec_fitem_sched_a.is_individual AS is_individual,
disclosure.fec_fitem_sched_a.memo_text AS memo_text,
disclosure.fec_fitem_sched_a.two_year_transaction_period AS two_year_transaction_period,
disclosure.fec_fitem_sched_a.schedule_type AS schedule_type,
disclosure.fec_fitem_sched_a.increased_limit AS increased_limit,
disclosure.fec_fitem_sched_a.sub_id AS sub_id,
disclosure.fec_fitem_sched_a.pdf_url AS pdf_url,
disclosure.fec_fitem_sched_a.line_number_label AS line_number_label
FROM disclosure.fec_fitem_sched_a
WHERE disclosure.fec_fitem_sched_a.cmte_id IN ('C00415331')
AND disclosure.fec_fitem_sched_a.two_year_transaction_period IN (2020)
ORDER BY disclosure.fec_fitem_sched_a.contb_receipt_dt ,
disclosure.fec_fitem_sched_a.contb_receipt_dt ,
disclosure.fec_fitem_sched_a.sub_id ) AS anon_1
left outer join ofec_committee_history_mv AS ofec_committee_history_mv_1 ON anon_1.cmte_id = ofec_committee_history_mv_1.committee_id
AND
anon_1.two_year_transaction_period = ofec_committee_history_mv_1.CYCLE left outer join ofec_committee_history_mv AS ofec_committee_history_mv_2 ON anon_1.clean_contbr_id = ofec_committee_history_mv_2.committee_id
AND
anon_1.two_year_transaction_period = ofec_committee_history_mv_2.CYCLE

ORDER BY anon_1.contb_receipt_dt , anon_1.sub_id ;

Analysis/solution

-- drop INDEX idx_sched_a_2019_2020_cmte_id_dt_sub_id from

CREATE INDEX idx_sched_a_2019_2020_cmte_id_dt_sub_id
ON disclosure.fec_fitem_sched_a_2019_2020 USING btree
(cmte_id COLLATE pg_catalog."default" ASC NULLS LAST, contb_receipt_dt ASC NULLS LAST, sub_id ASC NULLS LAST)
TABLESPACE pg_default;

to

CREATE INDEX idx_sched_a_2019_2020_cmte_id_dt_sub_id_desc
ON disclosure.fec_fitem_sched_a_2019_2020 USING btree
(cmte_id COLLATE pg_catalog."default" ASC NULLS LAST, contb_receipt_dt DESC NULLS LAST, sub_id DESC NULLS LAST)
TABLESPACE pg_default;

improve about 2+ times based on the data.

@dzhang-fec
Copy link
Contributor Author

Jean and I had an discussion to do this on these 2 2019_2020 and 2017_2018 firstly.
This has been applied to prod, dev. and stage.

CREATE INDEX idx_sched_a_2019_2020_cmte_id_dt_sub_id_desc
ON disclosure.fec_fitem_sched_a_2019_2020 USING btree
(cmte_id COLLATE pg_catalog."default" ASC NULLS LAST, contb_receipt_dt DESC NULLS LAST, sub_id DESC NULLS LAST)
TABLESPACE pg_default;

CREATE INDEX idx_sched_a_2017_2018_cmte_id_dt_sub_id_desc
ON disclosure.fec_fitem_sched_a_2017_2018 USING btree
(cmte_id COLLATE pg_catalog."default" ASC NULLS LAST, contb_receipt_dt DESC NULLS LAST, sub_id DESC NULLS LAST)
TABLESPACE pg_default;

drop INDEX disclosure.idx_sched_a_2017_2018_cmte_id_dt_sub_id;
drop INDEX disclosure.idx_sched_a_2019_2020_cmte_id_dt_sub_id;

@fecjjeng
Copy link
Contributor

fecjjeng commented Jun 2, 2020

idx_sched_a_%_cmte_id_dt_sub_id_desc had been created in all fec_fitem_sched_a partition tables in DEV/STG/PRD environment in advance during off-peak hour. The existence of the updated index can be confirmed as described in PR #4386
However, migration file V0199 had been submit to keep the version control process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants