[YSQL] Improve index scan to batch select rows from base table #3103
Comments
nocaway added a commit that referenced this issue on Feb 25, 2020
Summary:

**Background**: Consider the query

`SELECT <data> FROM <table> WHERE ybctid IN (SELECT base_ybctid FROM <index>)`

Previously, after getting ybctids from IndexTable, the ybctid values were sent to the tablet server one at a time, selecting one row at a time from storage. To improve performance, ybctid values are now sent in batches.

**IndexScan's new process**
- Fetch ybctids from IndexTable in batches. The size of each batch is determined by the yql_prefetch_limit gflag.
- Group the selected ybctids by the hash-code buckets associated with the table's tablets.
- Send the ybctid batches to the tablet servers to query rows in batches.
- Repeat the above steps until all ybctids have been read from the index.

**Implementation**

(1) New data structures

Two different SELECT classes are introduced, PgSelect and PgSelectIndex. PgSelect queries data from a table. PgSelectIndex queries ybctids from IndexTable.
- Sequential and PrimaryKey scans use PgSelect to query data from the table.
- IndexOnlyScan uses PgSelectIndex to query data from IndexTable.
- IndexScan uses both PgSelect and PgSelectIndex.

When index-scanning system catalogs, PgSelect and PgSelectIndex combine their read requests into one. This is how it has been done in the past.

(2) Parallel processing of ybctids when querying data
- For each partition, one read request is created.
- Ybctid values within a partition are added to their associated read request.
- The requests are sent to select data.
- PgGate also keeps track of the order of the ybctid values and returns the rows to the user in the same order as the index order.

Test Plan: Testing is in progress
Reviewers: mihnea
Reviewed By: mihnea
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D7952
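The batching flow described in this commit message can be illustrated with a small Python sketch. This is not the PgGate implementation (which is C++); every name here (`batches`, `partition_of`, `index_scan`, `read_partition`) is hypothetical, the CRC-based partition function is only a stand-in for the real hash-code-to-tablet mapping, and the sketch assumes each per-partition read returns rows in the same order as the ybctids it was given.

```python
import zlib
from collections import defaultdict
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence


def batches(items: Iterable[bytes], size: int) -> Iterator[List[bytes]]:
    """Yield consecutive chunks of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def partition_of(ybctid: bytes, num_partitions: int) -> int:
    """Hypothetical stand-in for mapping a ybctid to a tablet partition."""
    return zlib.crc32(ybctid) % num_partitions


def index_scan(
    index_ybctids: Iterable[bytes],
    prefetch_limit: int,
    num_partitions: int,
    read_partition: Callable[[int, Sequence[bytes]], List[bytes]],
) -> Iterator[bytes]:
    """Read base-table rows in batches while preserving index order.

    `read_partition(partition, ybctids)` is an assumed callback that issues
    one read request and returns one row per ybctid, in request order.
    """
    for batch in batches(index_ybctids, prefetch_limit):
        # Group ybctids by partition, remembering each one's position
        # in the batch so the index order can be restored afterwards.
        groups = defaultdict(list)
        for pos, ybctid in enumerate(batch):
            groups[partition_of(ybctid, num_partitions)].append((pos, ybctid))

        rows: List[bytes] = [b""] * len(batch)
        for partition, members in groups.items():
            # One read request per partition instead of one per row.
            result = read_partition(partition, [y for _, y in members])
            for (pos, _), row in zip(members, result):
                rows[pos] = row

        yield from rows
```

With a prefetch limit of 4 and 10 ybctids, this sketch issues at most one request per partition for each of the three batches, rather than 10 single-row requests.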
Fixed by eeab192
nocaway added a commit that referenced this issue on Mar 13, 2020
Summary: This regression is due to the following commit, eeab192:

```
commit eeab192
Author: neil <nocaway@users.noreply.github.com>
Date: 2020-02-24

    [YSQL] #3103 Improve performance when running index scan to query data
```

When no ybctid is found in the IndexTable, PgGate should stop processing and return an empty result set. Instead, it continues processing the SELECT from the main table and issues a read request for a full scan.

This bug only affects scenarios where no ybctid is found by the secondary-index scan. When ybctids are found, PgGate issues a read request for only those ybctids.

Test Plan: Extend test yb_perf_secondary_index_scan.sql for the reported issue.
Reviewers: kannan, mihnea
Reviewed By: kannan, mihnea
Subscribers: kannan, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D8118
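The shape of this regression can be shown with a toy Python model. It is only a sketch under stated assumptions: `tserver_read` and `fetch_base_rows` are hypothetical names, and the model assumes that a read request carrying no ybctids is interpreted as an unrestricted scan, which is what makes the early return necessary.

```python
from typing import Dict, List, Sequence


def tserver_read(table: Dict[bytes, str], ybctids: Sequence[bytes]) -> List[str]:
    """Toy storage read (hypothetical).

    Assumption: an empty ybctid list means "no restriction", so the
    request degenerates into a full scan of the table.
    """
    if not ybctids:
        return list(table.values())
    return [table[y] for y in ybctids if y in table]


def fetch_base_rows(table: Dict[bytes, str], ybctids: Sequence[bytes]) -> List[str]:
    """Fetch base-table rows for the ybctids produced by an index scan."""
    # The fix: if the index scan found nothing, return an empty result
    # instead of issuing a request that would turn into a full scan.
    if not ybctids:
        return []
    return tserver_read(table, ybctids)
```

Without the guard in `fetch_base_rows`, an index scan that matches nothing would fall through to `tserver_read` with an empty list and return every row in the table.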
Closed
Today, an index scan first scans the index table to get the rows of interest, then sends one read query per row to the tserver. We should batch these reads from the tserver instead.