[YSQL] Enable Batch Transaction on COPY FROM STDIN #6069

Closed
emhna opened this issue Oct 15, 2020 · 0 comments

emhna commented Oct 15, 2020

Background

In the Postgres layer, the TransactionState structure carries a flag called ybDataSent.
When the output buffer is flushed, the flag is set to true, marking that a transparent restart is no longer possible.

In the CopyFrom() method, before any rows are copied, the ybDataSent flag is also used to check whether the current query is running inside another transaction block (i.e. a nested transaction).
If so, an error is thrown so that the rows_per_transaction option cannot be used.
This prevents previous transactions from being committed while batch commits are running.
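
The check described above can be sketched roughly as follows. This is a simplified model, not the actual CopyFrom() code: `DemoTransactionState` and `demo_check_batch_copy` are hypothetical names, and treating `rows_per_transaction > 0` as "batching requested" is an assumption. Only the flag name and the error text come from this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transaction state; only the flag matters here. */
typedef struct {
    bool ybDataSent;
} DemoTransactionState;

/*
 * Returns NULL when batched COPY is allowed, or the error text when the
 * flag suggests the query is running inside another transaction block.
 * Assumption: rows_per_transaction > 0 means batching was requested.
 */
static const char *
demo_check_batch_copy(const DemoTransactionState *xact, int rows_per_transaction)
{
    assert(xact != NULL);
    if (rows_per_transaction > 0 && xact->ybDataSent)
        return "ROWS_PER_TRANSACTION option is not supported in nested transaction";
    return NULL;
}
```

In this model the check only looks at the flag, so anything that sets ybDataSent before the check runs makes the query look nested.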

Problem

Today, when a COPY FROM query reads from STDIN, it runs the ReceiveCopyBegin() function to flush the output buffer and let the front end know it can start sending data.
As a side effect of that flush, the ybDataSent flag is set to true even though the query is not inside a nested transaction, so the check in CopyFrom() rejects the rows_per_transaction option.

E.g.

create table t (a int);
copy t from stdin with (rows_per_transaction 10); 
>> ERROR: ROWS_PER_TRANSACTION option is not supported in nested transaction

Solution

The flag should be explicitly turned off when the query is not inside a nested transaction.
Before executing the ReceiveCopyBegin() function, we should store the current state of ybDataSent to determine whether we are inside a nested transaction.
After executing the flush,

  • keep the flag turned on (true) if inside a transaction block
  • switch the flag off (false) if outside a transaction block.
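
A minimal sketch of this save-and-restore idea, under stated assumptions: `DemoTransactionState`, `demo_receive_copy_begin`, and `demo_begin_copy_from_stdin` are hypothetical stand-ins, and the real ReceiveCopyBegin() does much more than set a flag. The sketch only models how the flag value seen before the flush decides its value afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the transaction state; only the flag matters here. */
typedef struct {
    bool ybDataSent;
} DemoTransactionState;

/* Models the side effect of ReceiveCopyBegin(): flushing output sets the flag. */
static void
demo_receive_copy_begin(DemoTransactionState *xact)
{
    xact->ybDataSent = true;
}

/*
 * Remember the flag before the flush, run the flush, then put the flag back.
 * Returns true when the flag was already set before the flush, which this
 * issue treats as "inside a transaction block".
 */
static bool
demo_begin_copy_from_stdin(DemoTransactionState *xact)
{
    assert(xact != NULL);
    bool data_sent_before = xact->ybDataSent;

    demo_receive_copy_begin(xact);

    /* Stays true if it was true before, is switched back off otherwise. */
    xact->ybDataSent = data_sent_before;
    return data_sent_before;
}
```

With the flag restored, a standalone COPY FROM STDIN reaches the CopyFrom() check with ybDataSent still false, while a COPY issued after earlier output in the same transaction block keeps it true.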

emhna self-assigned this Oct 15, 2020
emhna added the area/ysql Yugabyte SQL (YSQL) label Oct 15, 2020
emhna added a commit that referenced this issue Oct 19, 2020
…n option.

Summary:
**Background**
In the Postgres layer, the `TransactionState` structure carries a flag called `ybDataSent`.
When the output buffer is flushed, the flag is set to true, marking that a transparent restart is no longer possible.

In the `CopyFrom()` method, before any rows are processed, the `ybDataSent` flag is used to check whether the current query is running inside another transaction block (i.e. a nested transaction).
If so, an error is thrown so that the `rows_per_transaction` option cannot be used.
This prevents previous transactions from being committed while batch commits are running.

**Problem**
Today, when a COPY FROM query reads from STDIN, it runs the `ReceiveCopyBegin()` function to flush the output buffer and let the front end know it can start sending data.
As a side effect of that flush, the `ybDataSent` flag is set to true even though the query is not inside a nested transaction, so the check in `CopyFrom()` rejects the `rows_per_transaction` option.

E.g.

```
create table t (a int);
copy t from stdin with (rows_per_transaction 10);
>> ERROR: ROWS_PER_TRANSACTION option is not supported in nested transaction

```
**Solution**
The flag should be explicitly turned off when the query is not inside a nested transaction.
Before executing the `ReceiveCopyBegin()` function, we should store the current state of `ybDataSent` to determine whether we are inside a nested transaction.
After executing the flush,
  - keep the flag turned on (true) if inside a transaction block
  - switch the flag off (false) if outside a transaction block.

Test Plan: Added more test cases under TestBatchCopyFrom.

Reviewers: jason, mihnea

Reviewed By: mihnea

Subscribers: zyu, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D9648
emhna added a commit that referenced this issue Oct 26, 2020
…ws_per_transaction option.

Test Plan: Jenkins: rebase: 2.2

Reviewers: jason, mihnea

Reviewed By: mihnea

Subscribers: yql, zyu

Differential Revision: https://phabricator.dev.yugabyte.com/D9699
emhna closed this as completed Oct 26, 2020