Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: [Spanner] make ValueMapper customizable #5700

Closed

Conversation

taka-oyama
Copy link
Contributor

@taka-oyama taka-oyama commented Dec 13, 2022

This PR makes Spanner's ValueMapper customizable.

I made this PR because I'd like to modify ValueMapper so that decodeValues return raw values instead of column types like TIMESTAMP being converted into DateTimeImmutable automatically.
Right now, there is no good way to achieve this.

I want this feature because in Laravel (the web framework I use and wrote a spanner driver for), data is expected to be passed to the model as raw values and should be converted to useful types only when it is actually used, which is more efficient since not all rows retrieved from the database are actually used during the request's lifecycle.

This feature does not introduce any breaking changes.

Please let me know what you all think.

Thank you.

@taka-oyama taka-oyama requested review from a team as code owners December 13, 2022 09:33
@google-cla
Copy link

google-cla bot commented Dec 13, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@conventional-commit-lint-gcf
Copy link

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

@taka-oyama taka-oyama force-pushed the feature/custom-value-mapper branch 2 times, most recently from 121dc00 to 699547a Compare December 16, 2022 05:08
@taka-oyama
Copy link
Contributor Author

Rebased to fix conflict.

CLA signing is going through the corporate pipeline and should be done by next week.

@bshaffer
Copy link
Contributor

Hi @taka-oyama ! Thanks for your contribution. This looks great, but I have a few questions.

I made this PR because I'd like to modify ValueMapper so that decodeValues return raw values instead of column types like TIMESTAMP being converted into DateTimeImmutable automatically.

Is there a reason why you can't use DateTimeImmutable to convert back to the raw TIMESTAMP? Are there other examples of this?

@taka-oyama
Copy link
Contributor Author

taka-oyama commented Dec 19, 2022

Hi @bshaffer, thanks for the quick reply.

Is there a reason why you can't use DateTimeImmutable to convert back to the raw TIMESTAMP?

I could do that, and I plan on doing so if this doesn't work out, but I wanted to make an effort to make it run as optimal as possible, since we have had incidents in the past where unnecessary DateTime conversion was using up 10% of the total CPU in some projects.

Are there other examples of this?

By "this", did you mean an example where a custom ValueMapper might be needed?

I haven't encountered any cases for any of the projects I've encountered, but I think the same could be said for DATE type which is also converted to DateTimeImmutable and for BYTE type which are decoded and converted into streams.

By the way, I noticed there was a similar request in google-cloud-go (googleapis/google-cloud-go#854) which was accepted.

Copy link
Contributor

@bshaffer bshaffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taka-oyama thank you for your prompt response!
I am shocked that DateTime processing would take up 10% of the total CPU, but perhaps processing thousands of datetime fields could have this effect?

Thank you for providing the link to the Go PR as well, that is very helpful.

I'm placing a code sample here so others can see how this option would be used in practice:

$spanner = new SpannerClient();
$instance = $spanner->instance('instance-name');
$database = $instance->database('database-name', ['valueMapper' => $myCustomDataMapper]);

Spanner/src/Database.php Outdated Show resolved Hide resolved
Spanner/src/Instance.php Outdated Show resolved Hide resolved
Spanner/tests/Unit/OperationTest.php Show resolved Hide resolved
@taka-oyama taka-oyama force-pushed the feature/custom-value-mapper branch 2 times, most recently from efea8a9 to 8827eff Compare December 20, 2022 03:44
@taka-oyama
Copy link
Contributor Author

taka-oyama commented Jan 10, 2023

Hi, sorry for the long wait.

The CLA is now passing.

I am shocked that DateTime processing would take up 10% of the total CPU, but perhaps processing thousands of datetime fields could have this effect?

Yes, this happened when the code was processing a few thousand rows with many datetime columns.

I believe cases like this happen because most web frameworks just read columns with * by default and map all values to a model. This makes managing data easier but at the cost of performance. By adding this change, it will allow us to modify the value mapper so that reading unused columns will put less stress on performance.

Copy link
Contributor

@bshaffer bshaffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is LGTM from me, but I'd like to get @dwsupplee and @saranshdhingra's thoughts before merging!

@vishwarajanand
Copy link
Contributor

vishwarajanand commented Jan 28, 2023

@taka-oyama thanks for raising this PR and fixing all outstanding comments. Upon discussion we are convinced about the reasoning for the feature requirement. However, we have reservations with the current changes, some of them are below:

  1. ValueMapperInterface isn't very easy to implement for a developer as-is. It was meant to be an internal class and not exposed externally. Making it public will mean we will have to support it for back-compat (Google's other language libraries also don't seem to expose it). It also confuses users with a similar name class Spanner/src/ValueInterface.php.

  2. Alternatively, we want to do something like BigQuery's returnRawResults flag. code pointer. And it will take a complete different implementation approach, so we want to close this PR for now and move this to a feature request issue.

@taka-oyama
Copy link
Contributor Author

Alternatively, we want to do something like BigQuery's returnRawResults flag. code pointer. And it will take a complete different implementation approach, so we want to close this PR for now and move this to a feature request issue.

That works for me.

@yash30201
Copy link
Contributor

Created a feature request issue for this (#5838) and thus closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants