feat: [Spanner] make ValueMapper customizable #5700
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use GitHub's automerge feature with squashing, or use -- conventional-commit-lint bot
Rebased to fix a conflict. CLA signing is going through the corporate pipeline and should be done by next week.
Hi @taka-oyama! Thanks for your contribution. This looks great, but I have a few questions.
Is there a reason why you can't use
Hi @bshaffer, thanks for the quick reply.
I could do that, and I plan to if this doesn't work out, but I wanted to make an effort to make it run as efficiently as possible, since we have had incidents in the past where unnecessary DateTime conversion was using up 10% of the total CPU in some projects.
By "this", did you mean an example where a custom ValueMapper might be needed? I haven't encountered any such cases in the projects I've worked on, but I think the same could be said for the DATE type, which is also converted to
By the way, I noticed there was a similar request in google-cloud-go (googleapis/google-cloud-go#854) which was accepted.
@taka-oyama thank you for your prompt response!
I am shocked that DateTime processing would take up 10% of the total CPU, but perhaps processing thousands of datetime fields could have this effect?
Thank you for providing the link to the Go PR as well, that is very helpful.
I'm placing a code sample here so others can see how this option would be used in practice:
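use Google\Cloud\Spanner\SpannerClient;

// $myCustomDataMapper is a custom ValueMapper instance; 'valueMapper' is the option proposed in this PR.
$spanner = new SpannerClient();
$instance = $spanner->instance('instance-name');
$database = $instance->database('database-name', ['valueMapper' => $myCustomDataMapper]);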
Hi, sorry for the long wait. The CLA is now passing.
Yes, this happened when the code was processing a few thousand rows with many datetime columns. I believe cases like this happen because most web frameworks just read columns with
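To make the cost described above concrete, here is a minimal, self-contained sketch that compares converting every timestamp string up front with converting only the values that are actually read. The row count, column count, sample timestamp, and the "first column of the first 50 rows" access pattern are all assumptions chosen for illustration; this is not the Spanner client's code and it makes no claim about the 10% figure.

```php
<?php
// Hypothetical data: RFC 3339-style timestamp strings standing in for raw
// TIMESTAMP values. These are made-up values, not real query results.
$rowCount = 2000;
$columnCount = 10;
$rawTimestamp = '2023-01-15T10:30:00.123456Z';
$rows = array_fill(0, $rowCount, array_fill(0, $columnCount, $rawTimestamp));

// Eager strategy: convert every timestamp in every row up front.
$start = hrtime(true);
$eagerConversions = 0;
foreach ($rows as $row) {
    foreach ($row as $value) {
        $converted = new DateTimeImmutable($value);
        $eagerConversions++;
    }
}
$eagerNs = hrtime(true) - $start;

// Selective strategy: convert only what the request reads, assumed here to be
// the first column of the first 50 rows.
$start = hrtime(true);
$selectiveConversions = 0;
foreach (array_slice($rows, 0, 50) as $row) {
    $converted = new DateTimeImmutable($row[0]);
    $selectiveConversions++;
}
$selectiveNs = hrtime(true) - $start;

// Timings vary by machine and PHP version, so only the counts are meaningful
// as a stable comparison.
printf("eager: %d conversions in %.2f ms\n", $eagerConversions, $eagerNs / 1e6);
printf("selective: %d conversions in %.2f ms\n", $selectiveConversions, $selectiveNs / 1e6);
```

The absolute timings depend on the environment; the point of the sketch is only that the eager path does work proportional to rows times columns, regardless of how many values are used.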
This is LGTM from me, but I'd like to get @dwsupplee and @saranshdhingra's thoughts before merging!
@taka-oyama thanks for raising this PR and addressing all outstanding comments. After discussion, we are convinced by the reasoning behind this feature request. However, we have reservations about the current changes, some of which are listed below:
That works for me.
I created a feature request issue for this (#5838) and am therefore closing this PR.
This PR makes Spanner's ValueMapper customizable.
I made this PR because I'd like to modify ValueMapper so that decodeValues returns raw values, instead of column types like TIMESTAMP being automatically converted into DateTimeImmutable. Right now, there is no good way to achieve this.
I want this feature because in Laravel (the web framework I use and wrote a Spanner driver for), data is expected to be passed to the model as raw values and converted to useful types only when it is actually used. This is more efficient, since not all rows retrieved from the database are actually used during the request's lifecycle.
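The "convert only when used" idea can be sketched as follows. This is a hedged illustration, not Laravel's or the Spanner client's implementation: `LazyRow`, its `get()` method, the `conversions` counter, and the column names are all hypothetical names introduced for this example.

```php
<?php
// Hypothetical model-like wrapper: keeps the raw values as received and turns
// a timestamp string into a DateTimeImmutable only on first access.
final class LazyRow
{
    /** @var array<string, mixed> Raw column values, exactly as received. */
    private array $raw;

    /** @var array<string, int> Timestamp column names, flipped for fast lookup. */
    private array $timestampColumns;

    /** @var array<string, DateTimeImmutable> Cache of already converted values. */
    private array $converted = [];

    /** Number of conversions actually performed, exposed for illustration. */
    public int $conversions = 0;

    public function __construct(array $raw, array $timestampColumns)
    {
        $this->raw = $raw;
        $this->timestampColumns = array_flip($timestampColumns);
    }

    /** @return mixed The raw value, or a DateTimeImmutable for timestamp columns. */
    public function get(string $column)
    {
        if (!array_key_exists($column, $this->raw)) {
            throw new OutOfBoundsException("Unknown column: {$column}");
        }
        $value = $this->raw[$column];
        if ($value === null || !isset($this->timestampColumns[$column])) {
            return $value; // Non-timestamp or NULL values pass through untouched.
        }
        if (!isset($this->converted[$column])) {
            $this->converted[$column] = new DateTimeImmutable($value);
            $this->conversions++;
        }
        return $this->converted[$column];
    }
}

$row = new LazyRow(
    [
        'id' => 1,
        'name' => 'example',
        'created_at' => '2023-01-15T10:30:00.123456Z',
        'updated_at' => '2023-02-01T08:00:00Z',
    ],
    ['created_at', 'updated_at']
);

$name = $row->get('name');            // No conversion happens here.
$createdAt = $row->get('created_at'); // Converted on first access.
$row->get('created_at');              // Served from the cache.
echo $createdAt->format('Y-m-d'), PHP_EOL;
```

In this sketch `updated_at` is never read, so it is never parsed, which is the saving the description above is after when rows arrive as raw values.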
This feature does not introduce any breaking changes.
Please let me know what you all think.
Thank you.