-
Notifications
You must be signed in to change notification settings - Fork 798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding checksums to Jetpack_JSON_API_Sync_Status_Endpoint #12209
Conversation
Thank you for the great PR description! When this PR is ready for review, please apply the Scheduled Jetpack release: May 14, 2019. |
The initial attempt was to provide an array, but the output on the WPCOM side would always result in the output including |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. I was not able to test, because i'm not sure how to run requests to the sync/status endpoint.
I was able to test after getting instructions from @enejb in Slack. My approval stands! |
sync/class.jetpack-sync-actions.php
Outdated
self::initialize_sender(); | ||
|
||
$sync_module = Jetpack_Sync_Modules::get_module( 'full-sync' ); | ||
$queue = self::$sender->get_sync_queue(); | ||
$full_queue = self::$sender->get_full_sync_queue(); | ||
$cron_timestamps = array_keys( _get_cron_array() ); | ||
$next_cron = $cron_timestamps[0] - time(); | ||
$store = new Jetpack_Sync_WP_Replicastore(); | ||
$checksums = $store->checksum_all(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How expensive is this method? Is there a situation where we would call this endpoint but not care about the checksums?
If the answers are: "pretty expensive: it's slow" and "yes, we sometimes don't care about the checksums", then we should write this in such a way that if ?fields=
is set and does not include any of the checksum fields, we don't bother calculating the checksums.
If it's cheap or if we know we always want this information, it doesn't matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. AFAIK, the SQL queries may be rather expensive as they calculate the checksum for all posts, comments and post_meta records.
However, calling this endpoint is slow anyway, but I could try timing it before and after this change.
I'm going to follow-up on this within Poseidon to coordinate on a way forward here. Using ?fields=
is possibly a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the research and crunched the numbers. Note that this is made on my recent MBP development machine, using our Jetpack Docker image.
The relative overhead created by Jetpack_Sync_WP_Replicastore::checksum_all()
, while high for smaller instances, becomes lower and lower the higher the number of posts is.
Initially, up to 200 posts (with increments of ca 30), the overhead added ranges between 30%-45%.
However, when we get to the tens of thousands, with increments of 5000-10000, the overhead becomes closer to 5%-15%.
The absolute range of this overhead in milliseconds ranged from 0.0032 ms to 0.0070 ms, so I really don't think this adds any significant overhead.
Looks at this fancy graph!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that on the WPCOM side, I could have done a similar timing test, but I think my internet connection in combination with tunneling services that we use such as Pagekite and ngrok add too much of a response lag to make be able to trust such tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aldavigdis
I don’t understand how the as the numbers of posts increases the call with the non checksums also gets slower.
Do you have an explanation for that?
Is it also possible to remove any dependence on the network in this performance tests. Since that can have a lot of variance in there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was all done within the Jetpack_Sync_Actions::get_sync_status
function, the diff is here:
diff --git a/sync/class.jetpack-sync-actions.php b/sync/class.jetpack-sync-actions.php
index 79c47b9c1..7f229342a 100644
--- a/sync/class.jetpack-sync-actions.php
+++ b/sync/class.jetpack-sync-actions.php
@@ -440,6 +440,7 @@ class Jetpack_Sync_Actions {
}
static function get_sync_status() {
+ $time_start = microtime( true );
require_once JETPACK__PLUGIN_DIR . 'sync/class.jetpack-sync-wp-replicastore.php';
self::initialize_sender();
@@ -451,7 +452,13 @@ class Jetpack_Sync_Actions {
$store = new Jetpack_Sync_WP_Replicastore();
$checksums = $store->checksum_all();
+ $time_taken_before_sync = microtime( true ) - $time_start;
+
$full_sync_status = ( $sync_module ) ? $sync_module->get_status() : array();
+
+ $time_taken = microtime( true ) - $time_start;
+ l( wp_count_posts()->publish . ' ' . $time_taken . ' ' . $time_taken_before_sync );
+
return array_merge(
$full_sync_status,
array(
If there is a better way to do this without too much hassle, then I'm all ears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice fancy graph!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, I have done this as of ac1d2f8.
The checksums are not included and not crunched by the helper method unless those keys are specifically requested using the fields
parameter.
Example:
/sites/site_id/sync/status?fields=posts_checksum,comments_checksum,post_meta_checksum,comment_meta_checksum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests well for me, code also looks good 👍
Left a non-blocking question, but am also curious to see how the checksums affect the performance.
…rom comment_meta_checksum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some nits, but this looks as good as before.
Would be nice to make the checksums optionally requested with a query argument or something as @mdawaffe suggested - if there are cases where they make things 50% slower, it's an improvement worthy to be considered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still tests well for me 👍 Would be nice to add a test or update an existing one with the checksum fields.
I have a concern about casting the checksums for post/comment meta - that might lead to unexpected results if an error occurs during checksum calculation. Seems like we should account for that, just in case.
Also, I wasn't sure if we're planning to make the checksums optionally triggered by a query arg like @mdawaffe suggested - is that in the plans for this PR, @aldavigdis?
…r parameter. Updating docs.
json-endpoints/jetpack/class.jetpack-json-api-sync-endpoint.php
Outdated
Show resolved
Hide resolved
sync/class.jetpack-sync-actions.php
Outdated
$checksums = array(); | ||
|
||
if ( is_null( $fields ) === false ) { | ||
$fields_params = explode( ',', $fields ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the preparation (exploding, potentially trimming) happen in the endpoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really just a helper function. Messy but it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick note. As @tyxla suggested, I think it would be nice to extend our tests to cover this.
Co-Authored-By: Jeremy Herve <jeremy@tagada.hu>
There don't seem to be any tests for |
…c/jetpack into add/site-audit-simple-checksum
@@ -439,7 +439,7 @@ static function cleanup_on_upgrade( $new_version = null, $old_version = null ) { | |||
} | |||
} | |||
|
|||
static function get_sync_status() { | |||
static function get_sync_status( $fields = null ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if fields was always an array. instead of sometimes an array and sometimes a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a helper function. Query params need to be strings anyway and I think this is the most straightforward way to handle things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also support both array
and string
(like many other WP core functions and methods do), but we should make sure that's properly supported by adding specific tests for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say we don't need that because this is just a helper function used in a single JSON API response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we never really using anything but a string.
$this->assertArrayNotHasKey( 'post_meta_checksum', $no_checksum ); | ||
$this->assertArrayNotHasKey( 'comment_meta_checksum', $no_checksum ); | ||
|
||
$kitchen_sink_checksum = Jetpack_Sync_Actions::get_sync_status( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it would be a good way to split this test to separate cases (we're currently testing several cases in the same test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see nothing wrong with that.
$this->assertArrayHasKey( 'post_meta_checksum', $post_meta ); | ||
$this->assertArrayNotHasKey( 'comment_meta_checksum', $post_meta ); | ||
|
||
$comment_meta = Jetpack_Sync_Actions::get_sync_status( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add some test cases for other sync status keys, other than the checksum fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. In this case, it's just testing a specific concept and while the test is long, I just wish there was a simpler way to write it. :)
@@ -439,7 +439,7 @@ static function cleanup_on_upgrade( $new_version = null, $old_version = null ) { | |||
} | |||
} | |||
|
|||
static function get_sync_status() { | |||
static function get_sync_status( $fields = null ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also support both array
and string
(like many other WP core functions and methods do), but we should make sure that's properly supported by adding specific tests for that.
…c/jetpack into add/site-audit-simple-checksum
$this->assertArrayNotHasKey( 'post_meta_checksum', $no_checksum ); | ||
$this->assertArrayNotHasKey( 'comment_meta_checksum', $no_checksum ); | ||
|
||
$kitchen_sink_checksum = Jetpack_Sync_Actions::get_sync_status( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about one test per Jetpack_Sync_Actions::get_sync_status
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple assertions in a single test is something I've considered to be fine until now.
As stated before, it's just testing a single concept. I'd love to have a simpler way to write this while maintaining human-readability, but PHPUnit doesn't seem to allow it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With or without moving the tests to separate methods, this tests well and LGTM 🚢
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good and tests well on my end. 👍
* Kick off the changelog * Add 7.3.1 * Update date and post link * changelog: add #12219 * changelog: add #12170 * changelog: add #12184 * Changelog: add #12268 * Changelog: add #12081 * Changelog: add #12323 * Changelog: add #12204 * Changelog: add #12269 * Changelog: add #12332 * changelog: add #12339 * changelog: add #12209 * Changelog: add #12319 * Changelog: add #12357 * Changelog: add #12124 * Changelog: add #12373 * Changelog: add #12252 * Changelog: add #12383 * Changelog: add #12372 * changelog: add #12337 * Changelog: add #12290 * Changelog: add #12301 * Changelog: add #12061 * Testing list: add instructions for #12061 * Changelog: add #12393 * Update minimum supported version See #12287 * Changelog: add #12406 * Testing list: add #12406 * Changelog: add #12277 * Changelog: add #12412 * Changelog: add #11318 * Changelog: add #12328 * Changelog: add #12425 * Changelog: add #12380 * Changelog: add #12428 * Changelog: add #12414 * Changelog: add #12395 * Changelog & Testing list: add #12416, #12417, #12418, and #12348 * changelog: add #12379 * Changelog: add #12341 * changelog: add #12444 * Changelog: add #12434 * Changelog: add #12454 * Changelog: add #12460 * Changelog: add #12463 * Changelog: add #12457 * Changelog / testing list: add #10333 * Changelog: add #12467 Co-authored-by: Jeremy Herve <jeremy@jeremy.hu>
Changes proposed in this Pull Request:
This PR adds the following attributes to the JSON API Sync Status Endpoint:
posts_checksum
comments_checksum
post_meta_checksum
comment_meta_checksum
Those values are derived from
Jetpack_Sync_WP_Replicastore::checksum_all()
.Testing instructions:
Does the
/sites/%s/sync/status
endpoint return the attributes listed above?Does it match with the schema provided in
json-endpoints/jetpack/json-api-jetpack-endpoints.php:548
?Proposed changelog entry for your changes:
Posts, Comments, Post Meta and Comment Meta checksums are now provided via the Sync Status Endpoint.