Skip to content
This repository has been archived by the owner on Jan 17, 2023. It is now read-only.

Add an index and revise a select related to deleting all shots #3629

Merged
merged 7 commits into from
Oct 18, 2017

Conversation

chenba
Copy link
Collaborator

@chenba chenba commented Oct 12, 2017

Would be nice to get some before and after numbers on an environment with a lot of data.

Fix #3568

@ianb
Copy link
Contributor

ianb commented Oct 12, 2017

It is possible to load up your local database some, with npm run fill_database -- --times 1000 (I think that will create 1000 accounts, 50 shots each – you can play around with the command by looking at package.json). But you need a lot of accounts to start seeing the performance problem.

Copy link
Contributor

@ianb ianb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident that this will fix the query.

But, we now store the accountid in req.accountId, and have changed a bunch of these queries to be something like Shot.deleteEverythingForDevice = function(backend, deviceId, accountId) {...}, and then if accountId is not null/undefined, we use that to query for all the other deviceIds we should delete.

@chenba
Copy link
Collaborator Author

chenba commented Oct 12, 2017

Hmm I was only optimizing against Postgres's EXPLAIN output on those queries so the delete all request won't time out. Upon further review, what that function is doing doesn't seem correct.

It deletes the image data for a given deviceId. Then it deletes the data records for all the devices associated to the same account as the given deviceId, which cascades to images deletes. But it does this without deleting the image data for those other devices. That creates orphaned image data.

@ianb Are you suggesting that (a) we delete all images for an account if req.accountId is not null/undefined, but only delete images for the deviceId when there's no req.accountId, or (b) we would still use the given deviceId to query for other devices when req.accountId is not present?

@chenba
Copy link
Collaborator Author

chenba commented Oct 13, 2017

Shot.deleteEverythingForDevice now will use the account id to find the device ids when it's present. Otherwise it queries for device ids in the same fashion as before. The image ids that are used to delete image data are now queried with all the found device ids instead of only the deviceId from the params.

ianb
ianb previously requested changes Oct 16, 2017
Copy link
Contributor

@ianb ianb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deviceIdSelect/promise part should be changed a bit

WHERE devices.accountid = $1`,
[accountId]);
} else {
deviceIdsSelect = db.select(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we know exactly what the deviceIds will be: [deviceId]. So probably what it should be is something like:

if (accountId) {
  deviceIdsSelect = db.select(...).then((rows) => {
    return rows.map((row) => row.deviceId);
  });
} else {
  deviceIdsSelect = Promise.resolve([deviceId]);
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify/confirm: this changes the previous behavior when shots of one device is deleted, if the device is associated to an account (but not authenticated thus no accountId), then the shots of the other devices that are associated to that account will also be deleted. After this patch, only the shots of the given device will be deleted.

Copy link
Contributor

@ianb ianb Oct 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? That doesn't seem like what I'm reading here. This query seems to get images from all devices:

return db.select(
     `SELECT images.id
      FROM images JOIN data
      ON images.shotid = data.id
      WHERE data.deviceid IN (${db.markersForArgs(1, deviceIds.length)})`,
     deviceIds);

Or at least all devices listed in deviceIds...?

Note the change I'm proposing is that when accountId is null (and so there are no other devices), then we just use deviceIds = [deviceId] (more or less)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are reading that query correctly. But if we use deviceIds = [deviceId], then there's only one device id in IN. Currently, the device ids come from https://github.com/mozilla-services/screenshots/blob/master/server/src/servershot.js#L634 regardless of the user's authentication state, and that could returns multiple device ids.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If accountId is null then devices.accountid is also null, and there aren't any associated devices. At login or account connect time we set accountid, so we can trust that if it's null then there aren't any other associated deviceIds.

@chenba chenba dismissed ianb’s stale review October 16, 2017 22:27

Made the suggested change.

@ianb ianb merged commit f183b3c into mozilla-services:master Oct 18, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants