Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runValidators: true for findOneAndUpdate is orders of magnitude slower than save(). #8093

Closed
neuro-sys opened this issue Aug 22, 2019 · 6 comments · Fixed by #8192
Closed

Comments

@neuro-sys
Copy link

Do you want to request a feature or report a bug?
It can be considered a bug, or a performance issue.

What is the current behavior?
Using Model#findOneAndUpdate(filter, update, { runValidators: true }) is orders of magnitude slower than using Model#save() on a relatively large document. That Model#findOneUpdate() being slower than Model#save() is rather unexpected.

If the current behavior is a bug, please provide the steps to reproduce.

const mongoose = require('mongoose');

const { Schema } = mongoose;

const start = async () => {
  const connection = await mongoose.connect('mongodb://localhost:27017/issue-db', {
    useNewUrlParser: true, // Supress deprecation
    useFindAndModify: false, // Supress deprecation. Note having it true doesn't make any difference.
  });

  await mongoose.connection.dropDatabase();

  const issueSchema = new Schema ({
    myArray: [{
      valueA: { type: Number, required: true },
      valueB: { type: Number, required: true },
      valueC: { type: Number, required: true },
    }],
  });

  const IssueModel = mongoose.model('Issue', issueSchema);

  const dummyArray = new Array(10000).fill().map(val => ({
    valueA: Math.random() * 1000,
    valueB: Math.random() * 1000,
    valueC: Math.random() * 1000,
  }));

  console.time('findOneUpdate()#runValidators:true');

  await IssueModel.findOneAndUpdate({}, {
    myArray: dummyArray,
  }, { new: true, upsert: true, runValidators: true });

  console.timeEnd('findOneUpdate()#runValidators:true');

  console.time('findOneUpdate()#runValidators:false');

  await IssueModel.findOneAndUpdate({}, {
    myArray: dummyArray,
  }, { new: true, upsert: true, /* runValidators: true */ });

  console.timeEnd('findOneUpdate()#runValidators:false');

  console.time('save()');

  const issue = new IssueModel({
    myArray: dummyArray,
  });

  await issue.save();

  console.timeEnd('save()');
};

start()
  .catch(err => {
    console.error(err);
    process.exit(1);
  });

Outputs:

$ node index.js
findOneUpdate()#runValidators:true: 33294.824ms
findOneUpdate()#runValidators:false: 2120.082ms
save(): 1504.498ms

What is the expected behavior?

Expected behaviour is that the Model#findOneUpdate() with runValidators: true option not to take any longer than Model#save(), let alone 22 times more in the provided example.

I suspect the bottleneck may be here, but I haven't profiled where the time is spent most.

What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.

$ node --version
v12.4.0
$ cat package.json | grep \"mongoose\"
    "mongoose": "5.6.10"
$ mongod --version | grep "db version"
db version v4.0.3
$ mongo --version | grep "shell version"
MongoDB shell version v4.0.3

@sgatade
Copy link

sgatade commented Aug 26, 2019

findOneAndUpdate() with runValidators:true WILL take more time since each document will be validated during the execution.

But I had similar results like yours and I do agree that the difference between save() and findOneAndUpdate() with runValidators:true is too large to ignore...

Cheers!
SG

@birdofpreyru
Copy link
Contributor

Same here, but even wierder: I have my schema like

new Schema({
  data: {
    type: Object,
  },
  // A bunch of simple fields.
});

and findOneAndUpdate() becomes slow specifically when data is large object, even it looks to me that no validation is necessary here at all! All other operations (object creation, lookup) work fast.

I also did a bit of profiling and found that all time is wasted inside this call: https://github.com/Automattic/mongoose/blob/master/lib/query.js#L3432, at which point, I also got a question, shouldn't it be wrapped into if (runValidators) condition? Because it is get called for me even if I run findOneAndUpdate() with runValidators: false option set.

@sgatade
Copy link

sgatade commented Aug 27, 2019

Quick investigation on @birdofpreyru 's point above...

image

Looks like adding if() does help!

Code change as below in query.js...

image

Cheers!

SG

@sgatade
Copy link

sgatade commented Aug 27, 2019

@birdofpreyru Can you add the if() line in your local repo and run the same test again with objects?

Cheers!

SG

@birdofpreyru
Copy link
Contributor

@sgatade yeap, I checked it, that without that call it works fast for me. But I also believe there is some performance bug inside the validation code: it is clearly so slow because it traverses all paths in a large JS document I put inside update, while in my case there is no need to touch it at all: in the Schema that field is declared as Object with no validation, thus its internal structure does not matter (and apparently the validator inside .create() does take it into account properly).

@vkarpov15 vkarpov15 added this to the 5.x Unprioritized milestone Sep 4, 2019
@birdofpreyru
Copy link
Contributor

birdofpreyru commented Sep 22, 2019

Just a heads up, as of mongoose@5.7.1 this works a way better, but still not the optimal. I have a test which at the time of my previous message timed out after 5 seconds, if I used .findOneAndUpdate(..) in the tested logic; and it worked faster than 1 second if I worked that around with a pair of .findOne(..) and .create(..) calls.

I just had a brief new look into this, and with mongoose@5.7.1 my test takes

  • 230 ms with .findOneAndUpdate(..)
  • 53 ms with the workaround by .findOne(..) and .create(..).

It was not an improvement of validation, it was because of the fix for runValidators option check, which defaults to false. If I set runValidators: true option, my test still times out after 5 seconds.

birdofpreyru added a commit to birdofpreyru/mongoose that referenced this issue Sep 23, 2019
…function logic

Before this commit the `flatten(..)` function failed to deliver what it
promised. Namely, it entered into Mixed paths of objects.

Update validator, on its side, did not pass the casted doc schema into
`flatten(..)`. If the casted doc contained a large Mixed field, all its
paths were added into the list of updated paths. They were lated ignored
by now removed check for schemaPath type, but performance was already
hurt.

This commit makes sure that inner sub-paths of Mixed paths are not
included into the array of paths at all, thus no further checks of that
are necessary, and the performance is restored.

Fixes Automattic#8093
vkarpov15 added a commit that referenced this issue Sep 26, 2019
[#8093] Fixes performance of update validator, and flatten function logic
@vkarpov15 vkarpov15 removed this from the 5.x Unprioritized milestone Oct 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants