
test(Model): reworked model.js test #329

Closed
wants to merge 1 commit into from

Conversation

mbroadst
Contributor

  • cleaned up test code to be as asynchronous as possible
  • used chai/chai-as-promised to make tests more readable
  • introduced test-fixture to properly track/clean/drop tables
    throughout all test runs
  • made sure to wait for all table creation before operating on
    those tables

You'll notice these tests take far longer to run. That's because previously thinky's tests did not actually wait for table creation (and frankly did not do any cleanup after test runs). This issue should track the progress of table creation speed (the bottleneck): rethinkdb/rethinkdb#4746
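The test-fixture idea described in the commit message can be sketched roughly like this (TestFixture and its method names here are illustrative stand-ins, not the actual test/test-fixture.js API):

```javascript
// A rough sketch of the fixture idea: track every table a test creates,
// then drop them all during teardown so no junk is left in the database.
class TestFixture {
  constructor() {
    this.createdTables = new Set();   // tables created during the run
  }
  trackTable(name) {
    this.createdTables.add(name);
  }
  dropTables(dropFn) {
    // drop every tracked table, then forget them all
    const drops = [...this.createdTables].map((name) => dropFn(name));
    this.createdTables.clear();
    return Promise.all(drops);
  }
}

// Usage with a stand-in drop function (a real fixture would issue tableDrop
// queries against rethinkdb instead):
const fixture = new TestFixture();
fixture.trackTable('users');
fixture.trackTable('posts');
const dropped = [];
fixture.dropTables((name) => {
  dropped.push(name);
  return Promise.resolve(name);
});
```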

@mbroadst
Contributor Author

I forgot to mention: these tests are basically the same as in model.js, however I've reduced a lot of copy-pasta, cleaned it up a bit, etc. I wanted to keep it in a separate file so we could discuss any potential issues (as there were a few last time).

My goal is to come to some conclusions about converting the rest of the test suite over to some sort of consistent style/format. I would also absolutely love it if we could come to an agreement on a .jshintrc that could be added to the project.

@mbroadst mbroadst force-pushed the new-model-tests branch 2 times, most recently from 82e3785 to 42d943b Compare August 24, 2015 23:16
@marshall007
Contributor

  • made sure to wait for all table creation before operating on those tables

I think we should probably still have some tests (maybe all?) that don't wait for Model.ready(), especially if that's a lot faster. Since thinky is supposed to support executing queries on a model before the table is ready, it's useful to include tests that verify requests are queued up properly.
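The queue-until-ready behavior being referred to can be sketched with a toy model (QueuedModel is an illustration of the idea, not thinky's actual internals):

```javascript
// Toy illustration of queueing writes until the table is ready.
class QueuedModel {
  constructor() {
    this._pending = [];
    this._tableReady = false;
  }
  // simulate tableCreate finishing: flush everything that was queued
  markReady() {
    this._tableReady = true;
    this._pending.forEach((run) => run());
    this._pending = [];
  }
  insert(doc, results) {
    const run = () => results.push(doc);
    if (this._tableReady) {
      run();                       // table exists: run immediately
    } else {
      this._pending.push(run);     // queued until the table exists
    }
  }
}

const model = new QueuedModel();
const stored = [];
model.insert({ id: 1 }, stored);   // table not ready: nothing stored yet
model.markReady();                 // flush: the queued insert now runs
```

A test in this spirit would assert that nothing is stored before the ready signal and that everything queued runs afterwards.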

@mbroadst
Contributor Author

@marshall007 yeah, I agree, and that's generally possible by passing in { init: false }. The real problem with the current design, imho, is that far too much happens in thinky as a side effect: there's a distinct lack of separation between declaring what your schema looks like and actually creating that schema in the database (one example is the implicit promises set in motion as part of creating an association, where an internal table schema is automatically created and executed).

The tests where I added the ready() calls here all at some point depend on the table being ready (especially in order to clean it up); the speed was coming from a sort of fire-and-forget mentality. In fact, I had opened another issue against rethinkdb because tearing down the table after the existing queries causes tables to be orphaned with the current version of rethink.

Finally, considering these are the model tests, there is probably just going to be more of this type of behavior here than in, say, the query tests, where we might set up a bunch of sample data and just query away without needing to set up/tear down tables.

@marshall007
Contributor

@mbroadst agreed concerning the "fire-and-forget" stuff, it's nice to have the tests encapsulated the way you've done with test/test-fixture.js. That said, I still don't see the value in waiting on Model.ready() everywhere (specifically in the Batch insert and Insert tests).

@mbroadst
Contributor Author

@marshall007 well, to be honest, adding the ready() there is just a way to inform the reader of the test that that is exactly what's happening. In reality, or at least in my experience so far, you will wait just as long either way, since rethinkdb has to create the table before you can insert into it. I haven't seen any speed improvement (at least in Batch insert) from removing the calls to ready().

By all means fork this PR and improve upon it, I'm very open to discussion.

@neumino
Owner

neumino commented Aug 25, 2015

I don't have much time today (and probably this week), so I just gave the pull request a quick look, but I'm not sure I understand what this PR is supposed to bring.

A few quick/random thoughts:

  • We used to create one model per test, and the truth is that while it was prettier, it blows because it takes forever to run the tests. Typically, this branch needs 2m40s instead of one minute to run the tests on wercker.
  • chai is not needed. It's one more dependency, and more verbose tests...
  • The tests do not wait for the table to be ready because thinky does it under the hood.
  • Test fixtures are a good idea. initializeDatabase has no reason to return a promise, though.
  • Tests are not strictly the same in my opinion. I'll write more comments on the code later.

I'm aware that the tests are messy and that we could refactor some code (like test-fixtures), but this PR doesn't seem to simplify things or make developing easier.

@mbroadst
Contributor Author

To answer those points in order:

  • "I'm not sure I understand what this PR is supposed to bring": if you would like me to be more verbose in my commit message, please let me know which of those points is unclear to you. The primary goals of this are to:
    • prefer asynchronous behavior everywhere it was previously missing (no more using done() when you could just return a promise)
    • generally clean up the test code, removing fully duplicated tests and reducing the line count by reusing tests that were just copy-pasted for, e.g., different join types
    • use chai to improve readability
    • add the TestFixture to ensure that thinky's tests do not leave all sorts of junk in my local database when I run them
  • I haven't created any more models than exist in test/model.js. It's not just an issue of being pretty; it's an issue of consistency, imho. I've already noted that the rethinkdb guys are handling the speed issue, and adding 1m40s to a test run on wercker seems inconsequential if it guarantees consistency/integrity (the 1200+ tests for sequelize run ~24 variants at ~2.5 min apiece; this is the nature of testing).
  • I have no idea why you are so against chai :) It is an extra dependency, but only for the tests (this will have zero effect on the end user), and it in fact makes the tests far more readable in some cases (no more try/catch + done(); instead I can use chai-as-promised to write a single human-readable string that tells me what the test is doing). I'm not sure I've ever actually spoken with someone before who thought more verbose tests were a bad thing...
  • thinky is not complete in its support for waiting for the table, as evidenced by my need to add it everywhere to ensure that tables were properly created before trying to remove them. Also, as I've mentioned above, adding the table-ready call here makes it very explicit (a good thing) that the table is ready before trying to operate on it. The fact that thinky does this stuff in the background (through side effects) is, I think, a bad thing, because it's more cognitive overhead for the implementor. Adding ready() where it's been added has a beneficial effect in some places and no effect in others, but this way it is always very clear that it's a required step.
  • okay, sure
  • The tests are definitely the same. I removed a number of duplicate tests, and even added a few extra tests for completeness' sake.

As for the comment "doesn't seem to simplify things or make developing easier", can you please expand upon that? Because I strongly disagree.

});

return model.ready()
.then(function() { expect(model).to.exist; });
Owner

model is immediately defined. There's no need to wait for the promise returned by ready().

describe('ensureIndex', function() {
afterEach(function() { return test.dropTables(); });

it('should add and ensure an index', function() {
Owner

done?

@neumino
Owner

neumino commented Aug 26, 2015

I don't have much time tonight, but still did a full pass (I think) on the test file.

Thanks @mbroadst, I appreciate the effort, but there are a few issues with the tests. They are not strictly equivalent (or at least not testing the same set of things). The prototype chain in thinky is really special, and those tests were actually meaningful.

The comments about equality, to.be.undefined, etc. are one reason I don't like chai. I don't want to add a test library that does fancy things like to.be.undefined when no one knows how it works or what it tests.

I'm still the one who has to run the tests the most (since I'm still the main one writing code for thinky), and honestly I don't want to have to google how chai works every time, or write a script to be sure that it will test what I want.

The extra time it takes to run the tests may not be an issue for most people, but as the one who always runs the tests before releasing, and often while adding features, this would be a huge pain/downside for my interest in working on thinky.

@neumino
Owner

neumino commented Aug 26, 2015

Arg you guys already replied to my comments. I have to run now. I'll try to catch up tonight (if it's not too late) or tomorrow.

@mbroadst
Contributor Author

afaict the issues you brought up are quite easily solved. Your discomfort with chai can be overcome (I'll just revert to assert, even though I think it's stretching it a bit far to say that nobody reading chai would have any idea what it's doing; "1,006,653 downloads in the last month" on npmjs completely disagrees with you here).

As for the time it takes to run the tests, that's just how long it has to take until the guys at rethink optimize the tableCreate process, I think. IMHO an ORM/ODM has to be incredibly precise in its testing, specifically in making sure that tests run from a clean slate and in a reproducible manner. I'm adding roughly 1 min (in fact I think it's more like ~40s) to your test run here, but you make it seem like we're talking about 10 min extra (and tbh if it were 10 min I'd still be for it).

@marshall007
Contributor

The extra time it needs to run the tests may not be an issue for most of the people ...

@mbroadst if the database/table cleanup is adding additional overhead, perhaps we could add a command-line argument to disable it? That way the CI server can run in "no-cleanup" mode for performance and/or @neumino's sanity :)
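A minimal sketch of such a switch; the environment-variable and flag names (THINKY_TEST_NO_CLEANUP, --no-cleanup) are invented for illustration:

```javascript
// Gate test-suite teardown behind an env var or CLI flag (names invented).
const noCleanup =
  process.env.THINKY_TEST_NO_CLEANUP === '1' ||
  process.argv.includes('--no-cleanup');

function maybeDropTables(fixture) {
  // skip teardown entirely when running in no-cleanup mode
  return noCleanup ? Promise.resolve('skipped') : fixture.dropTables();
}

// stand-in fixture for illustration; a real run would pass the TestFixture
const fakeFixture = { dropTables: () => Promise.resolve('dropped') };
maybeDropTables(fakeFixture);
```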

@mbroadst
Contributor Author

@marshall007 we certainly could, though I believe it would be non-trivial.

Again I point to Sequelize. They release regularly with only two official maintainers, and a test suite on travis that runs 25 variations for a grand total of 52 min 23 sec of testing. It is confusing to me that we are discussing how expect(undefined).to.be.undefined is "insane", while at the same time saying that a 2 minute overall test time is far too great to handle for release purposes.

Not to mention, when tableCreate is optimized in rethinkdb it's likely to be incredibly fast (compared to the 2-3s it takes now). In fact, one could test against 2.0.4, since the slowdown is specific to the addition of the high-availability (raft) code; I'm sure you would find that the tests are only marginally slower in that case.

@neumino
Owner

neumino commented Aug 28, 2015

Here's a list of actionable items for this PR:

  • Please add done in the arguments even if it works when you return a promise. Typically, on line 742, one could be led into thinking that doc.validate() returns a promise. While that can be the case in some situations, it's not here.
  • Please remove chai. It's an extra dependency that is not required. assert is part of core, works fine, and I happen to be familiar with it. I'm not familiar with chai, and since I still maintain the code, it's more work for me. An example of why I think chai doesn't do a good job is the description of eql (http://chaijs.com/api/bdd/#eql): it says "assert that the target is deeply equal to value", but it's typically different from assert.deepEqual since it tests for all keys in the object and its prototype chain.
  • Have the fixture be able to create a model and store whether the table under the hood is ready or not. Have this method take an extra argument to match {init: false}. Models should be recreated; tables should not. Tests should take the same time or be faster, since we currently recreate tables per "group" of tests (describe), if I'm not mistaken.
  • Fix the tests to test the same things. You can add extra tests; the new tests should supersede the old ones. The tests about strict equality, deep equality, etc. are important.
  • Do not wait for the model to be ready before firing queries. Queries are queued until the model is ready; this is by design. It's dev-friendly. In prod, it is supposed to be a no-op. If it is not, you are probably crazy-ish. In prod you are supposed to pass {init: false} and run a dummy script with your models to create the indexes (though I admit I never wrote about that). Typically, this use case sucks if you wait for your model to be ready:
    • Have a dataset/models already in the database.
    • Create a new experimental relation between two models (and therefore a new index) that is not required for real traffic, only for experiments.
    • Launch the index creation.
    • Kill/restart your node apps.
    Now your app won't start serving until your index is ready. In this case you push all your traffic onto a few existing nodes (if some didn't die), or you have no node left to serve the traffic.

Then it should be good, I think. I don't really like having a for loop generate tests, because it's more painful to isolate a test when it fails, but that's fine, I can work around it.
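The production pattern neumino describes (declare models with {init: false} so startup never blocks, and create tables/indexes out of band with a one-off script) can be sketched with stubs; createModel and runSchemaScript here stand in for thinky's real behavior and are not its actual API:

```javascript
// Stubbed sketch of the { init: false } pattern.
function createModel(name, schema, opts = {}) {
  const model = { name, schema, initialized: false };
  if (opts.init !== false) {
    // stands in for the tableCreate/indexCreate I/O thinky would perform
    model.initialized = true;
  }
  return model;
}

// App startup: with init: false, declaring the model performs no I/O,
// so the app can start serving immediately.
const User = createModel('User', { id: String }, { init: false });

// One-off deploy script: do the schema work out of band, once.
function runSchemaScript(model) {
  model.initialized = true;   // stands in for creating tables/indexes
  return Promise.resolve(model);
}
runSchemaScript(User);
```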

@mbroadst
Contributor Author

Please add done in the argument even if it works when you return a promise. Typically on line 742, one could be led into thinking that doc.validate() returns a promise. While it can be the case in some situations, it's not here.

you clearly saw that as needless here. Sure, in cases like 742 I shouldn't presently return the validation function (though I believe that in the future validations should be a promise chain, but more on this later), but I don't agree that all tests, bar none, should use done(). Returning a promise as part of a test is idiomatic mocha; it saves on boilerplate .then(function() { done(); }); code littering your test suites (and making them harder to read through, imho).

Please remove chai. It's an extra dependency that is not required. assert is part of core, works fine and I happen to be familiar with it. I'm not familiar with chai, and since I still maintain the code, it's more work for me. An example of why I think chai doesn't do a good job is the description of eql (http://chaijs.com/api/bdd/#eql): it says "assert that the target is deeply equal to value", but it's typically different from assert.deepEqual since it tests for all keys in the object and its prototype chain.

I've already agreed to do this, but I'm glad we finally dragged the "I just don't like it" out of you 😄. In response to the added point about deep equality: that's another matter of opinion, but frankly I believe chai's take on it to be the "correct" way. When you do this in your tests, you are saying that a plain object { id: 'foo' } is exactly the same, deeply, as the returned Model, which is absolutely not the case. If you were to actually compare those two it would fail, and for non-obvious reasons, because the prototypes are different. I have been bitten by this countless times in many different JS projects; it can lead to massive wastes of time.

Have the fixture be able to create a model and store whether the table under the hood is ready or not. Have this method take an extra argument to match {init: false}. Models should be recreated; tables should not. Tests should take the same time or be faster, since we currently recreate tables per "group" of tests (describe), if I'm not mistaken.

You are indeed mistaken. There is only one set of tests that does this currently (the "Models" describe section), and while you could certainly rewrite all the tests to be that way, it wasn't on my list of things to do. It's a non-trivial amount of work to convert all these tests to run that way, and frankly it's generally much easier for a developer to fully create the model in the test they are working on and "just not think about it" (rather than setting test.models.MyModel). Again, if rethinkdb's tableCreate took a shorter amount of time (and it will), you wouldn't even notice a difference, but your tests would actually be clean and isolated.

Do not wait for model to be ready to fire queries. Queries are queued until the model is ready, this is
by design. It's dev friendly. In prod, it is supposed to be a no op. If it is not, you are probably crazy-ish.
In prod you are supposed to pass {init: false} and run a dummy script with your models to create the
indexes (though I admit I never wrote about that).

This "behind the scenes" work is incomplete. The original reason I peppered the code with calls to ready() was that I ran into a few cases (in particular when join tables are automatically created) where the queries didn't in fact wait for the table to be ready, and in turn, when the test finished and attempted to clean up and remove the table, it hadn't actually finished being created yet. This is precisely what is wrong with hiding this work behind the scenes, and perhaps we should open a new issue to discuss this in particular. I believe that the default behavior of createModel should be to not touch the database at all; as you say, the tables are more than likely already created, and you're just informing thinky about how it needs to build its queries.

Anyway, in conclusion you:

  • don't want the preference towards asynchronous behavior
  • don't want chai
  • don't want to clean up and recreate tables between test runs
  • don't want to wait for tables to be ready before test run

additionally, you don't really want the reduction in duplicate code from using for loops, but you can live with it.

That basically denies all the points this PR was created to address. I'm fine with that; you're the maintainer. But I can't help but feel that with a slightly more open mind you could dramatically reduce your workload here.

Cheers
