Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating js-db to 5.x.x (integrating RocksDB and DB-native concurrency control) #67

Merged
merged 7 commits into from
Aug 19, 2022

Conversation

emmacasolin
Copy link
Contributor

@emmacasolin emmacasolin commented Jul 25, 2022

Description

Updating CI/CD including:

  • Continuous benchmarking
  • Staging branch
  • Advanced type checking

Related:

Tasks

  • 1. Create new staging branch
  • 2. Synchronise changes from js-logger
  • 3. update @matrixai/db to 5.0.1,
  • 4. fix API changes for DB
  • 3. Fix failing tests
  • 4. Integrate fast-check and update any relevant tests.

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@emmacasolin emmacasolin added the procedure Action that must be executed label Jul 25, 2022
@emmacasolin emmacasolin self-assigned this Jul 25, 2022
@emmacasolin
Copy link
Contributor Author

The problem with updating js-logger here (which we need to do in order to update it in PK) is that it also means we need to update the DB, but this also means incorporating all of the new (breaking) changes from 5.0.0 of js-db. I think I've made the correct changes to the source code, but there are several failing tests. This is to be expected though, since js-db has undergone some major changes. Most of the failures are to do with concurrency, which makes sense given the focus of most of the recent changes to the db.

@CMCDragonkai
Copy link
Member

Hold on this PR, the js-db update is more pressing first. And we don't need to update the EFS yet. The js-db 5.0.0 is first being updated in PK. EFS can stay on the original js-db.

@CMCDragonkai
Copy link
Member

What you can do is to merge in the non-js-db and non-js-logger changes in first. There's just a bunch of CI/CD stuff.

Can you split up the commit into a separate PR for just the staging and ci/cd and other chores.

@emmacasolin
Copy link
Contributor Author

Hold on this PR, the js-db update is more pressing first. And we don't need to update the EFS yet. The js-db 5.0.0 is first being updated in PK. EFS can stay on the original js-db.

Ok. EFS won't work inside PK if it's running the old DB though, since the old DB uses the old logger, which conflicts with the new logger used by the new DB in PK, so it will need to be updated here eventually.

@CMCDragonkai
Copy link
Member

Actually you can do it. You just won't be able to pass a "child" logger into the DB that EFS is using. So you can just remove that parameter entirely.

@emmacasolin
Copy link
Contributor Author

Now that I've updated the logger for all of the other libraries though we can no longer pass a logger through to any of the other libraries from EFS. I think this is only an issue for js-workers though, which is used only in the benchmarks and in one of the tests (efsWorker.test.ts), so we could just not pass a logger into those.

@emmacasolin emmacasolin force-pushed the feature-logger-update branch from c3b0381 to 4789d8e Compare July 25, 2022 03:14
@emmacasolin
Copy link
Contributor Author

ee0b9e2 now contains all of the CI/CD changes but doesn't update the logger or DB. As such, it doesn't pass a logger through to any of the workers since js-workers uses a different version of js-logger. All of the tests should be passing on that commit, but they don't pass on the most recent commit due to concurrency changes from the latest DB.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Jul 25, 2022

Once #68 is merged, rename this PR to focus on js-db 5.x upgrade. Logger changes should be eliminated here. Since that will be put into #68. Although once rebased, you can also remove the comment.

@CMCDragonkai
Copy link
Member

@emmacasolin please update to using 1.3.6 of js-workers. It is now compatible with swc.

The quickfix is to swap from using new X(); to new this(); wherever we have static async creators.

This needs to be fixed in EncryptedFS.ts. Change to:

    const efs = new this({

To make it usable. Then swc should be enabled.

@CMCDragonkai
Copy link
Member

The same fix has to be applied to PK as well under staging branch. Can you do it at the same time as you do you js-logger updates there.

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Jul 28, 2022

0084570 fixes #69. There still are failing tests in this PR but it passed when tested on top of staging.

@CMCDragonkai
Copy link
Member

Creation of temporary directory with os.tmpdir() should still be using path.join as normal.

@CMCDragonkai
Copy link
Member

@emmacasolin this PR needs to be split up. The js-db changes should be moved into a new PR.

This PR should only focus on synchronising changes, updating the js-logger, and the fix that @tegefaulkes just added.

@CMCDragonkai
Copy link
Member

I suspect that the test failures are coming from the js-db changes. The js-db update should be postponed till later. @tegefaulkes

@CMCDragonkai
Copy link
Member

Actually I was confused, the recent fix should go to #68 and #67. Will rename this PR to suit.

@tegefaulkes can you reset the commit you just pushed?

@CMCDragonkai CMCDragonkai changed the title Synchronising changes from js-logger Updating js-db to 5.x.x (integrating RocksDB and DB-native concurrency control) Jul 28, 2022
@CMCDragonkai
Copy link
Member

CMCDragonkai commented Jul 28, 2022

When this is being worked on all:

      // @ts-ignore - version of js-logger is incompatible (remove when js-db updates to 5.* here)

Should be removed from the source, tests and benchmarks as js-logger will be compatible and we'd have completed the upgrade.

Unlike in Polykey where it has a commit that we can revert, this PR won't have that convenience.

@tegefaulkes tegefaulkes force-pushed the feature-logger-update branch from 0084570 to db877c9 Compare July 28, 2022 05:39
@tegefaulkes
Copy link
Contributor

Removed the fix commit from here and pushed it to staging.

@tegefaulkes
Copy link
Contributor

Fast-check can be used to combine a number of concurrency tests together. For example, concurrent file writes using a number of methods can be the same test. for these tests the the result matters less than avoiding transaction conflicts. we still want to check that something has been written however.

@tegefaulkes
Copy link
Contributor

The INodeManager is using a LockBox along side the the database transactions. I think we will want to make use of the transaction locks now.

FileDescriptor has its own Lock for locking on cursor updates. This one should be left alone.

@tegefaulkes tegefaulkes force-pushed the feature-logger-update branch from 8b779f3 to 90454cb Compare August 18, 2022 08:20
@tegefaulkes
Copy link
Contributor

This should be good to merge now. Pending review if desired.

@CMCDragonkai
Copy link
Member

Can you rerun the benchmarks and show the comparison here. Since the db has changed, I'm expecting it should be alot faster.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 19, 2022 via email

@tegefaulkes tegefaulkes force-pushed the feature-logger-update branch from 90454cb to b20a89f Compare August 19, 2022 02:21
@CMCDragonkai
Copy link
Member

Ok so for all the concurrent tests where you have replaced the test, change to using a regular expression to match the original predicates on the output. I've given some examples above.

For the new tests, this can also be done if you see an easy way to write the regular expression. Prefer this over simply checks like 3.

Then proceed to merge to staging.

On the staging branch, regenerate docs and redo the benchmarks.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 19, 2022

There's a documented way of doing delaying the actual async functions for the fast check scheduler: dubzzz/fast-check#564. See the relevant commits. I also asked a question upstream to see how it should be done.

The docs also got moved around recently and the relevant commits point to: https://github.com/dubzzz/fast-check/blob/main/packages/fast-check/documentation/RaceConditions.md

There's no inbuilt helper for this.

The docs suggest:

const scheduleCall = <T>(s: Scheduler, f: () => Promise<T>) => {
  s.schedule(Promise.resolve("USE THIS AS A LABEL...")).then(() => f());
}
// Calling doStuff will be part of the task scheduled in s
scheduleCall(s, () => doStuff())

It's very similar to what you already did with the gateFactory.

Using the above method you also don't need a separate factory.

The last interesting thing is fc.scheduledModelRun. But there is no documentation how to use it. It seems to provide a way of model based testing which requires a sort of state machine model first then you're essentially doing transitions in lock step between the abstract and the real machine. This can be used for EFS, but it would only be done at a very high level considering the entire FS state as a model. This would be a completely different set of tests. Best used for more "consistent" systems (like distributed databases). Not really the EFS in this case. We could apply this to certain stateful domains inside PK though. For example the OCL concepts and ACL.

@tegefaulkes
Copy link
Contributor

RIght, just need to squash and checks. Then this will be ready to merge.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Aug 19, 2022

Did you check the rest of the tests you replaced to ensure that no predicates are missing? I didn't go through all of them.

tegefaulkes and others added 5 commits August 19, 2022 16:01
- `@matrixai/db` 5.0.2 supports `ToString` lock requests
This was needed for the very specific case where `_open` was used to create a file during an existing transaction.
fix: `tests/inodes/INodeManager.test.ts` change to using `await expect().rejects.toThrow`
This just adds some tests that hit the API with a bunch of concurrent operations with random order. This is to check for any db transaction errors.
@tegefaulkes
Copy link
Contributor

Yes, I went through all of the ones I updated.

@tegefaulkes tegefaulkes force-pushed the feature-logger-update branch from b6d426a to 55b2718 Compare August 19, 2022 06:04
@tegefaulkes
Copy link
Contributor

Cleaned up the commit history.

@tegefaulkes
Copy link
Contributor

Just waiting on CI now.

@CMCDragonkai
Copy link
Member

I've asked about the integrating timeouts between fast check and jest: dubzzz/fast-check#3084

@tegefaulkes tegefaulkes merged commit a3ec495 into staging Aug 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
procedure Action that must be executed
Development

Successfully merging this pull request may close these issues.

3 participants