Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial implementation #17

Merged
merged 33 commits into from
Feb 13, 2023
Merged

feat: initial implementation #17

merged 33 commits into from
Feb 13, 2023

Conversation

achingbrain
Copy link
Member

@achingbrain achingbrain commented Dec 22, 2022

  • Adds modules for a core library, the interface it implements and an interop suite.

UnixFS support is here: ipfs/helia-unixfs#1 and I've pulled the CLI/RPC parts out to here: ipfs/helia-cli#1

@achingbrain achingbrain changed the title chore: add ci files feat: initial implementation Dec 22, 2022
- Adds modules for a core library, a cli and an RPC client/server
- Creates a `~/.helia` directory on startup
- `helia daemon` starts a node with an RPC-over-libp2p server
- `helia id` prints out node information online or offline
@achingbrain achingbrain force-pushed the feat/initial-implementation branch from 1be2ee9 to 20b0f35 Compare December 22, 2022 16:12
@vmx
Copy link
Member

vmx commented Jan 16, 2023

  • Creates a ~/.helia directory on startup

I wonder if on Linux systems it should really be ~/.helia or rather a XDG Base Directory friendly location. Using $XDG_CONFIG_HOME/helia instead of ~/.helia seems to be the most straightforward thing to do (which would be ~/.config/helia` by default).

There's also $XDG_DATA_HOME, which according to the spec is for "user specific data files" (by default at ~/.local/share/), which could be used for the data. That would probably be the "proper way", but other projects like Chromium e.g. also just store everything at a single directory at $XDG_CONFIG_HOME (except for the cache).

@achingbrain
Copy link
Member Author

Sounds good, though I guess we'd fall back to ~/.helia if the $XDG_* env vars aren't defined.

On Mac OS config/data should be under ~/Library & there's probably an equivalent for Windows too.

@achingbrain achingbrain marked this pull request as ready for review February 2, 2023 23:21
@tinytb tinytb added this to the v1 milestone Feb 3, 2023
Copy link
Contributor

@whizzzkid whizzzkid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First Pass, looks interesting, a few questions before I can form an opinion.

package.json Show resolved Hide resolved
package.json Show resolved Hide resolved
packages/helia/README.md Show resolved Hide resolved
packages/helia/README.md Show resolved Hide resolved
packages/helia/package.json Show resolved Hide resolved
const getFromBitswap = pushable<CID>({ objectMode: true })
const getFromChild = pushable<CID>({ objectMode: true })

void Promise.resolve().then(async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, I would appreciate sticking to either async-await or .then let's not mix these.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you await here it blocks the function. The idea is to pull from bitswap and the wrapped blockstore in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if you don't await an async function, that's essentially firing and forgetting about it, it should be the same as this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly - if you don't await an async function, and that function throws it will cause an unhanded promise rejection which can crash the process. If you .then a resolved promise, and .catch the .then it will not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just wrap it in an additional async/await, no?

async function thatThrows() {
    throw new Error('I just throw!')
}

async function thatWraps() {
    try {
        await thatThrows()
    } catch (e) {
        console.error(e.message);
    }
}

function thatDoesNotWait() {
    thatWraps()
}

thatDoesNotWait()

Since async/await is just syntactic sugar around regular promises, it should just do what regular promises do.

Personally I feel we should not be mixing the two, because:

  • creates dissonance while reading/understand the code.
  • makes debugging lengthy promise chains harder to debug.
  • can sometimes cause the unhandled promise rejection warnings.

I am ok with either .then or async/await (not together ofcourse), However if you feel strongly that mixing is the only right way, let's do it that way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think being consistent with async+await vs .then matters as much as doing the right thing for the code. Personally, I've fixed a few comments @whizzzkid has had regarding consistent async/await but sometimes a .then just feels correct. For syntactic sugar in most languages, it's often required to "reach into the guts" of that sugar to do something more technical. I think that should be okay for special circumstances.

However, when we do "reach into the guts" we should document why it's necessary. i.e. this convo should be documented into why we are doing this Promise.resolve()... instead of simply awaiting things.

Copy link
Member

@SgtPooki SgtPooki Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think pulling the anon function inside the .then out into it's own defined function is a good call regardless of the .then/await decision.

Comment on lines 108 to 126
for await (const cid of cids) {
if (!(await this.has(cid)) && this.bitswap.isStarted()) {
if (options.progress != null) {
options.progress(new CustomEvent<CID>('fetchFromBitswap', {
detail: cid
}))
}

getFromBitswap.push(cid)
} else {
if (options.progress != null) {
options.progress(new CustomEvent<CID>('fetchFromBlockstore', {
detail: cid
}))
}

getFromChild.push(cid)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the order matter here? why not run in parallel?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the order matters here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So can we map to a response array and then run in parallel? that would enable mapping responses to the right index position and make this faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not that straightforward - getMany takes a stream of CIDs and returns a stream of blocks - if you treat the input stream like an array, you'll end up waiting for the input stream to end before you emit any blocks which is likely to harm performance and not improve it.

Really you want the underlying blockstore to control the concurrency here - each blockstore implementation may have different performance constraints, so by calling it's getMany function and passing a stream in, that lets the blockstore control the parallelisation instead.

You can do things like map each input value to a function, then use it-parallel with the ordered option to parallelise, but that essentially transforms one getMany into lots of gets which prevents the blockstore from operating in the way that is most efficient.

One thing to consider here is the link between input and output here is broken - we push data into the pushable as fast as possible, and the blockstore pulls from the pushable - if the blockstore is slow and the input stream is never-ending, eventually the process will fall over.

Anyway - this code is ported from js-ipfs, likely there are some performance improvements that can be made but I'd suggest not blocking this PR on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking, but we should look into this at some point. While order may be important, we may be missing out on a significant performance gain by parallelizing these. Can we re-order them after parallelization?

I imagine the performance gains here would be much more significant than for the 13 appIds I handled at https://github.com/ipfs-shipyard/ignite-metrics/blob/b8c041682ad6f399c91af36ff5a452cec07c88d6/reports/downloadDashboardData.ts#L39-L87

Regardless of blockstore implementation, we should increase our performance as much as possible. This would make a great case-study for some benchmarks in this repo. Slow blockstore.. fast blockstore, spotty blockstore..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While order may be important, we may be missing out on a significant performance gain by parallelizing these. Can we re-order them after parallelization?

Yes, though again not without tradeoffs - see the paragraph about using it-parallel in my comment.

Comment on lines 134 to 137
yield * merge(
this.bitswap.getMany(getFromBitswap, options),
this.child.getMany(getFromChild, options)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a syntax error, yield*

Suggested change
yield * merge(
this.bitswap.getMany(getFromBitswap, options),
this.child.getMany(getFromChild, options)
)
yield* merge(
this.bitswap.getMany(getFromBitswap, options),
this.child.getMany(getFromChild, options)
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default should be what this change is, but it looks like it's being overridden somewhere.

Copy link
Member Author

@achingbrain achingbrain Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is defined by the linting rules which say it can be either though I have to admit I struggle to see why one is better than the other.

That said, I do think they should be consistent. We have async * foo () { ... } everywhere, so we have async function * bar () { ... } everywhere, and consequently yield * foo seems ok to me.

Comment on lines 140 to 164
/**
* Delete a block from the blockstore
*/
async delete (cid: CID, options: AbortOptions = {}): Promise<void> {
await this.child.delete(cid, options)
}

/**
* Delete multiple blocks from the blockstore
*/
async * deleteMany (cids: AsyncIterable<CID> | Iterable<CID>, options: AbortOptions = {}): AsyncGenerator<CID, void, undefined> {
yield * this.child.deleteMany(cids, options)
}

async has (cid: CID, options: AbortOptions = {}): Promise<boolean> {
return await this.child.has(cid, options)
}

async * query (q: Query, options: AbortOptions = {}): AsyncGenerator<{ key: CID, value: Uint8Array }, void, undefined> {
yield * this.child.query(q, options)
}

async * queryKeys (q: KeyQuery, options: AbortOptions = {}): AsyncGenerator<CID, void, undefined> {
yield * this.child.queryKeys(q, options)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proxy might solve these parity methods

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historically the performance of JS proxies has been awful.

packages/interop/test/fixtures/create-kubo.ts Outdated Show resolved Hide resolved
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few requested changes regarding type consolidation and cleanup.. also for Helia, we should switch it to a class instead of an object literal.

README.md Outdated Show resolved Hide resolved
packages/helia/README.md Show resolved Hide resolved
"release": "aegir release"
},
"dependencies": {
"@helia/interface": "~0.0.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of splitting the interface.. I get the dependency hell that can arise from importing helia directly, but i'd rather deal with those dependency issues than the ones that can arise from maintaining two separate packages and having two consumption paths.

packages/helia/src/index.ts Outdated Show resolved Hide resolved
Comment on lines 32 to 56
export interface HeliaComponents {
libp2p: Libp2p
blockstore: Blockstore
datastore: Datastore
}

/**
* Options used to create a Helia node.
*/
export interface HeliaInit {
/**
* A libp2p node is required to perform network operations
*/
libp2p: Libp2p

/**
* The blockstore is where blocks are stored
*/
blockstore: Blockstore

/**
* The datastore is where data is stored
*/
datastore: Datastore
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use a common interface and extend/override/omit/pick

packages/interface/src/errors.ts Outdated Show resolved Hide resolved
Comment on lines 1 to 11
export abstract class HeliaError extends Error {
public readonly name: string
public readonly code: string

constructor (message: string, name: string, code: string) {
super(message)

this.name = name
this.code = code
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should instantiate an Error instance that comes with a .trace either in this base, or the extended error types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's a .trace?

Comment on lines +27 to +41
export interface Helia {
/**
* The underlying libp2p node
*/
libp2p: Libp2p

/**
* Where the blocks are stored
*/
blockstore: Blockstore

/**
* A key/value store
*/
datastore: Datastore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export interface Helia {
/**
* The underlying libp2p node
*/
libp2p: Libp2p
/**
* Where the blocks are stored
*/
blockstore: Blockstore
/**
* A key/value store
*/
datastore: Datastore
export interface HeliaBase {
/**
* The underlying libp2p node
*/
libp2p: Libp2p
/**
* Where the blocks are stored
*/
blockstore: Blockstore
/**
* A key/value store
*/
datastore: Datastore
}
export interface Helia extends HeliaBase {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the change here, why do we need both Helia and HeliaBase types?

export async function createKuboNode (): Promise<Controller> {
return await createController({
kuboRpcModule: kuboRpcClient,
ipfsBin: isNode || isElectronMain ? goIpfs.path() : undefined,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, i'm not a fan of these conditional args. There are so many variables, i'd rather configure the tests to create the kubo node with explicit configurations from the test itself. These conditional args were a large burden for gettig interface tests to pass when pulling kubo-rpc-client out of ipfs-http-client

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this is saying is invoke goIpfs.path() if the environment is node or the electron main process.

What is the burden this causes?

packages/interop/test/fixtures/create-kubo.ts Outdated Show resolved Hide resolved
@achingbrain
Copy link
Member Author

Going to merge this to encourage some downstream experimentation - any subsequent internal refactors can be addressed separately.

@achingbrain achingbrain merged commit 343d360 into main Feb 13, 2023
@achingbrain achingbrain deleted the feat/initial-implementation branch February 13, 2023 15:49
github-actions bot pushed a commit that referenced this pull request Mar 23, 2023
## @helia/interface-v1.0.0 (2023-03-23)

### Features

* add bitswap progress events ([#50](#50)) ([7460719](7460719)), closes [#27](#27)
* add pinning API ([#36](#36)) ([270bb98](270bb98)), closes [#28](#28) [/github.com//pull/36#issuecomment-1441403221](https://github.com/ipfs//github.com/ipfs/helia/pull/36/issues/issuecomment-1441403221) [#28](#28)
* initial implementation ([#17](#17)) ([343d360](343d360))

### Bug Fixes

* extend blockstore interface ([#55](#55)) ([42308c0](42308c0))
* make all helia args optional ([#37](#37)) ([d15d76c](d15d76c))
* survive a cid causing an error during gc ([#38](#38)) ([5330188](5330188))
* update block events ([#58](#58)) ([d33be53](d33be53))
* update blocks interface to align with interface-blockstore ([#54](#54)) ([202b966](202b966))

### Dependencies

* update interface-store to 5.x.x ([#63](#63)) ([5bf11d6](5bf11d6))

### Trivial Changes

* add release config ([a1c7ed0](a1c7ed0))
* fix ci badge ([50929c0](50929c0))
* release main ([#62](#62)) ([2bce77c](2bce77c))
* update logo ([654a70c](654a70c))
* update publish config ([913ab6a](913ab6a))
* update release please config ([b52d5e3](b52d5e3))
@github-actions
Copy link
Contributor

🎉 This PR is included in version @helia/interface-v1.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

github-actions bot pushed a commit that referenced this pull request Mar 23, 2023
## helia-v1.0.0 (2023-03-23)

### Features

* add bitswap progress events ([#50](#50)) ([7460719](7460719)), closes [#27](#27)
* add pinning API ([#36](#36)) ([270bb98](270bb98)), closes [#28](#28) [/github.com//pull/36#issuecomment-1441403221](https://github.com/ipfs//github.com/ipfs/helia/pull/36/issues/issuecomment-1441403221) [#28](#28)
* initial implementation ([#17](#17)) ([343d360](343d360))

### Bug Fixes

* make all helia args optional ([#37](#37)) ([d15d76c](d15d76c))
* survive a cid causing an error during gc ([#38](#38)) ([5330188](5330188))
* update blocks interface to align with interface-blockstore ([#54](#54)) ([202b966](202b966))
* use release version of libp2p ([#59](#59)) ([a3a7c9c](a3a7c9c))

### Trivial Changes

* add release config ([a1c7ed0](a1c7ed0))
* fix ci badge ([50929c0](50929c0))
* release main ([#62](#62)) ([2bce77c](2bce77c))
* update logo ([654a70c](654a70c))
* update publish config ([913ab6a](913ab6a))
* update release please config ([b52d5e3](b52d5e3))
* use wildcards for interop test deps ([29b4fb0](29b4fb0))

### Dependencies

* update interface-store to 5.x.x ([#63](#63)) ([5bf11d6](5bf11d6))
* update sibling dependencies ([ac28d38](ac28d38))
@github-actions
Copy link
Contributor

🎉 This PR is included in version helia-v1.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

This was referenced Jan 8, 2024
achingbrain pushed a commit that referenced this pull request Jan 8, 2024
Bumps [it-last](https://github.com/achingbrain/it) from 2.0.1 to 3.0.1.
- [Release notes](https://github.com/achingbrain/it/releases)
- [Commits](achingbrain/it@it-last-v2.0.1...it-last-v3.0.1)

---
updated-dependencies:
- dependency-name: it-last
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

5 participants