Sharing IPFS node across browsing contexts (tabs) on same origin #3022
Another thing I'm recognizing now is that two different browsing contexts (tabs) could start an IPFS node with different configurations. That needs to be handled in some way.
I have started hacking on this. However, I'm realizing that the API surface is pretty large, which reminded me of js-ipfs-lite, which is a lot slimmer but also well typed (TS ❤️). This, combined with the configuration point above, got me wondering if this may be a good opportunity to consider stripping some of the convenience stuff which could be pulled in separately (maybe even an opportunity to converge code bases?).
I had some discussion with @hugomrdias last night about aegir not yet supporting multiple webpack configs, where he pointed out shiny package.json exports, which could be a better solution going forward. After spending all day yesterday running into walls, I was desperate for some progress, so I have come up with the following plan and would really love your feedback @achingbrain
I think those should not matter for same-origin node sharing unless I'm overlooking something.
Unfortunately those are the most complicated and problematic ones.
Complexity is mostly in having jQuery style APIs where functions take pretty much anything.
Could you please explain what you mean here? I'm not sure I follow.
He actually filled me in on this today, very exciting stuff! We also took the same approach, so things should merge nicely. I don't think it's a distraction; it makes navigating a large codebase like js-ipfs a lot more manageable.
I don't think that matters, because the client mostly forwards ArrayBuffers / Blobs to the worker thread, which then runs the corresponding API calls.
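To make that concrete, here is a minimal sketch of what such forwarding could look like; the message shape (`method` / `args` / `replyTo`) and the worker script name are illustrative assumptions, not the actual protocol:

```ts
// Hypothetical sketch: the tab forwards raw bytes to the worker, which runs
// the real IPFS API call and replies with the resulting CID string.
const worker = new SharedWorker('ipfs-worker.js') // illustrative script name

function add(bytes: Uint8Array): Promise<string> {
  return new Promise((resolve) => {
    const { port1, port2 } = new MessageChannel()
    port1.onmessage = (event: MessageEvent) => resolve(event.data.cid)
    // Transfer the buffer and the reply port instead of copying them.
    worker.port.postMessage(
      { method: 'add', args: [bytes], replyTo: port2 },
      [bytes.buffer, port2]
    )
  })
}
```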
This is exciting! I'm looking forward to contributing here more as time permits. In the meantime, the minimal API alluded to above is very close/similar to ipfs-lite. There is a (currently on the back burner) TS implementation here: https://github.com/textileio/js-ipfs-lite. Just for reference!
I have outlined some high level questions I would love to get answered by surveying our community / users. I think it would help to ensure that the effort is focused on user needs (in the browser context) as opposed to API compatibility with IPFS in runtimes which have different constraints.
If you spot something overlooked or missed please let me know.
@carsonfarmer thanks for joining the thread and the prompt to look closer at the js-ipfs-lite API. I have a few questions/remarks based on what I see there, which I would like to understand:
I would say that, in the interest of sharing the node across multiple browsing contexts, a less modular design might be ideal. Additionally, while there are other considerations for the modular design (it allows much smaller packages and dynamic imports if done properly), the bundle size savings are so large that the benefits from reuse would certainly be larger than the benefits of smaller one-off bundles. So to answer the question: not that important for us (though I would still advocate for a modular design in terms of its development). As far as Buffer goes, that is purely convenience. All the existing tooling expects/supports it. But we (Textile) internally have been trying to move most of our code and APIs to Uint8Array, so we're happy to support that push here as well.
Thanks for the discussion @Gozala. Here are details as we discussed :)
The only configuration option that we rely on is passing a rendezvous server as a swarm address. Previously this was the only way to connect to a rendezvous server, but another way to do this has since been added: #2508 (comment)
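For reference, passing a rendezvous server as a swarm address looks roughly like this (hedged sketch; the multiaddr below is illustrative, use whatever rendezvous server you actually run):

```ts
import IPFS from 'ipfs'

// Sketch: the rendezvous server supplied as a swarm address in node config.
const ipfs = await IPFS.create({
  config: {
    Addresses: {
      // Illustrative star/rendezvous multiaddr
      Swarm: ['/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star']
    }
  }
})
```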
We use the following:
I believe that's all, but I might be missing something. Another thing to note is that we don't use garbage collection right now; it was only recently introduced into js-ipfs. Instead we assume that all data we add to ipfs will just remain in storage, and sync it from the network if it's not there.
We are currently in a situation where our software package (3box) is run on multiple tabs across multiple domains. There are two main things that could be solved using worker threads:
Don't have strong opinions here. On the security of things like the pinning API across multiple origins, I tend to lean towards the side of usability for now rather than origin-specific pins. This gives us the chance to experiment more.
The question of codecs is important. Initially a requirement from our side would be that we can provide a custom codec to the worker. We are working on codecs for signed and encrypted data, which we will start using as soon as they're ready.
Thanks @oed for taking the time to discuss this with me and for publishing more details here. I would like to follow up on a few points:
This appears to be a known issue (ipfs-shipyard/ipfs-pubsub-room#28), so maybe we should revive the conversation about this in that thread.
I put some more thought into this after our conversation, and came to the conclusion that in the first iteration it would be best to do the following:
This should keep pretty much the same flexibility as IPFS has today (other than WebRTC) and overcome:
@oed You also mentioned some inconvenience with using certain API(s); as I recall it was the fact of having to iterate over things. I'm afraid I lost those details, could you please share the specifics? It would be a real help in informing our decisions.
I believe that's it, but I haven't looked too deeply at the pubsub-room implementation. I don't have a strong opinion on the message-port stuff.
Previously we did:

```js
const cid = (await ipfs.add(buffer)).hash
```

Now we have to do:

```js
const cid = (await (await ipfs.add(buffer)).next()).value.cid
```

Perhaps there is a way to do this using the DAG api? I've been unable to add the same buffer and get the same CID using `ipfs.dag.put`.
I have encountered obstacles when running this test: js-ipfs/packages/ipfs-http-client/test/dag.spec.js, lines 24–26 @ 5e0a18a.
And I'm guessing raw buffers as well, judging by what js-ipfs/packages/ipfs-http-client/src/dag/put.js does (lines 39–46 @ 5e0a18a).
This is problematic, as the main thread needs to pass the dagNode somehow to the worker thread (ideally without having to copy things). Therefore some generic way to serialize nodes seems necessary. It is also worth taking this moment to consider the following:
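As a hedged illustration of what "generic serialization" could mean here, the sketch below walks a dagNode and tags CID instances so the structure survives structured cloning; the `$cid` tag and the shape of the walk are assumptions for illustration, not an existing wire format:

```ts
import CID from 'cids'

// Replace CID instances with tagged bytes so the value is structured-clone
// safe; the receiving side performs the symmetric revival.
function encode(value: unknown): unknown {
  if (CID.isCID(value)) return { $cid: (value as CID).buffer }
  if (value instanceof Uint8Array) return value
  if (Array.isArray(value)) return value.map(encode)
  if (value && typeof value === 'object') {
    const out: Record<string, unknown> = {}
    for (const [key, entry] of Object.entries(value)) out[key] = encode(entry)
    return out
  }
  return value
}

function decode(value: unknown): unknown {
  if (value && typeof value === 'object' && '$cid' in (value as object)) {
    return new CID((value as { $cid: Uint8Array }).$cid)
  }
  // The remaining cases mirror encode(): walk arrays and objects recursively.
  return value
}
```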
I believe the IPLD team is working on something along the lines of 3: https://github.com/multiformats/js-multiformats
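For illustration, a user-provided codec in such a stack might have roughly this shape (names are assumptions inspired by the js-multiformats direction, not a confirmed API):

```ts
// Hypothetical codec shape: enough for the worker to encode / decode blocks
// for a format it doesn't know natively (e.g. signed or encrypted payloads).
interface Codec {
  name: string                        // e.g. 'dag-jose'
  code: number                        // multicodec code for the format
  encode(node: unknown): Uint8Array   // node -> block bytes
  decode(bytes: Uint8Array): unknown  // block bytes -> node
}
```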
@haadcode I would love to get some input from you as well. I am also happy to jump on a call to walk through it more closely.
Had a discussion with @achingbrain @lidel @mikeal @autonome today, where I proposed to loosen up the backwards-compatibility constraint for this work in favor of:
The hope is that this would provide the following benefits:
We have discussed the idea of using js-ipfs-lite as a baseline. But collectively we've decided that I would instead propose a specific API (inspired by ipfs-lite) which we can discuss further to reach consensus (feedback from the community is welcome).
```ts
import Block from '@ipld/block'
import CID from 'cids'

// This in a nutshell is `Promise<A>`, but with an `X` type parameter added to
// communicate errors that may occur. Feel free to ignore the interface
// definition itself.
interface Future<X, A> {
  // Maps the resolved value
  then<B>(onfulfilled: (value: A) => B | Future<X, B>, onrejected: (error: X) => B | Future<X, B>): Future<X, B>
  // Maps the error value
  then<Y>(onfulfilled: (value: A) => Future<Y, A>, onrejected: (error: X) => Future<Y, A>): Future<Y, A>
  // Maps both error & value
  then<Y, B>(onfulfilled: (value: A) => B | Future<Y, B>, onrejected: (error: X) => B | Future<Y, B>): Future<Y, B>
  catch<B>(onrejected?: (error: X) => B | Future<never, B>): Future<never, B>
}

interface IPFSService extends CoreService {
  dag: DAGService
  blob: BlobService
}

interface DAGService {
  // Note: path isn't optional; higher level APIs wrap this to deal with
  // param skipping etc. Get returns a `Block`; if the block can't be (fully)
  // resolved the result is a `ResolveFailure`, which will contain the
  // `remainderPath` and the `Block` that was resolved during this get.
  get(cid: CID, path: string, options?: GetOptions): Future<GetError, Block>
  // Note: Block knows about format / hashAlg, so those options are gone.
  put(block: Block, options?: PutOptions): Future<PutError, CID>
  // Note: path isn't optional; higher level APIs can deal with that.
  // Instead of an async iterator you can request ranges so that:
  // 1. The amount of coordination between client & server can be adjusted.
  // 2. Dangling references across client / server boundaries are avoided.
  tree(cid: CID, path: string, options?: EnumerateOptions): Future<TreeError, string[]>
}

interface BlobService {
  // Note: This is instead of `ipfs.add`; whether it should be called `add`
  // instead is up for debate. However it is deliberately proposed with a
  // different name because:
  // 1. It just produces a CID; unwrapping an async iterator to get it is an
  //    inconvenience that has been pointed out.
  // 2. It is a personal preference to have a function that always works, as
  //    opposed to one that fails due to unsupported inputs most of the time.
  // 3. In the browser this is almost always what you want to do, as opposed
  //    to streaming inputs.
  put(blob: Blob, options?: PutBlobOptions): Future<PutBlobError, CID>
  // Note: This is in place of `ipfs.cat`; whether it should be called `cat`
  // instead is up for debate. It is deliberately proposed with a different
  // name because it returns a `Blob` instead of `AsyncIterable<Buffer>`. That
  // is because blobs are well supported in browsers, can be read in different
  // ways, and avoid server/client coordination. Furthermore a higher level
  // API providing API compat could be added on top, which would create
  // `AsyncIterable`s all on the client and perform range reads with this
  // API under the hood.
  get(ipfsPath: string, options?: GetBlobOptions): Future<GetBlobError, Blob>
}

interface AbortOptions {
  timeout?: number
  signal?: AbortSignal
}

interface GetOptions extends AbortOptions {
  localResolve?: boolean
}

interface PutOptions extends AbortOptions {
  pin?: boolean
}

interface EnumerateOptions extends AbortOptions {
  recursive?: boolean
  // Will skip `offset` entries if provided
  offset?: number
  // Will include at most `limit` number of entries
  limit?: number
}

interface PutBlobOptions extends AbortOptions {
  chunker?: string,
  cidVersion?: 0 | 1,
  enableShardingExperiment?: boolean,
  hashAlg?: string,
  onlyHash?: boolean,
  pin?: boolean,
  rawLeaves?: boolean,
  shardSplitThreshold?: number,
  trickle?: boolean,
  wrapWithDirectory?: boolean
}

interface GetBlobOptions extends AbortOptions {
  offset?: number
  length?: number
}

type PutBlobError =
  | WriteError

type GetBlobError =
  | NotFound
  | ReadError

type GetError =
  | ResolveFailure
  | NotFound
  | DecodeError
  | ReadError
  | AbortError
  | TimeoutError

type PutError =
  | EncodeError
  | WriteError
  | AbortError
  | TimeoutError

type EnumerationError =
  | DecodeError
  | ResolveFailure
  | NotFound
  | ReadError
  | AbortError
  | TimeoutError

interface TreeError extends Error {
  error: EnumerationError
  // Entries that were successfully enumerated before encountering the error
  value: string[]
}

interface ResolveFailure extends Error {
  remainderPath: string
  value: Block
}

interface NotFound extends Error {}
interface NoSuchPath extends Error {}
interface EncodeError extends Error {}
interface DecodeError extends Error {}
interface WriteError extends Error {}
interface ReadError extends Error {}
interface PartialFound extends Error {}
interface AbortError extends Error {}
interface TimeoutError extends Error {}
```
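For a feel of the proposed surface, a hedged usage sketch (`connect()` is a hypothetical way of obtaining an `IPFSService`; everything else follows the interfaces above):

```ts
const service: IPFSService = connect() // hypothetical constructor

// Store a blob, then read a byte range of it back, per BlobService above.
const cid = await service.blob.put(new Blob(['hello world'], { type: 'text/plain' }))
const blob = await service.blob.get(`/ipfs/${cid}`, { offset: 0, length: 5 })
console.log(await blob.text()) // -> 'hello'
```

Note that because `Future` is thenable, `await` works on it just as it does on a `Promise`.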
A few extra notes I'd like to make:
I want to elaborate a bit more on my "transfer vs copy" point. Consider the following example that assumes the current API:

```js
const dagNode = {
  version: 1,
  message: new TextEncoder().encode("Hello World")
}
const cid = await ipfs.dag.put(dagNode)
```

As things stand today, the following assertion will hold:

```js
new TextDecoder().decode(dagNode.message) == "Hello World"
```

However, if things are moved across a message channel to the worker and the underlying buffer is transferred rather than copied, that is not going to be the case:

```js
dagNode.message.byteLength === 0
```

This is an obviously simplified example. If you imagine that the same buffers are used elsewhere in the program, detaching them like this can cause subtle breakage far away from the `dag.put` call. That is why I think it is important to have an API that:
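The difference is easiest to see with plain `postMessage`; this short sketch contrasts the structured-clone copy with an explicit transfer:

```ts
const { port1 } = new MessageChannel()
const bytes = new TextEncoder().encode('Hello World')

// Copy (structured clone): `bytes` remains usable on this thread.
port1.postMessage({ bytes })

// Transfer: zero-copy hand-off, but the buffer is detached here,
// so `bytes.byteLength` is now 0 on the sending side.
port1.postMessage({ bytes }, [bytes.buffer])
```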
@carsonfarmer @oed it would be good to get your eyes on #3022 (comment) as well.
To me #3022 (comment) makes sense as an interface to ipfs data, @Gozala.
Hey @oed, thanks for taking the time. I should have provided more context; there is some in #3022 (comment). I will try to address all of the questions below.
Not exactly. The new IPLD stack changes things a bit, and that imported `Block` comes from it. I do however refer to another abstraction layer in the comments. That is mostly to say that we can provide API compatibility with the current / future JS-IPFS API by wrapping this one.
There are more details on the current plan in #3022 (comment), which addresses specifically this.
In phase 1 you're operating an IPFS node, so you're free to do anything. I am also happy to assist in trying to work out the details of how to provide some convenience in terms of wiring things up across the worker / main thread.
There is also a `pin` option on the proposed `dag.put`. If you are referring to the ipfs.pin.* APIs, those did not make the list because it is not something your team or the Textile team has mentioned as being used. If it is something that 3box depends on, I'll make sure to add it.
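Per the proposed `PutOptions` above, that would look like this (hedged; `service` and `block` as in the earlier sketches):

```ts
// Pin at write time via the proposed dag.put option.
const cid = await service.dag.put(block, { pin: true })
```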
I am not sure I understand what you are referring to as the blob API, could you please elaborate?
Thanks for clarifying @Gozala, this being a new ipfs interface makes sense.
This is clear, so developers using this new interface are expected to import `@ipld/block` themselves.
Sorry about that, it slipped my mind. Pinning is definitely something we need. At various points it might be useful to pin objects that you already have locally.
I'm basically wondering why you can't use
I was imagining that
I'll make sure to incorporate it!
You could technically do it with
It would need to be added to the exports for core and the http client.

Nb. exposing dependencies like this ties our hands around versioning, as the module's exposed API essentially becomes our API: when they break, we break, even if the only place they are exposed is via these exports. It also complicates the TS effort, as these modules start needing type definitions where they don't already have them. If pressed, I'd rather not export anything like this for that reason, though I can see the utility.
@oed I'm curious as to why you are pinning; you say further up the thread that you aren't using GC, and pinning is really only there to stop blocks from being garbage collected.
Maybe do this instead:

```js
const { cid } = await last(ipfs.add(buffer))
```

You can use it-last or something from streaming-iterables to make working with async iterators less verbose.

My feeling on this API is that we've always tried to ensure that the API between ipfs-core and the ipfs-http-api remains the same (within reason). That lets the user switch between ipfs implementations largely by changing their setup only; they should not need to make structural changes to their application code. It also means that we can service both implementations with the same test suite, the same documentation, the same tutorials, and that having a canonical way of performing certain actions is possible, which helps immeasurably when responding to user questions here, on discuss, IRC & stack overflow. If we do things like introduce a different API for this client, we lose that.

I think for an initial release it would be acceptable to say that not all methods or argument types are accepted, but the API that is supported should have the same methods and argument types/names/positions. So for example keep the existing method names and shapes, even if only a subset of inputs is supported at first.

We ship that, perhaps add some notes to the docs about what is supported where, and people can start to experiment with it so we can get some feedback on our ideas when they use actual working code. Maybe the performance is bad in the first release; I think that's ok, but I don't think it's ok to have an incompatible API for the message port client, because it increases the cognitive load on the user and the maintenance/support/documentation load on the maintainers - we want this to be a first class implementation and not an experimental client API.
When GC was introduced in js-ipfs it was disabled by default. If this setting has been changed, it has not been properly communicated. Please let me know what the default is; this could definitely be a problem if it has changed!
Makes sense 👍
The only way GC runs is if it is manually triggered with `ipfs.repo.gc()`. There are no plans to run this on a timer or similar right now, but if that changes you can bet we'll shout about it from the rooftops first!
Thanks for clarifying @achingbrain 🙏
There are undoubtedly a lot of benefits to doing it that way, and I would wholeheartedly agree in a broader context. The context here is, however, crucial. Going with the same interface comes with tradeoffs: you can never really change / improve things without disrupting the whole community. In this context we are working in tandem with the group, so we can validate all our assumptions and also use this as an opportunity to:
As I tried to capture excessively in the interface comments, having a simple bare-bones API does not necessarily mean API incompatibility with the IPFS ecosystem. We could, and probably should, top it with a layer that provides API interop. What it merely provides is a choice to the community: ditch all that complexity, or stick with the convenience of the existing API. We also have a 👍 from the most prominent would-be users of this work, who are interested in the former choice. And with this context in mind, I would argue that there is a benefit in trying things with willing members of the community and using that as an opportunity to inform the work on js-ipfs proper.
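To make the interop-layer idea concrete, here is a hedged sketch (names illustrative) of the familiar `ipfs.add` shape implemented over the simpler `blob.put` from the proposal:

```ts
// Wrap the minimal service in an ipfs.add-compatible async-iterable shape.
function compat(service: IPFSService) {
  return {
    async * add(blob: Blob) {
      const cid = await service.blob.put(blob)
      yield { cid, path: cid.toString(), size: blob.size }
    }
  }
}

// Existing code written against ipfs.add keeps working:
// for await (const { cid } of compat(service).add(someBlob)) { ... }
```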
I think the proposed API does that to some degree, which is also why I called out that the API is going to be incompatible; masking it under a familiar API is just deceiving. It creates a false assumption that things are compatible, in practice leading to errors. Worse yet, some of those errors could hide under code paths that surface issues only in production. Sure, we can document the differences and assume that users will notice, or we could just give things different names and be certain that users will notice.
I would like to offer one extra argument in favor of simplification here. I think this can also help us in the "IPFS in browsers natively" endeavor. I do not believe there is any chance the API as-is could make it into browsers. I think this is also an opportunity to consider what such an API could be, and even exercise it in close collaboration. In other words, it informs us along that axis as well.
As a likely consumer of the proposed work/APIs here, I'd just like to voice an opinion in favor of simplification over compatibility. The IPFS API as it is now is vast and includes APIs that are needed when building and testing IPFS, but are less likely to be used by apps that are simply using IPFS as a library. (Certainly I'm biased here... we only use a small subset of the API, and I don't have much experience with other projects beyond the few mentioned previously in this thread... so take that comment with a grain of salt.) An extensible compatibility layer sounds appealing to me, not least insofar as it helps keep the surface area of the core APIs small and robust.
I think what I've said is a simplification of the existing API. That is, don't implement it all, just implement the top-level files API (e.g. `ipfs.add`, `ipfs.cat`). Introducing a new, differently-named API is what I've argued against above.

The API can be improved, redundancy can be removed, and it can be made leaner, more intuitive and more straightforward, and we should talk about this, but I feel it's outside the scope of this piece of work.
I would like to share some updates on this work:
On the API discussion:
Using just the `dag` API would likely work for us. On the libp2p side our long term needs are still a bit unclear, and it likely depends on how performant the message-port approach turns out to be.
This pull request adds 3 (sub)packages:

1. `ipfs-message-port-client` - Provides an API to an IPFS node over the [message channel][MessageChannel].
2. `ipfs-message-port-server` - Provides an IPFS node over a [message channel][MessageChannel].
3. `ipfs-message-port-protocol` - Shared code between client / server, mostly related to wire protocol encoding / decoding.

Fixes #3022

[MessageChannel]: https://developer.mozilla.org/en-US/docs/Web/API/MessageChannel

Co-authored-by: Marcin Rataj <lidel@lidel.org>
Co-authored-by: Alex Potsides <alex@achingbrain.net>
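Based on the package READMEs from around this time, wiring the two ends up looks roughly like this (treat names and import paths as indicative; check the packages for the exact API):

```ts
// worker.ts - runs inside the SharedWorker and hosts the single shared node
import IPFS from 'ipfs'
import { IPFSService, Server } from 'ipfs-message-port-server'

const main = async () => {
  const ipfs = await IPFS.create()
  const server = new Server(new IPFSService(ipfs))
  // Every connecting tab hands us a MessagePort to serve.
  ;(self as any).onconnect = (event: MessageEvent) => server.connect(event.ports[0])
}
main()

// app.ts - runs in each tab
import { IPFSClient } from 'ipfs-message-port-client'

const worker = new SharedWorker('worker.js')
const ipfs = IPFSClient.from(worker.port)
```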
When a site with origin e.g. `foo.com` uses JS-IPFS, a new node is bootstrapped for every browsing context (tab, iframe). This implies:

There is an opportunity to improve this by offloading most of the JS-IPFS work into a `SharedWorker` in browsers where that API is available, falling back to a dedicated `Worker` elsewhere (basically Safari). A `ServiceWorker` could serve as a better fallback mechanism in the future. Unlike dedicated workers they can be shared, but they also come with enough challenges to justify keeping them out of scope initially.

However, use of `SharedWorker` implies some trade-offs:

**Worker script distribution**

The worker needs to be instantiated with a separate JS script, which leaves us with the following options:

**Lack of WebRTC in workers**

This is a major problem, as WebRTC is largely replacing the WebSocket transport, and it requires more research. A pragmatic approach could be to tunnel WebRTC traffic into the worker; however, that would still imply reestablishing existing connections, which would be a major drawback.
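A hedged sketch of the SharedWorker-with-fallback approach described above (the script URL is illustrative):

```ts
// Prefer SharedWorker where available; fall back to a dedicated Worker
// (e.g. Safari). Both returned objects expose the same postMessage interface.
function startIPFSWorker(url: string): MessagePort | Worker {
  if (typeof SharedWorker === 'function') {
    const shared = new SharedWorker(url)
    shared.port.start()
    return shared.port
  }
  return new Worker(url)
}

const port = startIPFSWorker('ipfs-worker.js')
```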