Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Should we dial the modularization a bit down? Do we all see the value on having everything as a module? #2222

Closed
2 of 4 tasks
daviddias opened this issue Jul 5, 2019 · 27 comments
Assignees

Comments

@daviddias
Copy link
Member

daviddias commented Jul 5, 2019

Disclaimer: This proposal is not about going monorepo. Please read the full thread :)

I've been thinking a lot on building quality, maintainability and robustness on software and taking a peek at other languages and ecosystems to drive inspiration. One of the thoughts out of that journey is the simple question: Do we all see the value on having everything as a module?

For context, I'm one of the main culprits to which IPFS is so modularized. In fact, when I joined the project in 2015, it was pretty much a massive go-ipfs repo, with the exception of multiformats. I then took on js-ipfs as a greenfield project and applied the best of what I understood on modularization as a value which I believed to add a lot to a protocol stack that prioritizes interoperability, future proofing and maintainability. Looking backwards, I'm happy we all onboarded the modularization train as it enables IPFS to run on Node.js and the Browser, use a multitude of transports, some that hadn't been invented yet, and overall sets IPFS and libp2p on the track to iterate and adapt as systems evolve.

All of that said, I do think that there are some artifacts of this modularization goal that actually add cost without any gains. This is specially true when we focus on parts of the "engine" that are unique and integral to the system as a whole and non usable elsewhere.

One of the clear examples is js-libp2p and js-libp2p-switch in which we spent countless hours doing sync releases, duplicating code (the top level API is pretty much the same), duplicating documentation efforts, duplicating tests and adding overhead to the developers that had to tinker with both modules (and repos) at the same time. js-libp2p-switch is an integral part to the js-libp2p dialing engine and would be better off living inside js-libp2p itself.

There are other examples and looking at things such as #1670, we can quickly gauge how much it costs to have this over-modularization.

If this resonates with others, I'll go ahead and make a short proposal of where I believe we could head to reduce overhead/admin work:

IPFS side

  • Merge into js-ipfs, the following modules:
    • ipfs-multipart
    • js-ipfs-mfs
    • js-ipfs-repo
    • js-ipfs-block-service
    • js-ipfs-unixfs-exporter
    • js-ipfs-unixfs-importer

libp2p side

  • Merge into js-libp2p, the following modules:
    • js-libp2p-connection-manager
    • js-libp2p-identify
    • js-libp2p-keychain
    • js-libp2p-nat-mngr
    • js-libp2p-pnet
    • js-libp2p-ping
    • js-libp2p-switch
    • (potentially, need to think more) js-libp2p-circuit
  • Merge into libp2p-websocket-star, the following repo:
    • libp2p-websocket-star-rendezvous
  • Coalesce the multiple libp2p interfaces into js-libp2p-interfaces, these include:
    • interface-connection
    • interface-transport
    • interface-stream-muxer
    • interface-peer-discovery
    • interface-peer-routing
    • interface-content-routing

Looking forward to learn how people think about such transition.

@jacobheun
Copy link
Contributor

I'm highly in favor of coalescing the core modules for each of these. Even if we still released them individually with something like lerna so that developers who want fine grained control can get it, testing, development and releasing would be significantly easier. I currently use VSCode workspaces to group the modules like this anyway, it just doesn't have the benefit of actually being combined.

Doing that would also potentially enable us to create and more easily deploy up-to-date, preconfigured versions of js-libp2p, etc with each release, to help users get running more quickly.

(potentially, need to think more) js-libp2p-circuit

Switch already includes it, it's a core module imo. People can always turn it off in config, but being able to ship libp2p and ensure it's already dialable over any circuit relay is nice. I think we can have the module discussion in a separate ticket if this moves forward so we don't veer too far from the intended conversation.

@daviddias
Copy link
Member Author

daviddias commented Jul 8, 2019

Switch already includes it, it's a core module imo. People can always turn it off in config, but being able to ship libp2p and ensure it's already dialable over any circuit relay is nice.

Good to know, then it makes total sense then!

@jacobheun feel empowered to move independently in libp2p land. I believe that libp2p can move lot faster and reduce a ton of overhead by adopting this strategy.

For any of these updates. It would be ideal to merge the gittrees so that contributors are preserved.

@dirkmc
Copy link
Contributor

dirkmc commented Jul 9, 2019

I agree that having a lot of different modules adds to maintenance and release overhead, and it can be hard for a new person trying to dig through the code to understand how it works.

On the other hand I think we can do a better job of harnessing the good will of the community to help us out with bug fixes and improvements, and modularization helps us in a couple of important ways:

  • currently it takes around one hour to run the js-ipfs tests.
    I'm concerned that if we include modules like mfs, repo and the importer / exporter it will take even longer.
  • IPFS is really a grab bag of several different technologies. I think it's simpler for someone new to the project to fix the bug for the particular thing they're working on if they can make a change to one smaller part of the code, and run and test just that part.

Ideally we would have

  • tests in js-ipfs that are generalized and run fast
  • tests in the module itself that are specific and more thorough
  • a way for community members to propose additions/updates to documentation without having to do PRs (and go through the build process)

@daviddias
Copy link
Member Author

Fair points. Here are some thoughts

currently it takes around one hour to run the js-ipfs tests.

This is both a good and a bad sign:

  • It suggests that there are probably a ton of opportunities to optimize how tests get set up and how assertions are done. For example, why make everything sequential and not run each suite in parallel?
  • That we probably should rely on CI more to run the multiple test suites.
  • That we have a ton of tests. One should expect a P2P File System that works both on the browser to be tested thoroughly and that translates for it to take time.

Btw, you can also use mocha to select a subset of the tests with test filters:

Test Filters
  --fgrep, -f   Only run tests containing this string                   [string]
  --grep, -g    Only run tests matching this string or regexp           [string]
  --invert, -i  Inverts --grep and --fgrep matches                     [boolean]

Another thing to consider is to take a peek at the latest developments of Jest -- https://jestjs.io --, it would minimize test time by just running the tests that touch code paths that have been changed.

I think it's simpler for someone new to the project to fix the bug for the particular thing they're working on if they can make a change to one smaller part of the code, and run and test just that part.

It's kind of a double edge sword. In one side it is easy because one has to parse less code, at the same time, one might spend hours hacking on something that misses another part of the codebase that will break with that change (it has happened many times before), then putting even a greater amount of stress on the maintainers to not only review and understand how the change affects the module but also how the change will affect the entire stack.

@hugomrdias
Copy link
Member

hugomrdias commented Jul 9, 2019

Merging some repos would be extremely helpful for developer productivity.

@daviddias
Copy link
Member Author

daviddias commented Jul 9, 2019

This proposal is not about going monorepo, that is at the other end of the spectrum. I recommend not using that word otherwise multiple bikesheds will get painted ad eternum :)

All that this proposal is about is about coalescing some of the modules/repos that are pretty much integral (i.e. not swapable) to the stack.

@hugomrdias
Copy link
Member

Bikeshed averted phew 🧐

@jacobheun
Copy link
Contributor

fwiw, go-libp2p did this recently with their core libp2p modules, https://github.com/libp2p/go-libp2p-core. While I haven't worked in the code base, searching through the code when I've been trying to find certain behavior has become significantly easier.

@JustMaier
Copy link

Super happy to see this discussion happening. As someone that jumped into the community pretty recently, having so many separate micro-repos definitely made digging through how things worked more difficult.

@vasco-santos
Copy link
Member

Totally in favor of getting all mentioned libp2p modules into the main repo, getting all the core pieces for libp2p together, which will be helpful for maintenance, releasing and ease the way for new contributors.

Also heavily in favor of having libp2p-interfaces with all our current interfaces. go-libp2p already went on that way, and I believe this will allow us to present the interaction between each piece of libp2p in a much clear way, improving our docs, at the same time as easing the maintenance cost

@daviddias
Copy link
Member Author

daviddias commented Jul 20, 2019

@hugomrdias
Copy link
Member

hugomrdias commented Jul 20, 2019

Big Yay from me, just one suggestion multipart imo should go into js-ipfs. I can take on that one plus whatever to make this happen asap.

@daviddias
Copy link
Member Author

Ah, ipfs-multipart is indeed used on the http-api implementation as well --https://github.com/ipfs/js-ipfs/blob/master/src/http/api/resources/files-regular.js#L3 --, given that it is also used in the http-client, then it makes sense to leave it as a separate module.

@hugomrdias
Copy link
Member

Actually the http-client doesn't use multipart only js-ipfs

@daviddias
Copy link
Member Author

I see, indeed it seems that js-ipfs-http-client has its own multipart script -- https://github.com/ipfs/js-ipfs-http-client/blob/master/src/utils/multipart.js -- rather than requesting it from the module. I had this all switched in my mind. I guess the proposal to merge ipfs-multipart into js-ipfs makes sense then!

@jacobheun
Copy link
Contributor

I started working on merging the libp2p interfaces into a single repo and realized a pain point we will have with tagging the different modules on release. I noted a potential solution with Aegir for this, libp2p/js-libp2p#383 (comment).

@hugomrdias
Copy link
Member

@jacobheun I think the point is to only have one package not several.
The way you described in the comment is basically a monorepo, I can make aegir work with yarn workspaces and lerna but right now I think we should follow @daviddias proposal and just merge repos and packages.
Ppl can still atomically require each interface.

@jacobheun
Copy link
Contributor

@hugomrdias yes, you're right, sorry too many things in my brain.

For clarity, what I am planning to do with the libp2p interfaces modules is to release them as @libp2p/interfaces.

@daviddias
Copy link
Member Author

@jacobheun #2222 (comment) is the correct way. And given that the interfaces are only required in tests, there will be no bloat added to the module final bundle.

One question. why @libp2p/interfaces? AFAIK, we don't use npm namespaces. So perhaps just libp2p-interfaces?

@jacobheun
Copy link
Contributor

And given that the interfaces are only required in tests, there will be no bloat added to the module final bundle

interface-connection is heavily used outside of tests because it provides the actual connection class. Perhaps the right thing to do as a part of this is to separate that to avoid bundle bloat. We should probably be more strict about keeping test suites separate from any implementations.

One question. why @libp2p/interfaces? AFAIK, we don't use npm namespaces. So perhaps just libp2p-interfaces?

Good question, this could actually introduce annoying problems scoping it, so I think you are right that we should just use libp2p-interfaces.

@hugomrdias
Copy link
Member

We just need to require specific interfaces where needed to avoid bundle problems. Just like we do for lodash and pullstreams.
There's no need to require a global index.js that brings everything in fact we shouldn't even have a index.js that re-exports the interfaces, just one file per interface.

@jacobheun
Copy link
Contributor

Let's move this conversation over to the interface-connection issue, libp2p/js-libp2p#383 (comment). The libp2p core changes and ipfs changes really won't be impacted by this, because their use case is quite different as the modules being pulled in are mainly internal and are typically used as properties of the parent (ipfs/libp2p).

@jacobheun
Copy link
Contributor

If anyone has opinions about how the history/etc is handled for combining core modules together, I started a draft PR that just pulls js-libp2p-switch into js-libp2p if you'd like to peruse/comment.

@daviddias
Copy link
Member Author

Update: js-libp2p has shipped the coalescing of the modules with libp2p/js-libp2p#400 ❤️ 🚀

@achingbrain
Copy link
Member

We put a lot of thought into how we can produce libraries that are useful both in IPFS as dependencies but also useful independently in other libraries or even in something like ipfs-lite.

ipfs-inactive/js-ipfs-unixfs-importer#38 (comment)

This initiative does rather seem to conflict with the above. Whether or not anyone actually does use these modules independently is another matter.

@achingbrain
Copy link
Member

FWIW my feeling is that the modules flagged for merging into js-IPFS are pretty low-churn so merging them won't hurt, except sometimes people say "I'd like to create a UnixFS DAG without requiring all of IPFS", which would then be necessary if we merge them in.

The daily pain for me is not dealing with change in these modules but instead managing the interconnectedness of js-ipfs, interface-js-ipfs-core, js-ipfs-http-client and js-ipfsd-ctl.

@achingbrain
Copy link
Member

Closing as this is very stale, and lots of it got done!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Status: Done
Development

No branches or pull requests

7 participants