2018 Q4 OKR Planning #5474 (closed)
50 changes: 50 additions & 0 deletions OKR.md
@@ -0,0 +1,50 @@
# Quarterly Objectives and Key Results

We try to frame our ongoing work using a process based on quarterly Objectives and Key Results (OKRs). Objectives reflect outcomes that are challenging, but realistic. Results are tangible and measurable.

## 2018 Q4

**go-ipfs handles large datasets (1TB++) without breaking a sweat**
- `PX` - It takes less than 48 hours to transfer a 1TB dataset over Fast Ethernet (100Mbps)
- `PX` - It takes less than 12 hours to transfer a 200GB sharded dataset over Fast Ethernet (100Mbps)
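As a rough sanity check on these targets (my own back-of-the-envelope arithmetic, not from the discussion): Fast Ethernet moves at most 12.5 MB/s, so a 1TB transfer needs about 22 hours even at line rate, meaning the 48-hour KR budgets roughly 2x for protocol overhead:

```go
package main

import "fmt"

func main() {
	const linkBitsPerSec = 100e6 // Fast Ethernet line rate
	const datasetBytes = 1e12    // 1 TB (decimal)

	// Minimum time assuming the link stays saturated the whole way.
	seconds := datasetBytes * 8 / linkBitsPerSec
	fmt.Printf("minimum transfer time: %.1f hours\n", seconds/3600)
	// Roughly 22.2 hours at line rate, so the 48-hour target
	// leaves about a 2x budget for chatter and duplicate blocks.
}
```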
**Contributor:**

Love these as KRs, but we should separately track the work we will do to achieve these key results. Is this faster bitswap?

**Member Author:**

I believe that smarter bitswap is part of the way there, but there will be other parts.

**Contributor:**

Steven says the second metric isn't testing something useful - he proposes:

  • List a sharded directory with 1M entries in 10 mins

**Member:**

How about:

  • List a sharded directory with 1M entries over LAN in under 1 minute, with less than a second to the first entry.

**Contributor:**

(replacement for the second KR - keep the first)

- `PX` - There is a prototype implementation of GraphSync for go-ipfs
**Contributor:**

somewhat dependent on js-ipfs graphsync

**Contributor:**

@hannahhoward already thinking about speeding up directories and may have knowledge of how to implement (but needs a partner for spec work)

- `PX` - There is a better and more performant datastore module (e.g. Badger or better)

- `P1` - Rewrite pinning data structures to support large data sets / many files performantly
**Contributor:**

I would explicitly talk about adding "alternative pinning strategies". go-ipfs can already pin very large datasets performantly in terms of the pin data structure (the problem is perhaps verifying, traversing, and safely gc-ing after unpinning them). Cluster's needs are much more specific and are about adding one extra pinning mode. The wider discussion about the pinning system goes beyond the data structures too, and affects multiple components, so a KR about that would need different wording as well.

**Contributor:**

@Stebalien could you comment about what might be a better phrasing?

**Member:**

> (the problem is perhaps verifying and traversing and safely gc-ing after unpinning them)

Unfortunately, the data structure doesn't support many pins. See: #5221

However, I agree. Part of this goal is to make the pinning data structures flexible.

**Contributor:**

Just FYI: Personally, I think we should switch to using inode-like structures (with small blocks inlined in the inode itself) in the Blockstore. This will be a major infrastructure change, but I think it will make a lot of things a lot easier down the road, including pinning. I created a placeholder issue for this at #5528; I will flesh out the details later today or tomorrow.

**Contributor:**

> Unfortunately, the datastructure doesn't support many pins

Agreed; I was thinking of large datasets as a thing with a single root CID, not of the pinset as a whole. But yes, the pinset itself is expensive to edit when it's very large. 👍

**Member:**

One simple way of solving this would be transitioning the pinset into the datastore (backed by a DB) directly, instead of using IPFS objects for it.
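One way to read this suggestion (a hypothetical sketch with invented names, not the actual go-ipfs pinner API): store each pin as its own key in a flat key-value datastore, so adding or removing one pin touches a single entry instead of rewriting a merkle object holding the whole set, and the value can carry a per-pin mode, which also leaves room for the flexible pin modes cluster wants.

```go
package main

import (
	"fmt"
	"sync"
)

// pinStore sketches a datastore-backed pin set: one key per pin.
// A single add/remove is O(1) on the store rather than a rewrite
// of an object-based pinset. (Hypothetical; not the go-ipfs pinner.)
type pinStore struct {
	mu sync.Mutex
	kv map[string]string // key "/pins/<cid>" -> pin mode
}

func newPinStore() *pinStore {
	return &pinStore{kv: map[string]string{}}
}

func (p *pinStore) Pin(cid, mode string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// mode could be "recursive", "direct", or some partial-DAG mode.
	p.kv["/pins/"+cid] = mode
}

func (p *pinStore) Unpin(cid string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.kv, "/pins/"+cid)
}

func (p *pinStore) IsPinned(cid string) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	mode, ok := p.kv["/pins/"+cid]
	return mode, ok
}

func main() {
	ps := newPinStore()
	ps.Pin("QmFoo", "recursive")
	mode, ok := ps.IsPinned("QmFoo")
	fmt.Println(mode, ok)
	// recursive true
}
```

In a real implementation the map would be a persistent datastore (e.g. something Badger-backed), but the key-per-pin shape is the point here.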

**Contributor:**

cluster wants the ability to pin partial DAGs - OK with a phrasing like:
"rewrite pinning data structures to support large pin sets and flexible pin modes"

**Contributor:**

@kevina is interested in contributing to this - not sure about owning per se yet =]

**Member:**

I agree with @Kubuxu on using an actual database but, barring that, we could use the new go-ipld-hamt structure (now that we've finally merged the refmt CBOR IPLD branch).


**The bandwidth usage is reduced significantly and is well kept under control**
**Contributor:**

I like the idea of contributing to this objective, but I'm probably not the best person to own it.

**Contributor:**

@Stebalien who do you suspect should own these OKRs? In the absence of a more qualified person (and in an effort to move important goals forward), I'm willing to take on some or all ownership. If there's a better owner, then I can work with them towards this objective.

- `PX` - Users can opt out of providing every IPLD node (and only provide root hashes)
**Contributor:**

Includes modifications to the DHT. Needs an owner to coordinate with @Stebalien on design. @magik6k started on this, but the design and spec need to be fleshed out.

Need to rephrase this to aim for a spec and an early implementation (0.5 is a spec with no implementation started).

- `PX` - "Bitswap improvements around reducing chattiness, decreasing bandwidth usage (fewer dupe blocks), and increasing throughput"
**Contributor:**

Measure: look at the number of duplicate blocks, and the number of parties we send a want list to when we don't need to.
Goal: we don't want to upload as much as we download in order to find content, and we don't want to download as many duplicate blocks.

We can make larger improvements by changing the protocol, but the current proposal is internal to the feature: improve where we look for data.
"reduce the number of duplicate blocks by 75%"
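The duplicate-block measure could start as something as simple as a seen-set over received blocks (a hypothetical sketch; real instrumentation would hook into bitswap's receive path and key on CIDs):

```go
package main

import "fmt"

// dupCounter tallies how many received blocks had already been seen,
// approximating the "duplicate blocks" metric discussed above.
// (Hypothetical sketch; not actual bitswap instrumentation.)
type dupCounter struct {
	seen       map[string]bool
	total, dup int
}

func newDupCounter() *dupCounter {
	return &dupCounter{seen: map[string]bool{}}
}

func (d *dupCounter) record(cid string) {
	d.total++
	if d.seen[cid] {
		d.dup++ // we already had this block: wasted bandwidth
	}
	d.seen[cid] = true
}

func (d *dupCounter) dupRatio() float64 {
	if d.total == 0 {
		return 0
	}
	return float64(d.dup) / float64(d.total)
}

func main() {
	c := newDupCounter()
	for _, cid := range []string{"QmA", "QmB", "QmA", "QmC", "QmA"} {
		c.record(cid)
	}
	fmt.Printf("%d/%d blocks were duplicates (%.0f%%)\n",
		c.dup, c.total, 100*c.dupRatio())
	// 2/5 blocks were duplicates (40%)
}
```

Tracking that ratio before and after a change is enough to check a "reduce duplicate blocks by 75%" style KR.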


**It is a joy to use go-ipfs programmatically**
- `PX` @magik6k - The Core API is finalized and released. Make it easier to import go-ipfs as a package
**Contributor:**

what is a good priority for this?

- `PX` - go-ipfs-api exposes the new Core API
**Contributor:**

@diasdavid did you intend this to also include the remote API protocol? Because I'm not sure it's covered in this document.

**Member:**

My plan for this was to start with the HTTP API implementation and, when the new RPC API becomes a thing, support both, as some users will likely need HTTP in some setups.

**Contributor:**

Do you have a proposal for an updated KR that effectively captures that work? Taking PRs! =]

**Contributor:**

@keks - Do you think the RPC API should be included (even partially) in this document? I don't know anything about that work but is it something that you think should be started in this next quarter?

**Contributor:**

@magik6k is the owner for this whole objective

making go-ipfs-api expose the Core API is lower priority (blocked by a lot of go-ipfs changes)

**Contributor:**

To clarify, I was basically talking about this issue. We'll need to coordinate with the other implementors to find out what we really need, but the issue is that the current HTTP API has a lot of weird edge cases, which is painful for everyone who implements it.
Again, I'm not talking about the RPC methods we expose for others to call (these should mostly be the core api), but about the wire format used to make these calls. If we get that one right, a lot of stuff is going to be easier.

- `PX` - go-ipfs Daemon, Gateway, and cmds library use the new Core API
- `PX` - The legacy non Core API is deprecated and the diagram on go-ipfs README is updated

**go-ipfs becomes a well maintained project**
- `PX` @eingenito - The Go Contributing Guidelines are updated to set expectations for Core Devs and give instructions on how to be an effective contributor (e.g. include PR templates)
- `PX` @eingenito - A Lead Maintainer Protocol equivalent is proposed, reviewed by the team, merged and implemented
- `PX` @eingenito - Every issue on https://waffle.io/ipfs/go-ipfs gets triaged (reviewed and labeled following https://github.com/ipfs/pm/blob/master/GOLANG_CORE_DEV_MGMT.md)
- `PX` - Every package has tests and tests+code coverage are running on Jenkins
- `PX` - There is an up-to-date Architecture Diagram of the Go implementation of IPFS that links packages to subsystems to workflows

**gx becomes a beloved tool by the Go Core Contributors**
**Contributor:**

Who has made major changes to gx in the past? Is it only @whyrusleeping? Are these actually appropriate for someone else to pick up?

**Member:**

@travisperson will probably take these (and @schomatis has been doing some gx work). We had a long discussion here whyrusleeping/gx#179, culminating in an offline discussion between @whyrusleeping, @travisperson and me (key points here: whyrusleeping/gx#179 (comment)).

**Contributor:**

Thanks. @travisperson are you comfortable with my adding your name to these OKRs? Are they pretty accurate as stated, and might they belong in another project?

**Member:**

@eingenito ya, these look good.

- `PX` -
- `PX` -
**Member Author:**

@Stebalien I know there are a lot of new goals for GX but I really can't tell what these are. Could you describe the ones planned for Q4 as KRs?

**Contributor:**

@warpfork mentioned this in his retrospective comment - do you know if there's a tracking issue?

**Contributor:**

The thing I remember hearing is semver-based heuristics for flattening dependencies, so you get a minimal set of imports of any given package. In his retro @warpfork mentioned changes that keep gx from doing so much import rewriting; I'm not sure how that would work, but it sounds neat.

**Contributor:**

Steven suggested these KRs for gx:

  • You can update a minor version of a transitive dependency without updating intermediate dependencies
  • Imports are not rewritten in go-ipfs

**Contributor:**

@whyrusleeping who had thoughts on this

**Contributor:**

ping to both @whyrusleeping and @Stebalien to iterate on gx KRs that would measure success of this tool being productive/efficient

**Contributor:**

KR: update a transitive dependency without touching intermediate dependencies
KR: go-ipfs doesn't have checked-in gx paths


**Complete outstanding endeavours and still high priorities from Q3**
- `P0` @kevina - base32 is supported and enabled by default
- `PX` - go-ipfs gets a unixfsV2 prototype
**Contributor:**

before a prototype we should probably have a spec we agree on =]

**Contributor:**

Mikeal wants to hand off spec work to someone else

**Contributor:**

@mikeal is working on this...

**Comment:**

To be more specific, I'm working on the spec, and I can easily hand off parts of the spec once the skeleton lands (should be done next week).

We should be able to start implementing once the (skeleton + data) PRs land, as we don't need special file types for MFS (as far as I know). I'll prioritize getting those landed so that we don't block early implementation work.

I'm not that familiar with the Go side of things, but I believe @warpfork and @Stebalien mentioned that the IPLD APIs on the Go side need to be revised, and I'm not sure if that work is also going to block this implementation. If the goal of the unixfs-v2 implementation is to be somewhat codec-agnostic, then it probably will. Review and revisions to those APIs are in the IPLD Q4 OKRs.

- `PX` @djdv - IPFS Mount
**Member Author:**

@djdv what would be the quantifiable KR for IPFS Mount?

**Contributor:**

How does this sound?

Add mutable methods to the new mount implementation and get it building+tested on all of our supported platforms

To recap the goal as it is now, we just want to have a new implementation of mount that takes advantage of all of the other new work that has been done (updated commands lib, CoreAPI, etc.), as well as add functionality such as a read+write MFS mountpoint. And we want to make sure this is actually true for all platforms we support (not just some of them).
More work will have to be done in our foundation itself (Golang) this quarter. Specifically, we need to assess the problems around dynamically linking to our fuse library on *nix, as we've done with Windows this quarter.

**Contributor:**

@djdv - r+w mount across platforms seems like a lot for a quarter, but I don't know much about it or its current state. Is that an appropriately sized quarterly effort? Or is there something you're focusing on first that makes more sense for this goal?

**Contributor:**

@eingenito
The big task is actually the foundation. The libraries we're using (cgo-fuse, CoreAPI) make this relatively straightforward to implement, but we currently don't have guarantees that it will actually build or run until we fix up issues in Golang. Those are more complicated.

I've got an issue to track the status here:
#5003

All info below is supplementary, feel free to disregard it.
tl;dr this can be simple and slow or complicated and fast
I'm trying to target the latter with relative success so far.


Despite the implementation being a straightforward task, getting it right is where the complications come in. The example I've been using is an HTTP server.
This is an exercise students can implement in an hour, but handling a large number of requests efficiently and accounting for all the edge cases can take a lot of effort - examples being projects like NGINX, Apache, et al.
The initial read-only port took very little time to write but was very slow.

So the bulk of the work in mount itself is focused on reducing the need to make duplicate requests, handling as much as we can concurrently, and caching aggressively (since we have a lot of dedupe potential).
We don't want to just forward every request to the node (and potentially the network), since we're dealing with an environment that inherently makes a lot of simultaneous and duplicate requests (the operating system and all the processes running on it). And we really don't want to lock the whole filesystem on each request either.

My current focus is on finalizing the mount implementation first and then getting it building dynamically on other platforms. We likely won't have platform specific problems in our codebase, that's all underneath us. So we'll have a time where the implementation is finished and ready for review, but there's still work to be done in the build area.

As it is right now, we'll always be able to run on other platforms, but we currently need a C compiler at build time. It's possible that we can eliminate this and dynamically link with fuse at runtime, which seems preferable. You'll always have the option to build both though.

A lot of this is already thought out, but the actual implementation of it needs to be finished and refined.
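The caching idea above, in miniature (a hypothetical sketch with invented names; real mount code would also need invalidation for mutable paths): memoize lookups so the flood of identical requests from the OS is answered from memory instead of re-querying the node.

```go
package main

import (
	"fmt"
	"sync"
)

// fsCache memoizes node lookups so repeated filesystem requests
// (the OS and every process tend to hammer the same paths) are
// served from memory instead of re-querying the IPFS node.
// Holding the lock across the fetch also collapses concurrent
// duplicate requests into a single backend call.
type fsCache struct {
	mu      sync.Mutex
	entries map[string]string
	fetches int // backend calls actually made
}

func (c *fsCache) get(path string, fetch func(string) string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[path]; ok {
		return v // cache hit: no trip to the node
	}
	if c.entries == nil {
		c.entries = map[string]string{}
	}
	c.fetches++
	v := fetch(path)
	c.entries[path] = v
	return v
}

func main() {
	c := &fsCache{}
	// Stand-in for a query against the IPFS node.
	fetch := func(p string) string { return "node:" + p }
	for i := 0; i < 3; i++ {
		c.get("/ipfs/QmFoo/readme", fetch)
	}
	fmt.Println("backend fetches:", c.fetches)
	// 1 backend fetch for 3 identical requests.
}
```

A real implementation would lock per key (or use a singleflight-style group) so unrelated paths don't serialize behind one slow fetch, and would bound and invalidate the cache; this only illustrates the dedupe-and-cache shape, not the discussed design.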


## 2018 Q3

Find the **go-ipfs OKRs** for 2018 Q3 at the [2018 Q3 IPFS OKRs Spreadsheet](https://docs.google.com/spreadsheets/d/19vjigg4locq4fO6JXyobS2yTx-k-fSzlFM5ngZDPDbQ/edit#gid=274358435)

## 2018 Q2

Find the **go-ipfs OKRs** for 2018 Q2 at the [2018 Q2 IPFS OKRs Spreadsheet](https://docs.google.com/spreadsheets/d/1xIhKROxFlsY9M9on37D5rkbSsm4YtjRQvG2unHScApA/edit#gid=274358435)

## 2018 Q1

Find the **go-ipfs OKRs** for 2018 Q1 at the [2018 Q1 IPFS OKRs Spreadsheet](https://docs.google.com/spreadsheets/u/1/d/1clB-W489rJpbOEs2Q7Q2Jf1WMXHQxXgccBcUJS9QTiI/edit#gid=2079514081)