IPFS Bandwidth Metrics and Logging enhancements #14
Hi @obo20, thank you for reviewing the IPFS Project Roadmap and highlighting this need! The latest IPFS WebUI shows in part what you can do with the bandwidth stats you get out of IPFS today -- https://github.com/ipfs-shipyard/ipfs-webui#ipfs-web-ui -- using the stat API https://github.com/ipfs/interface-ipfs-core/blob/master/SPEC/STATS.md#statsbw. However, I hear your feedback that it is hard to understand how much of that bandwidth pressure is caused by a specific file or files. For each file, there are 2 types of traffic: Bitswap traffic (fetching the file's blocks and providing them to other peers) and DHT traffic (spreading the file's provider records).

Both of these piggyback on all the work libp2p does to establish connections, run Peer Discovery and so on, which is itself a good-sized chunk of the traffic. We can consider making it part of the toolkit to know how much bandwidth a file has been consuming (both fetching and providing to the network) on the Bitswap side. As for the DHT part, since spreading the provider records happens all together, getting a rigorous number might require a lot of CPU/memory-bound work that would be better spent handling requests. That said, one alternative might be to simply divide the bandwidth used by the DHT across the files being shared in proportion to their size (bigger file, more records, larger slice of the bandwidth pie).

@obo20 how does this sound? @Stebalien, @alanshaw any extra thoughts?
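The proportional split suggested above can be sketched in a few lines. This is an illustrative Go snippet, not go-ipfs code -- `apportionDHT` and its inputs (a total DHT byte count and per-file sizes) are hypothetical names assumed for the example:

```go
package main

import "fmt"

// apportionDHT splits a total DHT bandwidth figure across files in
// proportion to their size, as a cheap stand-in for per-record accounting.
// Illustrative only; this function does not exist in go-ipfs.
func apportionDHT(totalBytes uint64, fileSizes map[string]uint64) map[string]uint64 {
	var sum uint64
	for _, s := range fileSizes {
		sum += s
	}
	out := make(map[string]uint64, len(fileSizes))
	if sum == 0 {
		return out
	}
	for cid, s := range fileSizes {
		// bigger file => larger slice of the bandwidth pie
		out[cid] = totalBytes * s / sum
	}
	return out
}

func main() {
	shares := apportionDHT(1000, map[string]uint64{"QmFileA": 300, "QmFileB": 700})
	fmt.Println(shares["QmFileA"], shares["QmFileB"]) // 300 700
}
```

The obvious caveat, as noted above, is that this is an estimate: it assumes provider-record traffic scales linearly with file size, which is only roughly true.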
We can't really track an absolute "bandwidth consumption" number on a per-file basis, as we'd need to track a separate number for every block we're storing (that's a lot of memory). However, we could probably track active bandwidth (i.e., bandwidth consumed by a file averaged over the last 10s or something).

I'm not sure how to accurately track this without some really invasive modifications.
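One way to get "active bandwidth over roughly the last 10s" without keeping a counter for every historical block is an exponentially decayed rate meter. A minimal sketch, assuming one meter per tracked file; the type and its fields are illustrative, not go-ipfs code:

```go
package main

import (
	"fmt"
	"math"
)

// rateMeter keeps an exponentially decayed byte rate. Old traffic fades
// out on the timescale of tau seconds, so memory stays O(1) per file.
type rateMeter struct {
	rate     float64 // decayed bytes/sec
	lastTime float64 // seconds since some epoch
	tau      float64 // decay constant, ~10s window
}

// add records `bytes` sent at time `now`, decaying the previous rate first.
func (m *rateMeter) add(now, bytes float64) {
	dt := now - m.lastTime
	m.rate = m.rate*math.Exp(-dt/m.tau) + bytes/m.tau
	m.lastTime = now
}

func main() {
	m := &rateMeter{tau: 10}
	// simulate a steady 1000 B/s for 100 seconds
	for t := 0.0; t < 100; t++ {
		m.add(t, 1000)
	}
	fmt.Printf("%.0f\n", m.rate) // roughly 1050: steady state of this discrete update
}
```

The steady-state value overshoots the true rate by a few percent for one-second ticks (the discrete update converges to bytes divided by tau*(1-e^(-dt/tau)) rather than bytes/dt); a real implementation would correct for that or sample continuously.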
I don't imagine the type of functionality we're discussing is something that should be turned on by default. Rather, it would be turned on by nodes that explicitly need that information and are willing to take a slight performance hit to acquire it. We're also fine increasing our storage costs by a percentage to be able to store this information.

@daviddias You mention that the peer discovery / connection handling produces a sizable chunk of traffic. Could you elaborate on what you mean by sizable? I had imagined most of our bandwidth logging would revolve purely around the delivery of content itself, with the connection / peer discovery work just being part of the cost of running a node.

@Stebalien You say that tracking a separate number for every block would require a lot of memory. Could you explain why we'd need to store these values in memory? This type of value seems like it could just be incremented in a database. Also, would it be possible to filter tracking of these statistics so that only root-hash traffic is logged?
I'm mostly worried about the disk IO. However, if it were off by default, that might be reasonable. An alternative is to provide some metrics APIs where a user can subscribe to "events" (e.g., sent block X to Y, size Z). We already kind of have this with structured logging.
I had an interesting conversation with one of our users today that got me thinking of an alternative approach, which I wanted to throw out purely for the sake of discussion. Full disclosure: this idea has not been fully explored for feasibility.

For a while I've thought about the feasibility of segmenting the IPFS data store. By that I mean allowing the data store to have subdirectories for specific buckets of content, so that when I pin content to an IPFS node, I can also provide a "sub-store" name with that content. My initial reasoning for wanting something like this was so that, as a node owner, I could restrict access to certain sub-sections of my data store to specific users (this would also require access control to be figured out for IPFS), but I suppose the base concept of sub-stores could also be used to track bandwidth of data by which sub-store it came from.

In the bandwidth use case, each "sub-store" of the data store would be a user, and I could simply query my node each month to find out what that user's sub-store used in bandwidth. I'd imagine this might take quite a bit of effort to actually accomplish, but who knows, maybe it spurs some other ideas. Let me know if you'd like me to elaborate on this at all.
Out of curiosity, does anybody know if the Filecoin project is concerned about the issue of bandwidth at all? Without a proper way of managing bandwidth usage, Filecoin users who are paying to store low-bandwidth content will effectively be subsidizing users who are storing extremely high-bandwidth content. (Unless the retrieval market aspect of Filecoin somehow solves this issue?) I'd love to hear some perspective from the Filecoin team, as I'd imagine that any solutions being considered here will likely also be relevant to the Filecoin project.
That could definitely be useful for performance but is probably overkill for simple accounting (and would work against deduplication).
Given the trustless nature of Filecoin, the retrieval market charges the user retrieving the file for bandwidth. If the retrieval miner charged the user storing the content, they (the retrieval miner) could arbitrarily inflate bandwidth usage and overcharge. However, you bring up a good point: users storing content will likely want to pay some (trusted) service for bandwidth up front. This service would effectively act as a pre-paid proxy/cache and would need some way to account for usage.

What if we did this through a plugin? go-ipfs supports plugins, so we could add a plugin interface for hooking into bitswap. The plugin could be invoked for every block sent (with the CID of the block and the peer ID of the peer requesting the block), which would give you the information needed to charge users for bandwidth.
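A sketch of what such a bitswap hook might look like. go-ipfs does not currently expose a bitswap plugin interface like this, so the interface name, signature, and the toy `byteCounter` implementation are all hypothetical:

```go
package main

import "fmt"

// BlockSentHook is a hypothetical plugin interface: bitswap would invoke
// it once per block sent, passing the CID, the requesting peer, and the
// block size, then forget about it (no state kept in bitswap itself).
type BlockSentHook interface {
	BlockSent(cid string, peer string, size uint64)
}

// byteCounter is a minimal hook implementation that tallies bytes served
// per requesting peer -- the kind of accounting a billing plugin needs.
type byteCounter struct {
	perPeer map[string]uint64
}

func (c *byteCounter) BlockSent(cid, peer string, size uint64) {
	c.perPeer[peer] += size
}

func main() {
	var hook BlockSentHook = &byteCounter{perPeer: map[string]uint64{}}
	// What bitswap would do on each send:
	hook.BlockSent("QmBlockA", "QmPeer1", 262144)
	hook.BlockSent("QmBlockB", "QmPeer1", 131072)
	fmt.Println(hook.(*byteCounter).perPeer["QmPeer1"]) // 393216
}
```

The design point from the discussion below is that bitswap stays stateless: persistence, aggregation, and querying all live inside the plugin.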
@Stebalien I like the concept. In theory it sounds like this could work. To clarify, would this be something that gets logged to the bitswap ledger that we can query later, or is it something I would have to actively watch in real time? Also, could this querying be filtered by the usual block types ("direct", "recursive", "indirect", "all")?

My ideal scenario would be having the ability to easily query bitswap and get back an array of multihashes and an integer representing the number of times those blocks were sent. It would be cool if I could also easily get back the block size for quick multiplication, but that's something I could manage separately if needed. Or there could be a verbosity flag. Example Response:
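(The original example response did not survive extraction. A hypothetical shape matching the description above -- multihash, send count, and block size under a verbosity flag -- might look like:)

```json
[
  { "hash": "QmHashA", "times_sent": 42, "block_size": 262144 },
  { "hash": "QmHashB", "times_sent": 7, "block_size": 131072 }
]
```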
Regarding the peer ID, I worry that tracking this value in combination with the number of blocks sent might be overkill. It might also be pretty nasty when it comes time to query things: instead of having a list of peer ID amounts plus a separate list of block amounts (which would give P + B records), we would instead have a record for each peer ID + block hash combination (which would be P * B records). However, this type of data would be pretty powerful, so if I'm missing something and this would actually be trivial to implement, let me know.
My idea was simply to provide a hook that gets called every time a block is sent to a peer. Bitswap wouldn't remember anything, it would just invoke any hooks registered by plugins and move on.
Those are pinning concepts and don't really exist at the block/bitswap layer. At this layer, we just have a set of blocks and don't have any way to (efficiently) relate these blocks to pins.
That makes sense. So the plugin itself would log the information and provide an interface to query it?
Yes (or anything else it wants, really).
That sounds like it would work well then! Let me know if you need me to answer any further questions as the roadmap discussion progresses. Thanks for the great discussion / ideas.
Apologies for effectively reviving this issue so soon after it was resolved, but this blog entry came out today: https://www.ctrl.blog/entry/ipfs-pin-storage-accounting. It brings up some really good points that are quite hard to address without serious technical overhead. The post speaks about the difficulties of keeping track of deduplication in IPFS. For reference, here is what I said earlier in this thread:
Such a sub-store could also be utilized for monitoring how much overall data different users are storing on a node. It would work against deduplication between users, but that would be a tradeoff a node provider could choose to make.

@Stebalien Do you have any thoughts on the feasibility of something like this? You mentioned sub-stores might actually have some performance benefits, and I'm beginning to see quite a few issues that could be tackled with such a solution.
I'm talking about a very different use case: effectively, tiered caching. For everything else, I'd expect labeling to be more useful -- that is, providing some way to annotate blocks with additional information. We've already found needs for this in reference counting, providing, access control, etc. However, that's something that'll take quite a bit of planning. I've suggested the plugin interface because it allows people to experiment with their own solutions without waiting for go-ipfs to merge a general-purpose solution.
Labeling is something I considered a bit as well, but figured it sounded a bit excessive so I didn't bring it up. Glad to hear that it has other use cases and is something being considered. This will likely solve a lot of core issues that I see coming up in the future.
No worries. I was expecting any general solutions to take some time; I just wanted to get the thought process started. Thanks again for all the information on this.
Side suggestion: I wonder if these metrics may be more appropriate to collect at the ipfs-cluster level (which I presume Pinata is running)? I'd imagine that most users requiring these statistics would be large-scale cluster users, and the numbers of interest would be for the entire cluster, not per daemon. This seems like a good opportunity to isolate the metrics code away from the core daemon, and possibly implement the counters in a way that's agnostic of the actual bitswap/provide implementation. /cc @hsanjuan @lanzafame
@parkan I was actually interested in tracking all of this on a per-node level. Unfortunately, cluster wasn't providing the granularity we wanted for pin management. To elaborate, cluster wouldn't allow us to choose which nodes to pin content on at a per-content level. This presents a problem when some of our users may want to pin content on 2 nodes, but others may want 3 nodes of replication. It also prevents us from pinning content specifically where it's being requested the most.
Hi @obo20, cluster has almost always allowed setting the replication level separately per pinned item. Also, we're now adding the possibility of explicitly choosing on which peers things will be pinned (after a user requested it in ipfs-cluster/ipfs-cluster#646). We'd be super happy to know what you need most from IPFS Cluster so we can add it; just open an issue or send us a message at ipfs-cluster-wg@ipfs.io.
Well, I feel pretty silly right now. Thank you for that information, @hsanjuan. On a deeper dive into the Go API it does appear that ipfs-cluster supports a replication level per pinned item, which makes sense, as I suppose ipfs-cluster would need to keep track of that info anyway. Are the API options documented anywhere outside of GitHub for the ipfs-cluster Go package? That may have been why I initially had the misconception I did.

Also, are there any plans for an official JS HTTP client? I see this project: https://github.com/te0d/js-ipfs-cluster-api, but unfortunately it appears to have been abandoned. If not, I can certainly fork it and make updates as needed, but officially supported packages are always preferred, since they're guaranteed to be up to date with the ipfs-cluster project itself.
Well, perhaps the best reference is https://godoc.org/github.com/ipfs/ipfs-cluster/api/rest/client#Client (good point, we need to finish https://cluster.ipfs.io/documentation/developer/api/).
We don't have plans nor head count for it. Contributions would be very welcome on this front (someone asked for something similar here: https://discuss.ipfs.io/t/ipfs-cluster-js-implementation/4706/1). @obo20 let's not hijack this issue, though; let's have a thread on Discourse or an issue in the repo to answer all your questions.
thanks for jumping in @hsanjuan! please link any further discussion from here -- it sounds like cluster is in fact the right place to do this
Hi, what happened to this thread? I was also looking for granular logs and metrics.
Currently, the best option for this looks like emitting the access logs from nginx (which fronts the IPFS gateway) to Elasticsearch/Kibana.

PS: I'm not even sure Elasticsearch/Kibana is the right choice for us yet; I'm using them here as a placeholder to explain what I'm trying to do.
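For concreteness, a minimal nginx fragment along these lines might log the bytes sent per gateway request, so per-path (and hence per-CID) bandwidth can be aggregated downstream. This is an illustrative sketch, not a config from this thread; the log path and upstream port (8080, go-ipfs's default gateway port) are assumptions:

```nginx
# Log each gateway request with the bytes actually sent, for downstream
# aggregation in Elasticsearch/Kibana (or any log pipeline).
log_format ipfs_bw '$remote_addr $request_uri $status $body_bytes_sent $request_time';

server {
    listen 80;
    location /ipfs/ {
        access_log /var/log/nginx/ipfs_gateway.log ipfs_bw;
        proxy_pass http://127.0.0.1:8080;  # local IPFS gateway, assumed default port
    }
}
```

Summing `$body_bytes_sent` grouped by the `/ipfs/<cid>/...` prefix of `$request_uri` then gives a rough per-content bandwidth figure, limited of course to traffic that goes through the gateway rather than bitswap.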
@gmasgras - can you describe the configuration/integration we use for metrics/monitoring via Grafana/Kibana on the IPFS clusters and gateway?
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
As my company continues to expand the functionality of our pinning service, I wanted to start a discussion around one of the things that we've found relevant to our roadmap: Metrics and Logging.
Current cloud providers like DigitalOcean / AWS have the ability to charge you based on two things: how much data you store, and how much bandwidth that data consumes when it's delivered.
With IPFS, you can only really keep track of how much data is being stored, as it's quite difficult to see what kind of usage / traffic is going through your node on a per-content basis. I've found a few hacky solutions involving directly monitoring the DHT / logs, but nothing that seems like it would work well in production.
This presents a problem when users want to store content that may have high bandwidth usage (videos, photos, websites). This forces us to bundle storage / estimated bandwidth costs together, which isn't an optimal solution for a multitude of reasons.
The simplest first-pass solution I've come up with is giving IPFS the ability to keep a running tally of how many times hashes on your node have been requested / delivered.
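That running tally could be as simple as a map from hash to counter. A minimal sketch in Go (illustrative names, not go-ipfs code; a real implementation would persist counts to disk rather than hold them only in memory, as discussed later in this thread):

```go
package main

import (
	"fmt"
	"sync"
)

// requestTally counts how many times each hash has been requested or
// delivered. Hypothetical type for illustration only.
type requestTally struct {
	mu     sync.Mutex
	counts map[string]uint64
}

// Record increments the counter for a hash; safe for concurrent use.
func (t *requestTally) Record(hash string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[hash]++
}

// Count returns the number of recorded requests for a hash.
func (t *requestTally) Count(hash string) uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.counts[hash]
}

func main() {
	tally := &requestTally{counts: map[string]uint64{}}
	tally.Record("QmVideo")
	tally.Record("QmVideo")
	tally.Record("QmPhoto")
	fmt.Println(tally.Count("QmVideo"), tally.Count("QmPhoto")) // 2 1
}
```

Multiplying each count by the block size would then give a rough per-hash bandwidth figure, which is the estimation approach mentioned in the discussion above.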
There are also some nice quality-of-life features that could be layered on top of this solution.
I'd love to hear thoughts from the community on this topic, and whether anybody has alternative solutions to the particular problem I mentioned.