Add possibility to download IPFS images #2408

Open
wants to merge 8 commits into
base: master

afbjorklund
Member

@afbjorklund afbjorklund commented Jun 9, 2024

@afbjorklund
Member Author

afbjorklund commented Jun 9, 2024

The lima.yaml would then look something like:

- location: "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"

(the file name is only used for decompression)

Note that the CID digest is not the file digest:

https://cid.ipfs.tech/#QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK
6278B63498EB92816C50A53202EE3CBEE6FC0F92F97B97CB0AB0A4AE65CCBE38

https://docs.ipfs.tech/concepts/content-addressing/#cids-are-not-file-hashes
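To see why the CID digest differs from the file digest, it can help to unpack the CID by hand. The following is a minimal sketch, assuming a CIDv0 is just the base58btc encoding of a sha2-256 multihash (bytes 0x12 0x20 followed by the 32-byte digest). The helper names are made up for this illustration and are not part of Lima or Kubo.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text: str) -> bytes:
    """Decode a base58btc string into raw bytes."""
    number = 0
    for char in text:
        number = number * 58 + B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading "1" stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def cidv0_to_digest(cid: str) -> str:
    """Return the digest embedded in a CIDv0, in "sha256:<hex>" form."""
    multihash = base58_decode(cid)
    # 0x12 = sha2-256, 0x20 = 32 bytes of digest (assumed layout of a CIDv0).
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0 (expected a sha2-256 multihash)")
    return "sha256:" + multihash[2:].hex()


print(cidv0_to_digest("QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK"))
```

The printed value can be compared with the hex string from cid.ipfs.tech. It is the hash of the root node of the IPFS DAG, not of the file bytes, which is why it does not match the sha256sum of the image.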

(small detail: zQmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK in multihash format is the same as
"sha256:6278b63498eb92816c50a53202ee3cbee6fc0f92f97b97cb0ab0a4ae65ccbe38" in text format)

@afbjorklund
Member Author

afbjorklund commented Jun 9, 2024

Note: the ipfs tool will output v0 by default, unless using --cid-version 1

v0: QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK
v1: bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui

$ ipfs add --cid-version 0 ubuntu-24.04-server-cloudimg-amd64.img 
added QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK ubuntu-24.04-server-cloudimg-amd64.img
 453.00 MiB / 453.00 MiB [=================================================================================================================] 100.00%
$ ipfs add --cid-version 1 ubuntu-24.04-server-cloudimg-amd64.img 
added bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui ubuntu-24.04-server-cloudimg-amd64.img
 453.00 MiB / 453.00 MiB [=================================================================================================================] 100.00%

The CID version doesn't really matter to Lima, but CIDv0 is deprecated.

https://docs.ipfs.tech/concepts/content-addressing/#cid-versions

@afbjorklund
Member Author

afbjorklund commented Jun 10, 2024

The downloader now uses ipfs cat, so that Lima can show its own progress, and ipfs ls to calculate the total size.

Hash                                                        Size     Name
bafybeibnerap2c5tnmqvyftmyxftftjewvo52acwv2gg6thpipyw5zx7fe 45613056 
bafybeidrbml5blbqj2i67x5gujqbwluis6jy5daiso6ir5pakx4hseztum 45613056 
bafybeie56ruaziz43jj3e5iih5r3fyzudeurqbpts6vwigf7nbreoep5de 45613056 
bafybeifva7p55d4szahrrxqjdjqqqlcrcfxm7n2vnpmwucrdkvcexxngmu 45613056 
bafybeie5wduudyjkbcyjvcif7i6gliipapvl5cw4wia3fqoanxunud7dyi 45613056 
bafybeidezz5b2zng3etcwbn4vnu3tn4ao3wrh5qdwzoyo7ywdp7rbymgzq 45613056 
bafybeiaf7vcteys366fmjiclktnzwuqpygpn4u2ftqrl62juaq5sgjw4be 45613056 
bafybeiasvhz6mljylbqwygb7jvp6d6sohaw325yuan5awlo7shjrlslb7m 45613056 
bafybeieiuxw2i5nkelq5tihd2tgmtxjst2ln6rslt35jxslcz4w4mjsnha 45613056 
bafybeiest5wjnmjwyp7vgawfbityfnkgqpr27bfleysbhdrhldcwxw4yq4 45613056 
bafybeif7btzsv7657kfxu7qte6sh6zgpvpsrj7dnou7usj2nwv4zf77ft4 18874368 
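As a rough illustration of the size calculation, the chunk sizes in the listing above can simply be added up. This is only a sketch in Python, not the code from the PR; the function name is hypothetical and the parsing assumes the three-column layout shown above.

```python
LS_OUTPUT = """\
Hash                                                        Size     Name
bafybeibnerap2c5tnmqvyftmyxftftjewvo52acwv2gg6thpipyw5zx7fe 45613056
bafybeidrbml5blbqj2i67x5gujqbwluis6jy5daiso6ir5pakx4hseztum 45613056
bafybeie56ruaziz43jj3e5iih5r3fyzudeurqbpts6vwigf7nbreoep5de 45613056
bafybeifva7p55d4szahrrxqjdjqqqlcrcfxm7n2vnpmwucrdkvcexxngmu 45613056
bafybeie5wduudyjkbcyjvcif7i6gliipapvl5cw4wia3fqoanxunud7dyi 45613056
bafybeidezz5b2zng3etcwbn4vnu3tn4ao3wrh5qdwzoyo7ywdp7rbymgzq 45613056
bafybeiaf7vcteys366fmjiclktnzwuqpygpn4u2ftqrl62juaq5sgjw4be 45613056
bafybeiasvhz6mljylbqwygb7jvp6d6sohaw325yuan5awlo7shjrlslb7m 45613056
bafybeieiuxw2i5nkelq5tihd2tgmtxjst2ln6rslt35jxslcz4w4mjsnha 45613056
bafybeiest5wjnmjwyp7vgawfbityfnkgqpr27bfleysbhdrhldcwxw4yq4 45613056
bafybeif7btzsv7657kfxu7qte6sh6zgpvpsrj7dnou7usj2nwv4zf77ft4 18874368
"""


def total_size(ls_output: str) -> int:
    """Sum the Size column of an `ipfs ls` listing (header line skipped)."""
    total = 0
    for line in ls_output.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 2:
            total += int(fields[1])
    return total


size = total_size(LS_OUTPUT)
print(f"{size} bytes = {size / 2**20:.2f} MiB")
# prints "475004928 bytes = 453.00 MiB"
```

The total matches the 453.00 MiB reported by ipfs add, so it can serve as the upper bound for a progress bar.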

So now a download from an IPFS address looks the same as one from an HTTP address, including the "description":

Downloading the image (ubuntu-24.04-server-cloudimg-amd64.img)
453.00 MiB / 453.00 MiB [---------------------------------] 100.00% 336.20 MiB/s

This replaces the output that you get from ipfs get, which could also change:

Saving file(s) to /home/anders/.cache/lima/download/by-url-sha256/1855c5dccbd6db83ea6c81c276e0440ad9f156584e2ced824290186f1dae563b/data
 453.00 MiB / 453.00 MiB [==================================================================================] 100.00% 0s

@afbjorklund
Member Author

At first I thought that calculating the digest was unnecessary, since IPFS already includes one in its storage.

But we still want to compare the download with the digest we are expecting, to make sure it's the same image...
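A check like that could look roughly as follows. This is a generic sketch in Python rather than the Go code in this PR; the function name and the "algorithm:hex" parsing are assumptions modeled on the digest field in lima.yaml.

```python
import hashlib
import os
import tempfile


def verify_digest(path: str, expected: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at path matches an "algorithm:hex" digest."""
    algorithm, _, want = expected.partition(":")
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large disk images are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == want.lower()


# Self-check on a small temporary file standing in for a downloaded image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
good_digest = "sha256:" + hashlib.sha256(b"hello").hexdigest()
ok = verify_digest(tmp.name, good_digest)
bad = verify_digest(tmp.name, "sha256:" + "0" * 64)
os.remove(tmp.name)
print(ok, bad)
```

The point is that the comparison is done against the digest from the template, independently of which transport delivered the bytes.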

@AkihiroSuda
Member

Looks good, but needs documentation

@afbjorklund
Member Author

There is a design flaw with this approach. Currently it would look like:

- location: "https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"
- location: "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"

That means the checksum is duplicated, between the transports. Maybe:

- location: "https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"
  cid: QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK

Note:

The CID can be calculated, without adding the image to the disk store:

$ sha256sum ubuntu-24.04-server-cloudimg-amd64.img 
32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3  ubuntu-24.04-server-cloudimg-amd64.img
$ ipfs add --only-hash --quieter ubuntu-24.04-server-cloudimg-amd64.img 
QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK

@afbjorklund
Member Author

afbjorklund commented Jul 29, 2024

Unfortunately, Content-Type and Last-Modified are not provided by IPFS...

Current gateways only use heuristics, such as file magic and relative freshness.

Note: new implementations are supposed to use CID version 1 (not version 0):

$ ipfs add --cid-version=1 --only-hash --quieter ubuntu-24.04-server-cloudimg-amd64.img 
bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui

@afbjorklund afbjorklund marked this pull request as ready for review September 28, 2024 08:37
@afbjorklund
Member Author

afbjorklund commented Sep 28, 2024

Looks good, but needs documentation

Currently it assumes that IPFS Kubo is set up.

i.e. that ipfs add and ipfs get are working

https://docs.ipfs.tech/how-to/kubo-basic-cli/


This might also want to mention some experimental features, like using private networks:

https://github.com/ipfs/kubo/blob/v0.30.0/docs/experimental-features.md#private-networks

For testing purposes, you can use ipfs daemon --offline to avoid connecting to the swarm.

See also docs at: https://github.com/containerd/stargz-snapshotter/blob/main/docs/ipfs.md

@afbjorklund
Member Author

Could also add support for IPFS_GATEWAY, as a fallback?

https://blog.ipfs.tech/ipfs-uri-support-in-curl/

If set, it would rewrite any ipfs: URL into an http/https URL on that gateway instead...

export IPFS_GATEWAY="http://127.0.0.1:8080"
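The rewrite could be as simple as the sketch below. It assumes the path-style gateway layout (GATEWAY/ipfs/CID/path) that the curl blog post describes; the function name is hypothetical and this is not the code from the PR.

```python
import os
from typing import Optional
from urllib.parse import urlsplit


def rewrite_ipfs_url(url: str, gateway: Optional[str]) -> str:
    """Map ipfs://<CID>/<path> to <gateway>/ipfs/<CID>/<path>.

    Any other URL, or any URL when no gateway is configured, is returned as-is.
    """
    parts = urlsplit(url)
    if parts.scheme != "ipfs" or not gateway:
        return url
    # netloc keeps its case, which matters because CIDv0 is case-sensitive.
    return f"{gateway.rstrip('/')}/ipfs/{parts.netloc}{parts.path}"


# Fall back to the example gateway from the thread if the variable is unset.
gateway = os.environ.get("IPFS_GATEWAY", "http://127.0.0.1:8080")
print(rewrite_ipfs_url(
    "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img",
    gateway,
))
```

With this approach the regular HTTP download path can be reused, so an installed ipfs binary would only be needed when no gateway is set.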

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
@@ -299,6 +299,10 @@ rosetta:
# 🟢 Builtin default: use name from /etc/timezone or deduce from symlink target of /etc/localtime
timezone: null

# Allow using IPFS for downloading files with a provided CID.
# 🟢 Builtin default: false
ipfs: null
Member

Can we just use location: cid://<CID> ?

Member

Or location: ipfs://<CID> ?

Member Author

@afbjorklund afbjorklund Oct 9, 2024

You can use ipfs: instead of https:, but then it becomes a different URL...

(which is bad because it would require multiple digests, and cache entries)

#2408 (comment)

Also, it would make it mandatory to use ipfs (or an ipfs gateway) to download?

By having it as an attribute on the File, it is possible to have it optional (and share digest)

Member

That means the checksum is duplicated, between the transports.

I think this is better than introducing more YAML fields?

Also, it would make it mandatory to use ipfs (or an ipfs gateway) to download?

The downloader should just fall back to the next candidate in the []Images
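For illustration, such a fallback could be sketched like this. It is written in Python with an injected fetch function, so it says nothing about how Lima's Go downloader is actually structured; all names here are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def download_first_available(
    locations: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Try each candidate location in order and return the first that works."""
    errors: List[str] = []
    for location in locations:
        try:
            return location, fetch(location)
        except Exception as exc:  # a failed candidate is not fatal
            errors.append(f"{location}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(errors))


def fake_fetch(location: str) -> bytes:
    """Stand-in transport: pretends that no IPFS daemon is reachable."""
    if location.startswith("ipfs://"):
        raise ConnectionError("ipfs daemon not running")
    return b"image bytes"


used, data = download_first_available(
    [
        "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img",
        "https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img",
    ],
    fake_fetch,
)
print(used)
```

In this variant each candidate carries its own digest, which is the duplication mentioned earlier in the thread.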

Member Author

It is possible to do this with the current implementation, i.e. it supports both ways

You can change the location (url), or you can provide a cid for an existing location

Member Author

Since the fields are optional, they should not interfere much with existing templates?

Probably want to have some more automation / sharing in place, before adding them.

@afbjorklund
Member Author

afbjorklund commented Oct 9, 2024

Regarding addressing of ipfs objects, there are some more details here:
https://github.com/ipfs/in-web-browsers/blob/master/ADDRESSING.md

"The four stages of the upgrade path for path addressing."

  1. Current: HTTP-to-IPFS gateway
    e.g. https://ipfs.io/ipfs/$hash
  2. Short term: URL
    e.g. ipfs://$hash
  3. Mid term: URI
    e.g. dweb:/ipfs/$hash
  4. Long term: NURI
    e.g. /ipfs/$hash

So it is simpler to only provide the CID/hash, as additional information?
https://docs.ipfs.tech/concepts/content-addressing/#what-is-a-cid

Since it is a multiformat/multihash, it doesn't need an additional prefix*.

* it actually already has a couple of them, but in a string encoded form
(one can use ipfs cid format, or https://cid.ipfs.io/, to decipher them)

cid: bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue
# base32-cidv1-dag-pb-(sha2-256:32:8878C171ECB9C6D3F96339B8C10BC40C57BA9F8139140CC4783D4E21C49836A1)
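The same breakdown can be reproduced without the ipfs tool. The following is a small sketch that only handles base32 CIDv1 strings (multibase prefix "b") and assumes the layout version, codec, hash code, hash length, digest, each stored as an unsigned varint; the function names are made up for this example.

```python
import base64
from typing import Tuple


def read_varint(data: bytes, offset: int) -> Tuple[int, int]:
    """Read one unsigned LEB128 varint; return (value, next offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, offset
        shift += 7


def decode_cidv1(cid: str) -> dict:
    """Split a base32 CIDv1 string into its self-describing parts."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase prefix 'b') is handled here")
    text = cid[1:].upper()
    text += "=" * (-len(text) % 8)  # b32decode wants padded input
    raw = base64.b32decode(text)
    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    hash_len, pos = read_varint(raw, pos)
    return {
        "version": version,      # 1 for CIDv1
        "codec": codec,          # 0x70 is dag-pb
        "hash_code": hash_code,  # 0x12 is sha2-256
        "hash_len": hash_len,
        "digest": raw[pos:pos + hash_len].hex().upper(),
    }


print(decode_cidv1("bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue"))
```

The digest field should correspond to the hex string in the comment line above, which shows that the prefixes are already carried inside the encoded CID.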

@afbjorklund
Member Author

afbjorklund commented Oct 9, 2024

More user-facing documentation (for the website) can go in a second PR. Maybe just use the links above?

https://docs.ipfs.tech/how-to/kubo-basic-cli/

Also needs documentation on how to update images, then again we don't have any docs for sha256 either...

- location: https://github.com/containerd/nerdctl/releases/download/v1.7.6/nerdctl-full-1.7.6-linux-amd64.tar.gz
  arch: x86_64
  digest: sha256:2c841e097fcfb5a1760bd354b3778cb695b44cd01f9f271c17507dc4a0b25606
  cid: bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue

Like, when updated to 1.7.7 - how do you update the other fields?

$ wget https://github.com/containerd/nerdctl/releases/download/v1.7.7/nerdctl-full-1.7.7-linux-amd64.tar.gz
...
HTTP request sent, awaiting response... 200 OK
Length: 259844835 (248M) [application/octet-stream]
Saving to: ‘nerdctl-full-1.7.7-linux-amd64.tar.gz’
$ sha256sum nerdctl-full-1.7.7-linux-amd64.tar.gz
a731eac93e8e9dda1a0d76dc1606438deb0668ea7d6bd5c5af436353ed9f65c5  nerdctl-full-1.7.7-linux-amd64.tar.gz
$ ipfs add --only-hash --cid-version=1 --progress=false nerdctl-full-1.7.7-linux-amd64.tar.gz 
added bafybeiexmdvas4d3dy3npvecj3udihifaqndhelpiyjb67zbsm3g5eqlba nerdctl-full-1.7.7-linux-amd64.tar.gz
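One way to automate that update would be a small helper script. The sketch below is only a suggestion: it shells out to ipfs add with the flags used in this thread (--only-hash, --cid-version=1, --quieter), and the function names and the rendered layout, including the proposed cid field, are assumptions rather than an existing Lima tool.

```python
import hashlib
import subprocess


def sha256_digest(path: str) -> str:
    """Return the file digest in the "sha256:<hex>" form used by lima.yaml."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def ipfs_cid(path: str) -> str:
    """Ask Kubo for the CIDv1 without adding the file to the local store."""
    result = subprocess.run(
        ["ipfs", "add", "--only-hash", "--cid-version=1", "--quieter", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def render_entry(location: str, arch: str, digest: str, cid: str) -> str:
    """Format one list entry, with the cid field proposed in this PR."""
    return (
        f"- location: {location}\n"
        f"  arch: {arch}\n"
        f"  digest: {digest}\n"
        f"  cid: {cid}\n"
    )


# Values taken from the 1.7.7 commands above; on a real file they would come
# from sha256_digest(path) and ipfs_cid(path).
print(render_entry(
    "https://github.com/containerd/nerdctl/releases/download/v1.7.7/nerdctl-full-1.7.7-linux-amd64.tar.gz",
    "x86_64",
    "sha256:a731eac93e8e9dda1a0d76dc1606438deb0668ea7d6bd5c5af436353ed9f65c5",
    "bafybeiexmdvas4d3dy3npvecj3udihifaqndhelpiyjb67zbsm3g5eqlba",
))
```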

Development

Successfully merging this pull request may close these issues.

IPFS for downloading images, using decentralized peer-to-peer
2 participants