
refactor: switch gateway code to new API from go-libipfs #9681

Merged: 2 commits into master from feat/golibipfs-gateway-refactor on Mar 30, 2023

Conversation

@aschmahmann (Contributor) commented Mar 1, 2023

Relates to ipfs/boxo#176

  • Adds Range request sharness test for ?format=raw (discussion)
  • Adds HAMT directory listing regression test
  • Adds HAMT Range regression test

@aschmahmann force-pushed the feat/golibipfs-gateway-refactor branch from 70ba1cc to 4685809 on March 5, 2023 05:51
@aschmahmann force-pushed the feat/golibipfs-gateway-refactor branch 2 times, most recently from 3f0bff3 to 48178fa on March 7, 2023 11:15
@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 4 times, most recently from 943c471 to 80d8ea2 on March 9, 2023 10:12
@hacdias self-requested a review March 9, 2023 10:12
@lidel (Member) commented Mar 9, 2023

@hacdias FYI, I've found a panic when loading http://127.0.0.1:8080/ipns/invalid.example.net

Details (panic stack trace):
2023-03-09T22:20:07.487+0100	ERROR	core/server	gateway/handler.go:282	A panic occurred in the gateway handler!
2023-03-09T22:20:07.487+0100	ERROR	core/server	gateway/handler.go:283	runtime error: invalid memory address or nil pointer dereference
goroutine 279 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x65
runtime/debug.PrintStack()
	runtime/debug/stack.go:16 +0x19
github.com/ipfs/go-libipfs/gateway.(*handler).ServeHTTP.func1()
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/handler.go:284 +0xea
panic({0x2359bc0, 0x390ada0})
	runtime/panic.go:884 +0x213
github.com/ipfs/go-libipfs/gateway.ImmutablePath.String(...)
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/gateway.go:33
github.com/ipfs/go-libipfs/gateway.(*BlocksGateway).getPathRoots(0xc00072e1a0?, {0x2b46aa0, 0xc000174120}, {{0x0?, 0x0?}})
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/blocks_gateway.go:292 +0x64
github.com/ipfs/go-libipfs/gateway.(*BlocksGateway).getNode(0xc000a4e120, {0x2b46aa0, 0xc000174120}, {{0x0?, 0x0?}})
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/blocks_gateway.go:251 +0x6b
github.com/ipfs/go-libipfs/gateway.(*BlocksGateway).Get(0xc000a4e120, {0x2b46aa0, 0xc000174120}, {{0x0?, 0x0?}})
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/blocks_gateway.go:132 +0x74
github.com/ipfs/kubo/core/corehttp.(*offlineGatewayErrWrapper).Get(0x265b79d?, {0x2b46aa0?, 0xc000174120?}, {{0x0?, 0x0?}})
	github.com/ipfs/kubo/core/corehttp/gateway.go:175 +0x6b
github.com/ipfs/go-libipfs/gateway.(*handler).serveDefaults(0xc00018d0a0, {0x2b46aa0, 0xc000920f30}, {0x2b465c0, 0xc000af3f20}, 0xc00072ec68, {{0x0?, 0x0?}}, {{0x0, 0x0}}, ...)
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/handler_defaults.go:41 +0x3a6
github.com/ipfs/go-libipfs/gateway.(*handler).getOrHeadHandler(0xc00018d0a0, {0x2b465c0, 0xc000af3f20}, 0xc00072ee70)
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/handler.go:407 +0xc54
github.com/ipfs/go-libipfs/gateway.(*handler).ServeHTTP(0xc00018d0a0, {0x2b465c0, 0xc000af3f20}, 0xc00046e900)
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/handler.go:290 +0x211
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP(0xc0009de1e0, {0x7fca582551b8?, 0xc002b70fa0}, 0xc002b6a700)
	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.32.0/handler.go:191 +0xbd1
net/http.HandlerFunc.ServeHTTP(0x20cba8d?, {0x7fca582551b8?, 0xc002b70fa0?}, 0xc002b70f00?)
	net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0x2b469f8?, {0x7fca582551b8, 0xc002b70fa0}, 0xc002b6a700)
	net/http/server.go:2500 +0x149
github.com/ipfs/go-libipfs/gateway.WithHostname.func1({0x7fca582551b8, 0xc002b70fa0}, 0xc002b6a700)
	github.com/ipfs/go-libipfs@v0.6.2-0.20230309103450-2411c38dc5dc/gateway/hostname.go:239 +0x286
net/http.HandlerFunc.ServeHTTP(...)
	net/http/server.go:2122
net/http.HandlerFunc.ServeHTTP(0x7fca582551b8?, {0x7fca582551b8?, 0xc002b70fa0?}, 0xc00072f720?)
	net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0x7fca582551b8?, {0x7fca582551b8, 0xc002b70fa0}, 0xc002b6a700)
	net/http/server.go:2500 +0x149
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1({0x7fca582551b8?, 0xc002b70f50?}, 0xc002b6a700)
	github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/instrument_server.go:288 +0xc5
net/http.HandlerFunc.ServeHTTP(0x19?, {0x7fca582551b8?, 0xc002b70f50?}, 0xc002b70f50?)
	net/http/server.go:2122 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerRequestSize.func2({0x7fca582551b8?, 0xc002b70f50?}, 0xc002b6a700)
	github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/instrument_server.go:249 +0x77
net/http.HandlerFunc.ServeHTTP(0xc00072f908?, {0x7fca582551b8?, 0xc002b70f50?}, 0x7fca81a59f18?)
	net/http/server.go:2122 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2({0x7fca582551b8, 0xc002b70f50}, 0xc002b6a700)
	github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/instrument_server.go:108 +0xbf
net/http.HandlerFunc.ServeHTTP(0x2b44c10?, {0x7fca582551b8?, 0xc002b70f50?}, 0x3?)
	net/http/server.go:2122 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1({0x2b44c10?, 0xc002b12a80?}, 0xc002b6a700)
	github.com/prometheus/client_golang@v1.14.0/prometheus/promhttp/instrument_server.go:146 +0xb8
net/http.HandlerFunc.ServeHTTP(0x6b5e00?, {0x2b44c10?, 0xc002b12a80?}, 0x7fca586b5dd8?)
	net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xec9aa0?, {0x2b44c10, 0xc002b12a80}, 0xc002b6a700)
	net/http/server.go:2500 +0x149
github.com/ipfs/kubo/core/corehttp.makeHandler.func1({0x2b44c10?, 0xc002b12a80?}, 0x23e1ec0?)
	github.com/ipfs/kubo/core/corehttp/corehttp.go:54 +0x6b
net/http.HandlerFunc.ServeHTTP(0x0?, {0x2b44c10?, 0xc002b12a80?}, 0xe425ce?)
	net/http/server.go:2122 +0x2f
net/http.serverHandler.ServeHTTP({0xc0003455c0?}, {0x2b44c10, 0xc002b12a80}, 0xc002b6a700)
	net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000115050, {0x2b46aa0, 0xc000a45230})
	net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
	net/http/server.go:3089 +0x5ed

@lidel (Member) commented Mar 9, 2023

There is also something wrong with generating the HTML listing for big (HAMT-sharded) directories.
The tests we have use a small directory, which does not have this problem (so the tests did not catch it).

To reproduce:

We need to fix this and add a regression test that uses a HAMT directory (create a CAR fixture with a standalone, minimal set of blocks required for enumerating the directory).

@hacdias (Member) commented Mar 10, 2023

@lidel I can't reproduce your HAMT issue. I fixed the panic.


Commit: 5ad5158e8
Client Version: kubo/0.20.0-dev/5ad5158e8
Protocol Version: ipfs/0.1.0

@lidel (Member) commented Mar 10, 2023

@hacdias thanks!

Did you try on an empty repo?

I've fetched your changes and http://127.0.0.1:8080/ipfs/bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm/ and the problem is still there. It takes forever to load the dir listing on an empty repo; the node is fetching some stuff in the background, and after fetching ~4k blocks I see the same blank screen as before.

How to reproduce

To reproduce:

$ export IPFS_PATH=/home/lidel/tmp/test-refactor-$(date +"%s") && ipfs init --empty-repo && ipfs daemon 

and then:

$ # confirm local repo has only one block
$ ipfs refs local | wc -l 
1 

$ # measure how long the fetch took (with Accept, so the test is future-proof against refactors, if we ever change the default behavior)
$ time curl -v  -H "Accept: text/html" http://127.0.0.1:8080/ipfs/bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm/ 
.. 

$ # show how many blocks were fetched in the process
$ ipfs refs local | wc -l 
? # probably some ridiculous number (>10k)

I suspect the refactor removed all the optimizations we did in the past year (#8853, made even better in #9481) and once again Kubo is fetching root nodes for all 10k child items in the directory. It takes forever to load unless you already have them in your local blockstore, which is why you have to ipfs init --empty-repo before testing.

How to test this on CI

After fixing upstream, we need to add regression tests to Kubo repo to avoid the problem coming back over and over again.

Perhaps:

  1. Identify which blocks were fetched by 0.18 (with the optimization from feat: fast directory listings with DAG Size column #9481)
    • You can get the list of blocks by doing ipfs init --empty-repo, loading the directory via curl, and then listing the CIDs in the local store with ipfs refs local
  2. Use these CIDs as a static fixture for testing the bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm HAMT
    • import them into the empty repo before testing with curl
    • run the tests with ipfs daemon --offline to ensure no additional blocks will be fetched from the network
  3. If the number of CIDs returned by ipfs refs local is bigger than the expected one, fail the test (a rough shell sketch of this flow follows below)
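
A rough, untested shell sketch of that flow, in the same style as the repro steps above. The fixture file hamt-dir-fixture.car and the EXPECTED_FIXTURE_SIZE threshold are illustrative placeholders, not fixtures that exist in this PR:

$ # start from a clean repo and import only the fixture blocks
$ export IPFS_PATH=$(mktemp -d) && ipfs init --empty-repo
$ ipfs dag import hamt-dir-fixture.car

$ # run offline so nothing extra can be fetched from the network
$ ipfs daemon --offline &
$ sleep 5

$ # the HAMT listing must render using only the blocks from the fixture
$ curl -sf -H "Accept: text/html" http://127.0.0.1:8080/ipfs/bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm/ > /dev/null

$ # fail if the blockstore grew beyond the fixture (with --offline it should not grow at all;
$ # EXPECTED_FIXTURE_SIZE is the number of blocks in the fixture CAR)
$ test "$(ipfs refs local | wc -l)" -le "$EXPECTED_FIXTURE_SIZE"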

@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 3 times, most recently from 6d76870 to 86ab484 on March 14, 2023 12:56
@hacdias self-assigned this Mar 14, 2023
@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 2 times, most recently from dce71a5 to d0f828b on March 15, 2023 11:42
lidel added a commit that referenced this pull request Mar 17, 2023
@lidel (Member) left a comment

  • 🟢 HAMT listing looks OK, and we have a good regression test for it
    • @hacdias fetch my changes, I've isolated the HAMT tests from the rest and added docs explaining why we do what we do.
    • I did not rebase on master; I would do it at the end, after adding the tests mentioned below
  • 🔴 We are missing the same level of confidence when it comes to Range requests.
    • @hacdias do you mind adding a similar test as for HAMT? (with init --empty-repo and refs counting; a shell sketch follows this list)
      • with a static fixture for two sets (overkill, but it also makes sure we do proper range requests) of 2 bytes from the middle of a big file?
        • E.g. curl http://127.0.0.1:8080/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wikipedia_en_all_maxi_2021-02.zim -i -H "Range: bytes=2000-2002, 40000000000-40000000002"
      • confirm only a minimal set of blocks got fetched
    • @aschmahmann FYSA, I think this is the only known unknown at the moment, so once we have a test for this, we can move forward with merges.
  • 🟠 ipfs-webui tests partially pass
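
For illustration, a rough shell sketch of the refs-counting variant of that Range test, following the same empty-repo pattern as the repro above. MAX_EXPECTED_BLOCKS is an illustrative placeholder, not a measured number:

$ export IPFS_PATH=$(mktemp -d) && ipfs init --empty-repo
$ ipfs daemon &
$ sleep 5

$ # empty repo should hold a single block before the request
$ ipfs refs local | wc -l
1

$ # fetch two tiny ranges from the middle of a big file
$ curl -sf -o /dev/null -H "Range: bytes=2000-2002, 40000000000-40000000002" \
    http://127.0.0.1:8080/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wikipedia_en_all_maxi_2021-02.zim

$ # only the blocks needed for path resolution and those two ranges should have been fetched
$ test "$(ipfs refs local | wc -l)" -lt "$MAX_EXPECTED_BLOCKS"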

@lidel changed the title from "feat: update go-libipfs and switch gateway code to use go-libipfs implementation" to "refactor: switch gateway code to new API from go-libipfs" on Mar 18, 2023
@hacdias (Member) commented Mar 20, 2023

Note regarding HAMT tests: I've added some already (7d2ae38). I will work on adding/updating them, likely tomorrow.

@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 3 times, most recently from 30e87d2 to cb2acb9 on March 21, 2023 10:46
@hacdias marked this pull request as ready for review March 21, 2023 10:58
@lidel (Member) left a comment

@hacdias while we wait for boxo, would you have time to move (refactor) this PR's Range/HAMT tests that do refs counting in:

  • test/sharness/t0115-gateway-dir-listing.sh
  • test/sharness/t0110-gateway.sh

to a single place like test/cli/gateway_range_test.go (you can eyeball prior art in gateway_test.go)?

Rationale:

  • We will make @guseggert happy by not adding more sharness 👍
  • Tests based on ipfs refs local are specific to Kubo – @laurentsenta will have less work porting to https://github.com/ipfs/gateway-conformance if we don't have them in sharness.
    • 💭 @laurentsenta this is an early idea for conformance tests (happy to discuss on a future call, just a quick braindump here)
      • we won't have access to ipfs refs local, but we can create a CAR fixture that is like Swiss cheese 🧀 – something that has only the blocks for the requested range, AND we also ensure all the missing CIDs are unreachable (without providers, so the tests will always fail/time out on over-fetching – this way the generic test suite has no need for counting refs). A rough sketch of this idea follows below.
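
A rough sketch of how that could look, assuming a hypothetical swiss-cheese.car fixture whose root is <fixture-root-cid> (both names are made up for illustration). The daemon stays online, and the request timeout stands in for refs counting:

$ # import a CAR that contains only the blocks needed for the requested range;
$ # every other CID in the DAG has no providers anywhere, so over-fetching can never succeed
$ ipfs dag import swiss-cheese.car

$ # if the gateway needs any block outside the fixture, the request hangs until the timeout and the test fails
$ curl -sf --max-time 30 -o /dev/null -H "Range: bytes=2000-2002" \
    "http://127.0.0.1:8080/ipfs/<fixture-root-cid>/file.bin"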

@hacdias (Member) commented Mar 23, 2023

all the missing CIDs are unreachable

Hmm, checking that 10k items are not present locally can take some time. We also need a list of such CIDs (which can certainly be included in the test). I'm looking into this now.

@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 2 times, most recently from fe8aae6 to fa406cf on March 23, 2023 12:35
@hacdias force-pushed the feat/golibipfs-gateway-refactor branch from fa406cf to d196423 on March 23, 2023 12:55
@hacdias requested a review from lidel March 23, 2023 13:16
@lidel (Member) previously requested changes Mar 24, 2023 and left a comment

Hmm, checking that 10k items are not present locally can take some time. We also need a list of such CIDs (which can certainly be included in the test). I'm looking into this now.

My bad, I'm sorry, I did not mean explicitly checking:

  • If you create a synthetic DAG that is not provided by any peer on the planet, you won't have to manually check anything.
    The CAR would only have a few blocks from that DAG.
  • If the Range request works, it means the fetch was correct (limited to the minimal set of blocks).
    If it hangs and then times out, it means the gateway triggered fetching more blocks than necessary, and the missing blocks are nowhere to be found.

This way you don't need to deal with inspecting the local blockstore/cache or counting refs.
In Kubo, when we run the daemon in offline mode, we get this behavior already, so we only need a positive test.
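
For illustration, a minimal sketch of such a positive test against a local Kubo gateway; range-fixture.car and <fixture-root-cid> are hypothetical placeholders:

$ export IPFS_PATH=$(mktemp -d) && ipfs init --empty-repo
$ ipfs dag import range-fixture.car   # only the blocks needed for the requested range

$ # offline: any attempt to touch a block outside the fixture fails immediately instead of hanging
$ ipfs daemon --offline &
$ sleep 5

$ # a 206 response proves the gateway served the range using only the minimal set of blocks
$ curl -sf -i -H "Range: bytes=2000-2002" "http://127.0.0.1:8080/ipfs/<fixture-root-cid>/file.bin" | grep "206 Partial Content"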

@hacdias mind removing the overly complicated negative ones? (details inline)

Inline review threads (now outdated/resolved) on:
  • test/cli/harness/ipfs.go
  • test/cli/gateway_range_test.go (3 threads)
@hacdias force-pushed the feat/golibipfs-gateway-refactor branch 4 times, most recently from 4cd5c43 to 9d632f8 on March 29, 2023 08:37
@hacdias requested a review from lidel March 29, 2023 08:38
@aschmahmann force-pushed the feat/golibipfs-gateway-refactor branch 3 times, most recently from 25f7abb to 1e3ad9c on March 29, 2023 20:47
aschmahmann and others added 2 commits March 30, 2023 14:14
Co-Authored-By: Marcin Rataj <lidel@lidel.org>
Co-Authored-By: Henrique Dias <hacdias@gmail.com>
@hacdias enabled auto-merge (squash) March 30, 2023 13:12
@hacdias merged commit 353dd49 into master Mar 30, 2023
@hacdias deleted the feat/golibipfs-gateway-refactor branch March 30, 2023 13:22