Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix swr client inconsistencies #289

Merged
merged 6 commits into from
Nov 2, 2023

Conversation

conico974
Copy link
Contributor

This PR aims to solve an issue regarding inconsistencies between rsc (or json) data and html data for ISR.

Until now next set the cache-control header to s-maxage=X, stale-while-revalidate with X being the revalidate value.
This cause inconsistencies between rsc and html data due to cdn caching especially for big revalidate value.

For example with revalidate=3600, rsc and html could be out of sync by a maximum of 3599 seconds. It is assuming someone accessed the html at t=0 and no one accessed the rsc until t=3599.
This behaviour is exacerbated with app dir since every <Link /> pointing to the same route from a different route will result in a different rsc call and thus a different cdn cache entry.

This PR also fix the 5 min deduplication window of the SQS queue.

@changeset-bot
Copy link

changeset-bot bot commented Oct 18, 2023

🦋 Changeset detected

Latest commit: d6a10af

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
open-next Minor
app-pages-router Patch
app-router Patch
tests-unit Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link

vercel bot commented Oct 18, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
open-next ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 2, 2023 6:47pm

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

Thx, I'll try this soon. My ISR is completely broken - it gets stuck w/ s-maxage=2
I don't know why it's stuck and failing to revalidate. I have a test app w/ the exact same code and it works there.
The only difference is the test app is accessed through the CF url vs a custom domain.

export const revalidate = 300

const ISRPage = async () => {
  const bacon = await fetch('http://worldtimeapi.org/api/timezone/America/Los_Angeles', {
    next: {
      tags: ['bacon']
    }
  })
  const data = await bacon.json()
  return (
    <hgroup>
      <h1>ISR Page</h1>
      <h3>{`Last generated at : ${data.utc_datetime}`}</h3>
    </hgroup>
  )
}

export default ISRPage

@conico974
Copy link
Contributor Author

Thx, I'll try this soon. My ISR is completely broken - it gets stuck w/ s-maxage=2 I don't know why it's stuck and failing to revalidate. I have a test app w/ the exact same code and it works there. The only difference is the test app is accessed through the CF url vs a custom domain.

export const revalidate = 300

const ISRPage = async () => {
  const bacon = await fetch('http://worldtimeapi.org/api/timezone/America/Los_Angeles', {
    next: {
      tags: ['bacon']
    }
  })
  const data = await bacon.json()
  return (
    <hgroup>
      <h1>ISR Page</h1>
      <h3>{`Last generated at : ${data.utc_datetime}`}</h3>
    </hgroup>
  )
}

export default ISRPage

That's weird, i'm wondering if the fetch cache is not screwing things here.
Could you try moving the revalidate into fetch directly.

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

Yea, I'll try that next. Right now I'm deploying the test app through a custom domain to parity the broken one.
The initial ISR does have the correct revalidate time, it's when the time expires and I reload, then it gets stuck w/ 2s

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

The custom domain isn't the problem...

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

I deleted the _cache files (714.3 MB worth of cached data, @conico974 we should clean up old cache after each deployment) and my revalidation worked... so it looks like the S3 lookup is failing or doing something strange.

@conico974
Copy link
Contributor Author

I deleted the _cache files (714.3 MB worth of cached data, @conico974 we should clean up old cache after each deployment) and my revalidation worked... so it looks like the S3 lookup is failing or doing something strange.

I don't understand how this could happen, but that's definitely something that we need to do.
I'll try to take a look tomorrow to see if I can find something about the old data.

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

I added isr to my middleware matcher to exclude the /isr page and the revalidate works... which is strange b/c for the /isr path, it just returns NextResponse.next({})

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

I undid my changes and it's working all of a sudden... nothing changed... very weird edge case which I don't know how to reproduce.

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

Wait I am able to reproduce, /isr that goes through middleware doesn't revalidate. But when I exclude it from middleware, it works.

@khuezy
Copy link
Contributor

khuezy commented Oct 18, 2023

I think I'm able to "fix" it... sometimes my /isr page is return 307 instead of 200 for some reason... when I hard code the res.statusCode = 200 //result.response.status, then ISR works...
Why is the middleware run method returning 307?

One thing I noticed w/ this PR is that b/c lastModified is set to the last valid: globalThis.lastModified, if I have 2 ISR pages, then the 2nd one that gets revalidated uses the 1st's time... not the current time... is that intentional?

@conico974
Copy link
Contributor Author

conico974 commented Oct 19, 2023

I think I'm able to "fix" it... sometimes my /isr page is return 307 instead of 200 for some reason... when I hard code the res.statusCode = 200 //result.response.status, then ISR works... Why is the middleware run method returning 307?

It should not. You could try adding some log before and after calling middleware to see if the 307 is coming from the middleware. It's either caught by some other part of your middleware, or it's a bug in the middleware. Maybe we need to handle the redirects in the revalidation function

One thing I noticed w/ this PR is that b/c lastModified is set to the last valid: globalThis.lastModified, if I have 2 ISR pages, then the 2nd one that gets revalidated uses the 1st's time... not the current time... is that intentional?

I don't see how this could happen, the globalThis.lastModified is used in 2 places, for the SQS queue and for the fixIsrHeaders. In those cases, it should be a cache HIT or STALE so the globalThis.lastModified should be the correct one
Ok i've missed something, we should put this under a if (headers["x-nextjs-cache"] === "STALE"

@khuezy
Copy link
Contributor

khuezy commented Oct 19, 2023

I think I'm able to "fix" it... sometimes my /isr page is return 307 instead of 200 for some reason... when I hard code the res.statusCode = 200 //result.response.status, then ISR works... Why is the middleware run method returning 307?

It should not. You could try adding some log before and after calling middleware to see if the 307 is coming from the middleware. It's either caught by some other part of your middleware, or it's a bug in the middleware. Maybe we need to handle the redirects in the revalidation function

One thing I noticed w/ this PR is that b/c lastModified is set to the last valid: globalThis.lastModified, if I have 2 ISR pages, then the 2nd one that gets revalidated uses the 1st's time... not the current time... is that intentional?

I don't see how this could happen, the globalThis.lastModified is used in 2 places, for the SQS queue and for the fixIsrHeaders. In those cases, it should be a cache HIT or STALE so the globalThis.lastModified should be the correct one Ok i've missed something, we should put this under a if (headers["x-nextjs-cache"] === "STALE"

It was returning 307 for the initial response, then afterwards, each reload to /isr returns 200. Since the 307 is saved, revalidation fails b/c it expects 200... We're not returning 307 anywhere in our code so it looks like a Nextjs issue. Let me see if I can pinpoint the 307 in their repo.

Can we detect 307 and set it as 200?

@conico974
Copy link
Contributor Author

It was returning 307 for the initial response, then afterwards, each reload to /isr returns 200. Since the 307 is saved, revalidation fails b/c it expects 200... We're not returning 307 anywhere in our code so it looks like a Nextjs issue. Let me see if I can pinpoint the 307 in their repo.

The only place where we could return a 307 in our code is during next-config redirect (but then it should not reach further) or inside middleware. NextResponse.redirect is also doing a 307 i think.

Can we detect 307 and set it as 200?

I don't think we could since you could decide to return a 307 from middleware.
What's the location headers of those 307 redirect ?

@khuezy
Copy link
Contributor

khuezy commented Oct 19, 2023

I see, handleRewrites potentially returns 307 from the routes-manifest... but I checked mine and there's only 308. Where is the 307 coming from? In my middleware, I'm only returning NextResponse.next(), could that somehow be 307?

@khuezy
Copy link
Contributor

khuezy commented Oct 19, 2023

It was returning 307 for the initial response, then afterwards, each reload to /isr returns 200. Since the 307 is saved, revalidation fails b/c it expects 200... We're not returning 307 anywhere in our code so it looks like a Nextjs issue. Let me see if I can pinpoint the 307 in their repo.

The only place where we could return a 307 in our code is during next-config redirect (but then it should not reach further) or inside middleware. NextResponse.redirect is also doing a 307 i think.

Can we detect 307 and set it as 200?

I don't think we could since you could decide to return a 307 from middleware. What's the location headers of those 307 redirect ?

Thanks, this helped me see the problem. I'm calling next-auth's getToken method and that fails on the first load, which does call NextResponse.redirect as you've said. This isn't an open-next issue, I need to see why nextauth is failing to get the token on the initial request, the redirected event is able to get the token though.

edit: it isn't a nextauth issue, the cookies are coming in empty on the initial request

@conico974
Copy link
Contributor Author

edit: it isn't a nextauth issue, the cookies are coming in empty on the initial request

Haven't we already solved that issue ?
I'm wondering, do we need to bypass next-auth anyway for ISR pages?
The middleware will only run the first time for the end user, the rest of the time, it will run only for the revalidation function or the cloudfront SWR request.
At least we should add something in the doc about it

@khuezy
Copy link
Contributor

khuezy commented Oct 19, 2023

We fixed a related issue with the responseStreaming and set-cookie but I'm not sure if we handled this case. But I think you're right about static and middleware... does it make sense to run static content through middleware? If not, then open-next should check and skip the middleware handler.

@khuezy
Copy link
Contributor

khuezy commented Oct 19, 2023

I added a bunch of logs and the convertFromAPIGatewayProxyEventV2 won't have the event.cookies when the requestContext.http.method is HEAD... should we cache HEAD methods?
Or rather, should we run middleware for HEAD?

if (method === 'HEAD') return internalEvent in handleMiddleware fixes it if HEAD isn't suppose to run in middleware.

I ran next dev and HEAD does go through middleware... can we skip the cache.set if it's HEAD?

Maybe we can add method !== 'HEAD' here: https://github.com/sst/open-next/blob/main/packages/open-next/src/adapters/routing/middleware.ts#L121

Or maybe the proper fix is to have my middleware call NextResponse.next() when the method is HEAD

Copy link
Contributor

@khuezy khuezy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lastModified properly works w/ the 2nd isr call now, thanks!

@conico974
Copy link
Contributor Author

I added a bunch of logs and the convertFromAPIGatewayProxyEventV2 won't have the event.cookies when the requestContext.http.method is HEAD... should we cache HEAD methods? Or rather, should we run middleware for HEAD?

if (method === 'HEAD') return internalEvent in handleMiddleware fixes it if HEAD isn't suppose to run in middleware.

Not entirely sure about that, people's api route could use HEAD and may also need to go through middleware.

I ran next dev and HEAD does go through middleware... can we skip the cache.set if it's HEAD?

Maybe we can add method !== 'HEAD' here: https://github.com/sst/open-next/blob/main/packages/open-next/src/adapters/routing/middleware.ts#L121

Or maybe the proper fix is to have my middleware call NextResponse.next() when the method is HEAD

There is one easy fix for this one. We could use the x-isr or x-prerender-revalidate header to bypass the middleware.

The x-isr is used only for our implementation of ISR and the x-prerender-revalidate is also used for res.revalidate.
If we decide to bypass the middleware, my take on this is that we should use x-isr so that it only mess with our call.
Someone might want to directly call the server function with HEAD and x-prerender-revalidate and want to handle auth on the middleware.

@khuezy
Copy link
Contributor

khuezy commented Oct 20, 2023

That's be great to include.

@conico974
Copy link
Contributor Author

@khuezy Have you tried the middleware bypass ? If it works for you, i think we're ready to release 2.3

@khuezy
Copy link
Contributor

khuezy commented Nov 2, 2023

Let me retest

@khuezy
Copy link
Contributor

khuezy commented Nov 2, 2023

@conico974 it works, ship it!

@conico974 conico974 merged commit 22e3e47 into opennextjs:main Nov 2, 2023
3 checks passed
@github-actions github-actions bot mentioned this pull request Nov 2, 2023
@conico974 conico974 deleted the fix/swr-client-inconsistencies branch October 29, 2024 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants