Improve contentful rich text experience #24221

Closed
wardpeet opened this issue May 19, 2020 · 22 comments
topic: source-contentful Related to Gatsby's integration with Contentful type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement.

Comments

@wardpeet
Contributor

Summary

I'm not super familiar with Contentful rich text. For now this is just a placeholder issue.

  • rich text causes a full rebuild (I think that's inevitable because it touches code)
  • random error appears
@wardpeet wardpeet added the type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement. label May 19, 2020
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label May 19, 2020
@wardpeet wardpeet added topic: source-contentful Related to Gatsby's integration with Contentful and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels May 19, 2020
@joshduck

joshduck commented May 21, 2020

We’ve seen that when building our site, we frequently encounter out of memory errors when working with Contentful data that has Rich Text fields.

This seems to be a fundamental architectural issue with gatsby-source-contentful plugin. Gatsby does not use Contentful’s GraphQL design where text structure and links are distinct fields that can be queried independently (represented as text and links fields in Contentful GraphQL). Instead it implements Rich Text as a single field which holds a tree with any embedded Contentful nodes eagerly resolved and inlined directly. Because this eager inlining happens recursively, a Rich Text field will contain all the data for anything embedded in it, and all the data embedded in those nodes, etc, etc.

Our content model includes embedded links between blog post nodes for SEO purposes, which means that Gatsby could hypothetically end up eagerly fetching and serialising hundreds of nested nodes in a single rich text field. We have seen that blog posts frequently contain multiple gigabytes of data.

This could be fixed by implementing Rich Text fields in the same way that Contentful does (using two fields). This would allow the consumer to specify which fields they need from embedded nodes. GraphQL, by design, does not allow users to construct queries that would recursively fetch data.
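
To illustrate the difference between the two designs, here is a minimal sketch in plain JavaScript. It is not the plugin's code: the entry shape (`id`, `title`, `embeds`) and the helper names `inlineEagerly`, `splitTextAndLinks` and `countOccurrences` are hypothetical and only serve the comparison.

```javascript
// Hypothetical in-memory content store: each post embeds links to other posts.
const entries = {
  a: { id: 'a', title: 'Post A', embeds: ['b', 'c'] },
  b: { id: 'b', title: 'Post B', embeds: ['c'] },
  c: { id: 'c', title: 'Post C', embeds: [] },
}

// Eager approach: every embedded entry is inlined recursively,
// so a shared entry is copied once per path that reaches it.
function inlineEagerly(id) {
  const entry = entries[id]
  return { title: entry.title, embeds: entry.embeds.map(linkId => inlineEagerly(linkId)) }
}

// Two-field approach: the text structure keeps bare links, and the linked
// entries are exposed separately, each once and only one level deep.
function splitTextAndLinks(id) {
  const entry = entries[id]
  return {
    text: { title: entry.title, embeds: entry.embeds.map(linkId => ({ linkId })) },
    links: entry.embeds.map(linkId => ({ id: linkId, title: entries[linkId].title })),
  }
}

// Counts how often a substring appears in a string.
function countOccurrences(haystack, needle) {
  return haystack.split(needle).length - 1
}

const eagerCopies = countOccurrences(JSON.stringify(inlineEagerly('a')), 'Post C')
const splitCopies = countOccurrences(JSON.stringify(splitTextAndLinks('a')), 'Post C')
console.log({ eagerCopies, splitCopies })
```

In this toy model the eager version already duplicates the shared post, and a cycle (for example `c` embedding `a`) would make `inlineEagerly` recurse until the stack overflows. The split version stays flat regardless of how the entries link to each other.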

@amandasage

Hi @joshduck checking if https://www.contentful.com/developers/docs/tutorials/general/rich-text-and-gatsby/ will help resolve some of the issues you're seeing.

@axe312ger axe312ger self-assigned this May 27, 2020
@joshduck

@amandasage, no. That's what we're using right now.

The problem is that the JSON object referenced in that documentation page (bodyRichText.json) can contain way too much data (gigabytes) and crash the process.

Our fix so far has been to patch gatsby-source-contentful to limit the depth of the recursion.

@amandasage

amandasage commented May 28, 2020

Hey @joshduck I will open a ticket on Contentful's side to look at memory issues + sync for preview

@axe312ger axe312ger changed the title from "Improve contentful richt text experience" to "Improve contentful rich text experience" Jun 10, 2020
@axe312ger
Collaborator

I am actively working on this issue and want to share my current status with the community:

Summary

Complex data models in combination with Contentful rich text can cause issues. The code can become slow, and in the worst case it crashes the build process.

Why does this happen?

The root of the issue is how resolving references works in combination with Contentful Rich Text.

This PR ensures newly referenced entries will be resolvable: #15084

At the moment, the rich text document gets serialized as a string and stored in GraphQL.

As soon as an entry within a rich text node somehow leads to a circular reference pattern, serializing the document to a string can crash the build process.
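
As a general illustration of why cycles and string serialization do not mix, the sketch below uses the built-in `JSON.stringify`, which is not necessarily the serializer the plugin uses. The entries `postA`, `postB` and `rawPostA` are made up for the example.

```javascript
// Two hypothetical entries that reference each other.
const postA = { id: 'a', title: 'Post A' }
const postB = { id: 'b', title: 'Post B', related: postA }
postA.related = postB // closes the cycle: a -> b -> a

// The built-in serializer refuses cyclic structures with a TypeError.
let serializationError = null
try {
  JSON.stringify(postA)
} catch (error) {
  serializationError = error
}
console.log(serializationError instanceof TypeError)

// Keeping the link as a plain ID leaves the document serializable.
const rawPostA = { id: 'a', title: 'Post A', related: { linkId: 'b' } }
const serialized = JSON.stringify(rawPostA)
console.log(serialized)
```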

Fixes that did not work

As a quick win, I tried to replace the serializer with other packages from npm or with Node internals. All led to similar issues.

And even if we fixed it that way, we couldn't be sure which data we would receive in the resolved references, as each of these serializers handles circular references in its own way.

How we can fix it

To fix this issue, we have to change the way Gatsby renders Contentful Rich Text Documents.

As a Gatsby developer, I want to have full control over which data will be resolved, ideally straight in my GraphQL query.

When rendering the Rich Text Field, I just want to pass the Rich Text Document to a function and output the result via JSX.

This is how it may work:

  1. Instead of storing resolved Rich Text Document Data in GraphQL, we will store the raw unresolved document in GraphQL. This will already let our build process pass. But now we only have the raw Contentful Link available when rendering rich text.
  2. Next, we add a references subfield to the rich text GraphQL field. A query could look like:
richTextField {
  references {
    ... on ContentfulTypeA {
      contentful_id # properly required
      title
    }
    ... on ContentfulTypeB {
      contentful_id # properly required
      title
      customField
    }
  }
}
  3. For rendering, we are going to introduce the biggest change. The current best practice is to use the official Rich Text Renderer directly with the GraphQL field result. We will expose a new function from the source plugin, which wraps documentToReactComponents and takes care of resolving the references based on our custom GraphQL query data.

This is what using Contentful Rich Text could look like:

import React from 'react'
import * as propTypes from 'prop-types'
import { graphql } from 'gatsby' // needed for the page query below
import { renderRichText } from 'gatsby-source-contentful'

function PageTemplate({ data }) {
  const { title, richTextField } = data.contentfulPage

  return (
    <div>
      <h1>{title}</h1>
      {renderRichText(richTextField)}
    </div>
  )
}

PageTemplate.propTypes = {
  data: propTypes.object.isRequired,
}

export default PageTemplate

export const pageQuery = graphql`
  query pageQuery($id: String!) {
    contentfulPage(id: { eq: $id }) {
      title
      richTextField {
        references {
          ... on ContentfulTypeA {
            contentful_id
            title
          }
          ... on ContentfulTypeB {
            contentful_id
            title
            customField
          }
        }
      }
    }
  }
`

I'll open a PR soon to share my current status of the code :)
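
The reference resolution described in step 3 could be sketched roughly as follows. This is a plain JavaScript illustration under assumptions, not the code from the PR: the helper `attachReferences` is hypothetical, the document is a trimmed example of Contentful's rich text node shape, and `references` stands for what the GraphQL query above might return.

```javascript
// Raw, unresolved document as it would be stored (trimmed example).
const rawDocument = {
  nodeType: 'document',
  content: [
    { nodeType: 'paragraph', content: [{ nodeType: 'text', value: 'Intro' }] },
    {
      nodeType: 'embedded-entry-block',
      content: [],
      data: { target: { sys: { id: 'entry-1', type: 'Link', linkType: 'Entry' } } },
    },
  ],
}

// Assumed result of the `references` subfield: only the queried fields.
const references = [{ contentful_id: 'entry-1', title: 'Type A entry' }]

// Hypothetical helper: returns a copy of the tree in which every link
// target with a known ID is replaced by the queried reference data.
function attachReferences(node, referenceMap) {
  const target = node.data && node.data.target
  const linkId = target && target.sys && target.sys.id
  const resolved = linkId ? referenceMap.get(linkId) : undefined
  const copy = { ...node }
  if (resolved) copy.data = { ...node.data, target: resolved }
  if (node.content) copy.content = node.content.map(child => attachReferences(child, referenceMap))
  return copy
}

const referenceMap = new Map(references.map(ref => [ref.contentful_id, ref]))
const resolvedDocument = attachReferences(rawDocument, referenceMap)
console.log(resolvedDocument.content[1].data.target.title)
```

Because the lookup is keyed by `contentful_id`, each referenced entry is resolved exactly one level deep, with only the fields selected in the query. That is why the ID needs to be part of every fragment.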

@daniellangnet

daniellangnet commented Jun 16, 2020

@joshduck thanks for sharing your insight on this! I'm having the same problem and wondering how you went about patching gatsby-source-contentful to limit the depth of recursion? In case that patch is public somewhere, would greatly appreciate it.

@axe312ger excited about your solution. One other issue you might inadvertently fix is that the resolved entries that are currently part of the serialized JSON do not include any incoming link references. For example, a model Subcategory would normally contain a field lc_category, which is an array of all the Category entries that contain a reference to this subcategory. My hope is that with your approach of resolving the referenced entries outside the JSON structure, and being able to define a GraphQL query to decide which fields to resolve, it will also be possible to get these incoming links.

@me4502
Contributor

me4502 commented Jun 16, 2020

@daniellangnet

@joshduck thanks for sharing your insight on this! I'm having the same problem and wondering how you went about patching gatsby-source-contentful to limit the depth of recursion? In case that patch is public somewhere, would greatly appreciate it.

Our patch is quite heavily intertwined with a few other patches making it hard to publish, but the main thing that we do is to strip out anything beyond a certain depth within the prepareRichTextNode function in normalize.js within gatsby-source-contentful. You'll have to experiment with the exact number as what it requires can vary per site/data type.
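
A depth limit of this kind could be sketched as below. This is not the actual patch, which was never published. The helper `pruneBeyondDepth` and the sample data are hypothetical, and where such a helper would hook into prepareRichTextNode is left open.

```javascript
// Hypothetical helper: copies a resolved tree and drops every object or
// array nested deeper than maxDepth levels.
function pruneBeyondDepth(value, maxDepth) {
  if (value === null || typeof value !== 'object') return value // primitives are kept
  if (maxDepth <= 0) return undefined // cut the branch here
  if (Array.isArray(value)) {
    return value
      .map(item => pruneBeyondDepth(item, maxDepth - 1))
      .filter(item => item !== undefined)
  }
  const result = {}
  for (const [key, child] of Object.entries(value)) {
    const pruned = pruneBeyondDepth(child, maxDepth - 1)
    if (pruned !== undefined) result[key] = pruned
  }
  return result
}

const post = {
  title: 'A',
  embedded: { title: 'B', embedded: { title: 'C', embedded: { title: 'D' } } },
}
const limited = pruneBeyondDepth(post, 2)
console.log(JSON.stringify(limited))
```

Since the depth counter always reaches zero, the traversal also terminates on circular data. The cost is that anything below the cut-off is silently missing at render time, which is presumably why the right number varies per site.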

@daydream05
Contributor

daydream05 commented Jun 16, 2020

One of the biggest issues that a lot of us have encountered with Rich Text and Gatsby is that there is no easy way to use Gatsby Image inside rich text. Since rich text only returns the image object and we cannot query for the fluid/fixed props that we're used to when working with Contentful images, we had to resort to hacky workarounds.

The current solutions in the wild are:

  1. Querying all contentful assets and filtering inside the react to rich text renderer. (source)
  2. Grabbing the images from rich text and using those images as filters when querying all Contentful assets. (source)

The former causes performance issues, while the latter requires a lot of boilerplate code.
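
For context, the first workaround boils down to a lookup like the one sketched here. The data shapes are assumptions for illustration: `allAssets` stands for the result of a query over all Contentful assets, the embedded node is assumed to carry the asset ID under `data.target.sys.id`, and `findFluidForAsset` is a hypothetical helper.

```javascript
// Assumed result of querying every Contentful asset with its fluid props.
const allAssets = [
  { contentful_id: 'img-1', fluid: { src: '/img-1.jpg', aspectRatio: 1.5 } },
  { contentful_id: 'img-2', fluid: { src: '/img-2.jpg', aspectRatio: 1 } },
]

// Embedded asset node as it might appear inside the rich text JSON.
const embeddedAssetNode = {
  nodeType: 'embedded-asset-block',
  data: { target: { sys: { id: 'img-2' } } },
}

// Hypothetical helper: scan the full asset list for the embedded asset's ID.
function findFluidForAsset(node, assets) {
  const assetId = node.data.target.sys.id
  const match = assets.find(asset => asset.contentful_id === assetId)
  return match ? match.fluid : null
}

const fluidProps = findFluidForAsset(embeddedAssetNode, allAssets)
console.log(fluidProps)
```

Every page that renders rich text has to receive the complete asset list for this to work, even if it embeds a single image.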

I found that there's actually a simpler solution which uses the APIs inside gatsby-source-contentful to generate fluid or fixed props without having to use GraphQL.

I have successfully used it in production and so far haven't had any issues.

I created a PR for it, but here's what it would look like:

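import React from 'react'
import GatsbyImage from 'gatsby-image'
import { BLOCKS } from '@contentful/rich-text-types'
import { resolveFluid } from 'gatsby-source-contentful'

// Options object to pass to the rich text renderer
const options = {
    renderNode: {
        [BLOCKS.EMBEDDED_ASSET]: (node) => {
            // The embedded asset is available on the node's data.target
            const { file, title } = node.data.target.fields
            const image = {
                file: file['en-US']
            }
            // Generate fluid props without going through GraphQL
            const fluidProps = resolveFluid(image, { maxWidth: 600 })
            return <GatsbyImage fluid={fluidProps} alt={title['en-US']} />
        }
    }
}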

Here's the gist if you want to implement it now.

@axe312ger
Collaborator

The PR I was talking about above: #24905

axe312ger added a commit that referenced this issue Jun 30, 2020
* Allow circular references
* Improve performance and reduce RAM footprint
* Query referenced entries and assets via GraphQL

fixes #24221

BREAKING CHANGE:

* Entity references in Rich Text fields are no longer automatically resolved
* Use the `raw` subfield instead of `json`
* Use GraphQL to define your referenced data with the new `references` field
* Removes `resolveFieldLocales`, as the new `references` field automatically resolves locales
* To render Rich Text fields, use the new `renderRichText()` function from `gatsby-source-contentful/rich-text`
axe312ger added a commit that referenced this issue Jul 8, 2020
axe312ger added a commit that referenced this issue Jul 17, 2020
@disintegrator
Contributor

For rendering, we are going to introduce the biggest change. The current best practice is to use the official Rich Text Renderer directly with the GraphQL field result. We expose a new function from the source plugin, which will wrap documentToReactComponents and takes care of resolving the references based on our custom GraphQL Query data.

@axe312ger this seems problematic if we want to render the rich text differently for various use cases. As an example, rich text used as the body of a blog post will use different React components than rich text used in the description area of a card widget. Currently, code consuming rich text can pass render customizations directly to documentToReactComponents. We have a RichText component today with the following API:

import React from "react";
import {
  documentToReactComponents,
  Options,
} from "@contentful/rich-text-react-renderer";
import { Document } from "@contentful/rich-text-types";

export interface RichTextProps {
  document: Document;
  renderMark?: Options["renderMark"];
  renderNode?: Options["renderNode"];
  renderText?: Options["renderText"];
}

const RichText: React.FC<RichTextProps> = ({ document, ...options }) => {
  // calls documentToReactComponents with any options passed into props
  return <>{documentToReactComponents(document, options)}</>;
};

Question: How do you suggest we apply customizations to rich text rendering in those cases?

@axe312ger
Collaborator

axe312ger commented Jul 23, 2020

@disintegrator we just released an update that covers your suggestions as well. See the code at:

#25249

@disintegrator
Contributor

disintegrator commented Jul 28, 2020

Thanks for the update @axe312ger

For anyone who comes across memory issues with gatsby-source-contentful: this is likely because of how rich text is handled. In our Gatsby project the only property we use from rich text fields is json. For example, we typically write this:

{
	contentfulBlogPost(id: {eq: $id}) {
		title
		body {
			json
		}
	}
}

A lot of the work that Gatsby does around rich text goes into inferring the type for the content field. This is a deeply nested object that corresponds to the parsed rich text AST from the Contentful API. It seems overkill that the parsed object is attached to rich text nodes, and this manifests as an ever longer bootstrap phase and high memory usage when building Gatsby projects as you create more content in the CMS. Additionally, each rich text field gets its own GraphQL type, which also seemed unnecessary for us. We use gatsby-plugin-schema-snapshot to persist the GraphQL types generated by gatsby-source-contentful, and I noticed that ~4200 (!) of the ~4500 inferred GraphQL types were related to rich text! The json field is not inferred; it's a schema extension with a resolver function that gatsby-source-contentful adds.
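
A count like this can be taken from a schema snapshot with a few lines of JavaScript. The sketch below is only an illustration: the SDL excerpt is invented, the type name containing "RichTextNode" is an assumption about the plugin's naming, and `countTypes` is a hypothetical helper.

```javascript
// Invented excerpt of a schema snapshot; a real file would be read from disk.
const schemaSnapshot = `
type ContentfulArticle implements Node { title: String }
type contentfulArticleContentRichTextNode implements Node { json: JSON }
type contentfulArticleContentRichTextNodeContent { nodeType: String }
type ContentfulAsset implements Node { title: String }
`

// Hypothetical helper: counts all type definitions and those whose
// name matches the given pattern.
function countTypes(sdl, namePattern) {
  const typeNames = [...sdl.matchAll(/^type\s+(\w+)/gm)].map(match => match[1])
  return {
    total: typeNames.length,
    matching: typeNames.filter(name => namePattern.test(name)).length,
  }
}

const counts = countTypes(schemaSnapshot, /RichTextNode/)
console.log(counts)
```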

All in all, there is a lot of wasted work and memory going into data we never look at in our project.

Our solution

We use patch-package to modify gatsby-source-contentful like so:

Below, I've attached the patch-package patch file that targets gatsby-source-contentful @ 2.3.32 with the changes I listed. I hope it helps you overcome slow builds and OOM errors until the upcoming rebuild of gatsby-source-contentful! 😄

gatsby-source-contentful+2.3.32.patch.zip

@axe312ger
Collaborator

@disintegrator did you find a chance to try the new version of gatsby-source-contentful@next with your project?

See: #25249

@disintegrator
Contributor

disintegrator commented Jul 29, 2020

@axe312ger that's next on my list to evaluate. I need a quick solution to my problem based on the released version of the contentful plugin.

axe312ger added a commit that referenced this issue Jul 29, 2020
@daydream05
Contributor

daydream05 commented Jul 31, 2020

Sorry if I missed it. Does this PR #24905 also fix the issue with missing content #10592?

The current workaround is to clear the cache, but since I'm running on Gatsby Cloud, I end up with missing data whenever I make slight changes to a rich text field.

axe312ger added a commit that referenced this issue Aug 5, 2020
@axe312ger
Collaborator

@daydream05 yes please have a look at the new version from #24905 and give feedback :)

@daydream05
Contributor

Started using it for a new client project and everything looks good so far! Great work on this, @axe312ger. I've waited for this rich text update for so long.

One issue, though: loading multiple Contentful spaces in the same Gatsby app seems to be broken.

(The content models for the 2nd space are not loading any attributes or data, just the name of the content model.)

I’ll create an issue tomorrow.

@samsherwood

samsherwood commented Sep 25, 2020

Thank you, @disintegrator

Just wanted to say 'thank you' for attaching this. I've been beating my head against a wall for weeks now trying to track down a solution.


I spoke too soon. I applied the patch this morning, and while schema generation was much faster, everything dies when trying to visit a page, or right before.

Going to give migration an attempt!

axe312ger added a commit to axe312ger/gatsby that referenced this issue Oct 2, 2020
@bsgreenb

Also want to say thanks to @axe312ger who has been pushing through these issues to build a better Rich Text framework.

@axe312ger
Collaborator

@bsgreenb thanks! you are welcome! :)

We are pushing further, and the release is coming closer :)

axe312ger added a commit to axe312ger/gatsby that referenced this issue Oct 15, 2020
axe312ger added a commit that referenced this issue Oct 16, 2020
axe312ger added a commit that referenced this issue Oct 29, 2020
@pvdz pvdz closed this as completed in a256346 Nov 9, 2020
@alana314

alana314 commented Dec 7, 2020

Hi, thanks for this. We're having trouble querying raw and references.
gatsby version: 2.29.0-next.3
gatsby-source-contentful: 2.29.0
We're getting a couple errors:
Cannot query field "raw" on type "contentfulArticleContentRichTextNode".
Cannot query field "references" on type "contentfulArticleContentRichTextNode".
error ENOENT: no such file or directory, open '.cache/json/_.json'

Also, we're using the plugin gatsby-plugin-snapshot, would it be possible to share what the RichTextNode should look like in schema.gql now with these changes?

@pvdz
Contributor

pvdz commented Dec 7, 2020

@Jordan314 please file a new issue, this one has been merged. But one thing you should try first is to update to the newest version of gatsby-source-contentful (4.2.0).

I'll close this issue as it has been resolved and merged. If you were inclined to post a reply to it please open a new issue and refer to this one. Thank you.

@gatsbyjs gatsbyjs locked and limited conversation to collaborators Dec 7, 2020