Support recipe schema.org metadata from CAPI in LinkedData #26931

Merged
merged 22 commits into from
Mar 18, 2024

cemms1
Contributor

@cemms1 cemms1 commented Feb 27, 2024

Co-authored-by: frederickobrien frederick.obrien@guardian.co.uk

Recipe structured data

Part of guardian/dotcom-rendering#10532, this is the last in a chain of PRs that will get Schema.org markup rendering on the website for recipes.

A schemaOrg field has been added to the content field of CAPI responses (see guardian/content-api-models#237 and https://github.com/guardian/content-api/pull/2858 for more context). What arrives at Frontend is mostly valid Schema.org markup. The exception comes from a quirk of Thrift field naming: Thrift does not allow the @ symbol, so '_atType' and '_atContext' need to be turned into '@type' and '@context'.
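As an illustration of the renaming step only, here is a minimal sketch in Python rather than the Scala used in Frontend. It assumes the recipe data has already been parsed into plain dicts and lists; the function name `restore_jsonld_keys` and the sample recipe are hypothetical and not taken from this PR.

```python
from typing import Any

# The two Thrift-safe names come from the PR description; the mapping itself is illustrative.
THRIFT_TO_JSONLD = {"_atType": "@type", "_atContext": "@context"}


def restore_jsonld_keys(node: Any) -> Any:
    """Return a copy of `node` with the Thrift-safe keys renamed at every depth."""
    if isinstance(node, dict):
        return {
            THRIFT_TO_JSONLD.get(key, key): restore_jsonld_keys(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [restore_jsonld_keys(item) for item in node]
    # Strings, numbers, booleans and None pass through unchanged.
    return node


# Hypothetical sample data, shaped like a small Schema.org recipe.
recipe = {
    "_atContext": "http://schema.org",
    "_atType": "Recipe",
    "name": "Example recipe",
    "author": [{"_atType": "Person", "name": "Example author"}],
}
restored = restore_jsonld_keys(recipe)
```

Because the walk is generic, it needs no knowledge of the recipe schema beyond the two key names, and every other field passes through untouched.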

Once this is done, the schema is converted to a LinkedData object and appended to the more generic article schema (or linked data, as it's labelled here) that is already being generated. The result is passed on to Dotcom as before. Tests have been added to ensure the data is transformed as expected.
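The appending step can be pictured roughly as follows. This is a Python sketch under stated assumptions, not the Frontend code: `build_linked_data` is a hypothetical name, and the article and recipe objects are made-up samples. Passing `None` stands in for an article that has no recipe schema.

```python
from typing import Any, Dict, List, Optional

LinkedDataItem = Dict[str, Any]


def build_linked_data(
    article: LinkedDataItem,
    recipes: Optional[List[LinkedDataItem]],
) -> List[LinkedDataItem]:
    """Keep the generic article entry first, then append any recipe entries."""
    return [article] + list(recipes or [])


# Hypothetical sample data.
article = {"@type": "NewsArticle", "headline": "Example article"}
recipes = [
    {"@context": "http://schema.org", "@type": "Recipe", "name": "Example recipe"}
]

combined = build_linked_data(article, recipes)
without_recipes = build_linked_data(article, None)
```

Articles without recipes keep exactly the linked data they had before, which is why nothing downstream has to change.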

One upside of this approach is that we don't have to touch Dotcom, though an open question remains about where the most appropriate place to transform CAPI data into Schema.org markup is. Frontend? CAPI itself? There's a bit of both now. With time, and hopefully once the SEO value of this is established, it might be worth centralising the Schema.org/linked open data generation.

Most of the changed files - in data/database - are due to an unrelated issue concerning (non)generated data files.

To test, deploy this branch to CODE and use https://validator.schema.org/ to check whether valid schema is being included in the ld+json blobs of articles containing recipe elements. #26931 (comment) contains a few examples, though there are hundreds by now.
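For a quick local look before pasting into the validator, the ld+json blobs can be pulled out of a page's HTML. The sketch below uses only the Python standard library; the class name `LdJsonExtractor` and the sample HTML are hypothetical, and it assumes each blob is valid JSON.

```python
import json
from html.parser import HTMLParser


class LdJsonExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_ld_json = False
        self._buffer = []
        self.blobs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld_json = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents may arrive in several chunks, so buffer them.
        if self._in_ld_json:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld_json:
            self.blobs.append(json.loads("".join(self._buffer)))
            self._in_ld_json = False


# Hypothetical page source; a real check would use the HTML of an article on CODE.
sample_html = """
<html><head>
<script type="application/ld+json">[{"@type": "NewsArticle", "headline": "Example"}, {"@context": "http://schema.org", "@type": "Recipe", "name": "Example recipe"}]</script>
<script>var unrelated = 1;</script>
</head></html>
"""

parser = LdJsonExtractor()
parser.feed(sample_html)
```

This only extracts the blobs; whether they are valid Schema.org markup is still a question for the validator.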

@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch from 82b6df7 to 5ecae05 Compare February 27, 2024 15:46
@cemms1 cemms1 changed the base branch from main to sponsorship-package February 28, 2024 15:23
@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch from 5ecae05 to d42f52d Compare February 28, 2024 15:23
@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch from dc0477c to 803f1d1 Compare February 28, 2024 16:29
Base automatically changed from sponsorship-package to main February 29, 2024 13:15
@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch from 2346feb to ece9562 Compare March 1, 2024 11:30
@cemms1
Contributor Author

cemms1 commented Mar 1, 2024

Update of things still outstanding:

Ideally we'd also get a test file working to check the logic for returning recipe schema org data.

A possible improvement at this point might be to decode and encode the JSON object from schemaOrg.recipe rather than to stringify it and parse again.

@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch 2 times, most recently from 4c30bae to 1aa82d6 Compare March 6, 2024 12:27
@cemms1
Contributor Author

cemms1 commented Mar 7, 2024

Where we've got to:

  • We can recognise when an article has a Some schemaOrg.recipe field
  • We need to convert every _atType and every _atContext within the SchemaRecipe data structure to @type and @context, because Thrift does not allow the @ symbol.
    • One way of doing this might be to use Play JSON, but that would mean fully defining our output type (?) in order to define the Json.reads and Json.writes functions. This is not ideal, because we want to pass the entire model response straight through and change only these two fields throughout the object.
    • It would be useful to hear from any Scala experts whether this is possible before we spend too much time trying various methods.
    • If it will take too much effort/time, or require a high level of schema definition within the frontend repo, it would be worth evaluating whether CAPI is really the right place for this data, given that frontend is the only real "client" of it.

@frederickobrien
Contributor

frederickobrien commented Mar 15, 2024

Testing this in CODE and it's looking really good: valid schema is coming through for a selection of single and multi-recipe articles.

Using https://validator.schema.org/ to confirm the ld+json blob is valid. No errors or warnings found as yet.

@frederickobrien frederickobrien changed the title [WIP] Linked data for recipes Linked data for recipes Mar 15, 2024
@frederickobrien frederickobrien marked this pull request as ready for review March 15, 2024 16:41
@frederickobrien frederickobrien requested a review from a team as a code owner March 15, 2024 16:41
@cemms1 cemms1 force-pushed the add-recipe-schema-org-metadata branch from 218c86c to 5d98055 Compare March 18, 2024 10:24
@cemms1 cemms1 changed the title Linked data for recipes Support recipe schema.org metadata from CAPI in LinkedData Mar 18, 2024
@cemms1 cemms1 linked an issue Mar 18, 2024 that may be closed by this pull request
@cemms1 cemms1 merged commit 213c2bc into main Mar 18, 2024
3 checks passed
@cemms1 cemms1 deleted the add-recipe-schema-org-metadata branch March 18, 2024 11:35
@prout-bot
Collaborator

Seen on ADMIN-PROD (merged by @cemms1 12 minutes and 13 seconds ago)

@prout-bot
Collaborator

Seen on FRONTS-PROD (merged by @cemms1 14 minutes and 23 seconds ago)

@rtyley
Member

rtyley commented Mar 20, 2024

I'm not sure if this is important, but I wanted to share that I received this warning this morning from Google Search Console ("New Recipes structured data issues detected"):

Top critical issues* : Missing field 'name'

It looks like this is actually only on one URL:

https://www.theguardian.com/food/2021/mar/21/joe-trivelli-recipes-asparagus-and-potato-schiacciata-cherry-ice-cream-sandwiches
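A check along these lines could flag such recipes before Search Console does. This is a hedged Python sketch: `missing_recipe_fields` and the sample data are hypothetical, and the required-field list contains only 'name' because that is the one field named in the warning above.

```python
from typing import Any, Iterator, List, Sequence


def iter_nodes(node: Any) -> Iterator[dict]:
    """Yield every dict found anywhere inside a parsed JSON-LD value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


def is_recipe(node: dict) -> bool:
    # "@type" may be a single string or a list of strings.
    node_type = node.get("@type")
    types = node_type if isinstance(node_type, list) else [node_type]
    return "Recipe" in types


def missing_recipe_fields(blob: Any, required: Sequence[str] = ("name",)) -> List[List[str]]:
    """Return one list of absent or empty required fields per incomplete Recipe node."""
    problems = []
    for node in iter_nodes(blob):
        if is_recipe(node):
            absent = [field for field in required if not node.get(field)]
            if absent:
                problems.append(absent)
    return problems


# Hypothetical linked data: the second recipe has no 'name'.
linked_data = [
    {"@type": "NewsArticle", "headline": "Example article"},
    {"@type": "Recipe", "name": "Example recipe"},
    {"@type": "Recipe", "recipeIngredient": ["asparagus"]},
]
problems = missing_recipe_fields(linked_data)
```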
