Algolia includes result from different locales but redirecting shows 404 #5880

Open
6 of 7 tasks
code-masala opened this issue Nov 5, 2021 · 27 comments · Fixed by #6407
Labels
bug (An error in the Docusaurus core causing instability or issues with its execution) · domain: search (Related to the search feature, usually Algolia)

Comments

@code-masala

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

https://domain.com/ur/hello
When Algolia search returns this route and I navigate to it from the search results, I get a 404. When I refresh the page on the same route, it works.

Steps to reproduce

Step 1 - Open the Algolia search bar
Step 2 - Type a search query
Step 3 - Click a result

Expected behavior

When I click a search result, I should be taken to the corresponding page.

Actual behavior

It gives a 404 first. If I manually refresh, it works fine.

Your environment

  • Public source code:
  • Public site URL:
  • Docusaurus version used:
  • Environment name and version (e.g. Chrome 89, Node.js 16.4):
  • Operating system and version (e.g. Ubuntu 20.04.2 LTS):

Reproducible demo

No response

Self-service

  • I'd be willing to fix this bug myself.
@code-masala code-masala added the labels bug (An error in the Docusaurus core causing instability or issues with its execution) and status: needs triage (This issue has not been triaged by maintainers) Nov 5, 2021
@Josh-Cena
Collaborator

Can't reproduce. Can you try reproducing this on the Docusaurus site? https://docusaurus.io/

I tried with https://docusaurus.io/zh-CN/ and Algolia worked correctly. It could be a problem with the page itself instead of Algolia—is it a doc page, or a custom page?

@Josh-Cena Josh-Cena added the label status: needs more information (There is not enough information to take action on the issue.) and removed the label status: needs triage (This issue has not been triaged by maintainers) Nov 5, 2021
@code-masala
Author

{
  "index_name": "sample",
  "start_urls": [
    "https://domain.com/"
  ],
  "sitemap_urls": [
    "https://domain.com/sitemap.xml"
  ],
  "sitemap_alternate_links": true,
  "stop_urls": [
    "/tests"
  ],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  },
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": [
      "language",
      "version",
      "type",
      "docusaurus_tag"
    ],
    "attributesToRetrieve": [
      "hierarchy",
      "content",
      "anchor",
      "url",
      "url_without_anchor",
      "type"
    ]
  },
  "conversation_id": [
    "833762294"
  ],
  "nb_hits": 46250
}

@code-masala
Author

I want to search based on language selected like in https://docusaurus.io/

@Josh-Cena
Collaborator

@code-masala That query payload is far from enough for me to figure out what's wrong. Do you have a reproducible demo? A published site?

@code-masala
Author

code-masala commented Nov 7, 2021

@Josh-Cena
What I want: if the selected language is French (fr), then only French data should be fetched from Algolia.

@Josh-Cena
Collaborator

image

This should already be the case

@code-masala
Author

@Josh-Cena can you help me figure out what change I have to make in config.json?
I have tried, but I could not find this explained anywhere online.

@Josh-Cena
Collaborator

As I said, I can't help you much without having a site to look at. I've never observed the behavior you described.

@slorber
Collaborator

slorber commented Nov 10, 2021

It gives a 404 first. If I manually refresh, it works fine.

It's not clear what you mean here, please show at least a screenshot

@code-masala we can't really help on this without inspecting your real live site URL and your algolia index config.

We'll re-open once it's provided.

@slorber slorber closed this as completed Nov 10, 2021
@casionone

casionone commented Jan 18, 2022

I have the same problem.
Website: https://linkis.staged.apache.org/
The first time I click a search result, I get a 404; after a refresh in Chrome, it works.
The url is https://linkis.staged.apache.org/zh-CN/community/how-to-contribute/#12-%E5%8A%9F%E8%83%BD%E4%BA%A4%E6%B5%81%E5%AE%9E%E7%8E%B0%E9%87%8D%E6%9E%84

image

@Josh-Cena
Collaborator

Test

@casionone Weirdly, I can't reproduce it at all...

@casionone

760d14dc-4a40-4e23-8698-931579541342.mp4

@Josh-Cena

@Josh-Cena
Collaborator

Thanks. This is because Algolia uses an SPA redirect but the en locale and zh-Hans locale are two different SPAs.

@Josh-Cena Josh-Cena reopened this Jan 18, 2022
@Josh-Cena Josh-Cena removed the label status: needs more information (There is not enough information to take action on the issue.) Jan 18, 2022
@Josh-Cena Josh-Cena changed the title from "Algolia with multiple language route give 404 have to refresh then work" to "Algolia includes result from different locales but redirecting shows 404" Jan 18, 2022
@slorber
Collaborator

slorber commented Jan 19, 2022

@casionone the issue is that your English site is presenting Chinese search results.

This also seems to be the original problem:

What I want: if the selected language is French (fr), then only French data should be fetched from Algolia.


We can see in the network tab that your site is not sending any facetFilter:

image

You should enable contextualSearch: true. This adds the relevant "filters" to the search queries that are sent, including a filter on the current language.

https://docusaurus.io/docs/search#contextual-search
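
For reference, a minimal sketch of the relevant part of docusaurus.config.js, written here as a plain TypeScript object. The appId, apiKey and indexName values are placeholders. The languageFacetFilter helper is hypothetical: it only illustrates the kind of language:<locale> facet filter one would expect to see in the network tab once contextual search is on, and is not Docusaurus' actual implementation.

```typescript
// Placeholder credentials: replace with the values for your own index.
const themeConfig = {
  algolia: {
    appId: "YOUR_APP_ID",
    apiKey: "YOUR_SEARCH_API_KEY",
    indexName: "YOUR_INDEX_NAME",
    // Ask the theme to add locale/version filters to every search query.
    contextualSearch: true,
  },
};

// Hypothetical helper, for illustration only: the shape of the language
// facet filter expected in the request payload for a given locale.
function languageFacetFilter(locale: string): string {
  return `language:${locale}`;
}
```

The filter can only work if the index exposes language as a facet, which is why the crawler configs in this thread list it under attributesForFaceting.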


It's a good time to enable this feature by default, will do that in #6407

@slorber
Collaborator

slorber commented Jan 19, 2022

However, we still need to fix some i18n edge cases, because the "Recent" search results are shared across languages, and we can still get a 404 in some situations (like clicking on a recent Chinese search result while on the English site).

I can reproduce this on the Docusaurus prod site, and contextualSearch won't fix it.

@Josh-Cena Josh-Cena linked a pull request Jan 25, 2022 that will close this issue
@slorber
Collaborator

slorber commented Jan 26, 2022

Re-opening because there are still edge cases to fix, see my comment above

@slorber slorber reopened this Jan 26, 2022
@Josh-Cena Josh-Cena added the label domain: search (Related to the search feature, usually Algolia) Mar 29, 2022
@sergeyol

sergeyol commented Jan 12, 2023

Hi @slorber, @Josh-Cena, I have the same problem as @casionone.

We have intentionally disabled contextualSearch to be able to search across languages, and now we get "Page not found" when selecting a search result from a different locale.

It is possible to try it here: https://orange-field-06c9e2f03.azurestaticapps.net/

Is there any quick workaround possible? Maybe inject something like "pathname://" to search results?

When doing the search from an already localized page, it is even worse, as it appends a second locale to the URL:

https://orange-field-06c9e2f03.azurestaticapps.net/en-US/uk-UA/docs/hrp/doc-approval-workflow

I found issue #4723 and PR #6731 regarding this, but the PR has been unfinished for a long time.

@slorber
Collaborator

slorber commented Jan 18, 2023

@sergeyol by default we assume search results are part of the current single-page application, and are navigated using history.push("/newPath").

When using i18n, each locale is a different SPA site and we should use window.location.href instead of history.push so that we can transition from one SPA to another. Unfortunately we can't seamlessly transition from one localized site to another with the more dynamic SPA navigation.
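
A minimal sketch of that distinction, assuming a hypothetical navigateToHit helper (not part of Docusaurus). The HistoryLike and LocationLike interfaces stand in for the router history and window.location so the snippet is self-contained, and the isSameSpa predicate is left to the caller.

```typescript
// Minimal stand-ins for the router history and window.location.
interface HistoryLike {
  push(path: string): void;
}
interface LocationLike {
  href: string;
}

// Hypothetical helper: use client-side navigation when the hit belongs to
// the current SPA, and a full page load when it belongs to another one.
function navigateToHit(
  url: string,
  isSameSpa: (url: string) => boolean,
  history: HistoryLike,
  location: LocationLike,
): void {
  if (isSameSpa(url)) {
    history.push(url); // stays inside the current locale's SPA
  } else {
    location.href = url; // full page load into the other locale's SPA
  }
}
```

In a browser this would be called with the real router history and window.location. The hard part is the predicate, since it has to know which localized site a URL belongs to.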

We have an externalUrlRegexp that maybe could be a solution?

CleanShot 2023-01-18 at 19 40 45@2x
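
A minimal sketch of what that could look like in the algolia theme config, with placeholder credentials and domain.com as a placeholder domain. The isExternal helper is hypothetical and only shows which hit URLs such a pattern would match. It is not how Docusaurus evaluates the option internally.

```typescript
const algolia = {
  appId: "YOUR_APP_ID",
  apiKey: "YOUR_SEARCH_API_KEY",
  indexName: "YOUR_INDEX_NAME",
  // Hits whose URL matches this pattern are meant to be opened through
  // window.location instead of history.push. "domain.com" is a placeholder.
  externalUrlRegexp: "domain\\.com",
};

// Hypothetical check, for illustration: does a hit URL match the pattern?
const isExternal = (url: string): boolean =>
  new RegExp(algolia.externalUrlRegexp).test(url);
```

Matching the site's whole domain treats every hit as external, so every result click becomes a full page load. That trades SPA navigation for correct cross-locale links.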

The upcoming version 2.3 should also allow you to wrap the SearchBar and provide a custom transformItems prop (which receives the search result items) to do more advanced things with JS code. See #8461 (comment)

@sergeyol

Thank you @slorber, indeed, I was not aware of that parameter, and it does the job for me. I've added our whole domain for now. Will also wait for 2.3 version, sounds promising.

@tri-chu

tri-chu commented Apr 26, 2023

Regarding the edge case for contextualSearch mentioned above, I'm still able to reproduce it in version 2.4. However, the docusaurus prod website doesn't seem to have this problem. Is there a PR for this fix that's not released?

@slorber
Collaborator

slorber commented Apr 27, 2023

@tri-chu this issue is quite old and contextualSearch is not new. I don't think we fixed anything related to that recently.

If you have issues it's very difficult for me to help if you don't share your crawler config and your live production URL to see the problem myself.

@slorber
Collaborator

slorber commented Apr 27, 2023

Going to close this because contextualSearch is now enabled by default.

The only remaining edge case ("recent search hit in Chinese while you are on the English site", see #5880 (comment)) is now very unlikely to happen unless the user decides for some reason to disable contextual search, which I wouldn't particularly recommend.

@slorber slorber closed this as not planned Apr 27, 2023
@tri-chu

tri-chu commented Apr 27, 2023

Hi @slorber we're still hitting that problem pretty consistently on docusaurus 2.4 with contextual search turned on on our website here https://www.8thwall.com/docs/

We don't use the new Algolia Crawler but the legacy crawler with this config instead

{
  "index_name": "8thwall-docs-prod",
  "start_urls": [
    "https://www.8thwall.com/docs/"
  ],
  "sitemap_urls": [
    "https://www.8thwall.com/docs/sitemap.xml"
  ],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "text": "article p, article li, article td:last-child"
  },
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": [
      "language",
      "version",
      "type",
      "docusaurus_tag"
    ],
    "attributesToRetrieve": [
      "hierarchy",
      "content",
      "anchor",
      "url",
      "url_without_anchor",
      "type"
    ]
  }
}

@tri-chu

tri-chu commented Apr 27, 2023

Here is a screen recording of our issue:

Screen.Recording.2023-04-27.at.2.24.58.PM.mov

@slorber slorber reopened this Apr 28, 2023
@slorber
Collaborator

slorber commented Apr 28, 2023

Thanks @tri-chu, you are right: this "recent searches" edge case is still common when users switch languages.

@shortcuts is there a way to sandbox each language regarding recent searches?

This looks like it is stored in localStorage under __DOCSEARCH_RECENT_SEARCHES__8thwall-docs-prod. Is there a way for us to provide a different storage key for each locale, or is this hardcoded?

I saw this in the DocSearch docs, but nothing that lets us customize the localStorage key

CleanShot 2023-04-28 at 10 56 14@2x

https://docsearch.algolia.com/docs/api/#disableuserpersonalization

Also curious: what are favorites? is there a way to add a page as favorite now? 🤔

@shortcuts
Contributor

@shortcuts is there a way to sandbox each language regarding recent searches?

This looks like it is stored in localStorage under __DOCSEARCH_RECENT_SEARCHES__8thwall-docs-prod. Is there a way for us to provide a different storage key for each locale, or is this hardcoded?

Actually, it's already stored with the correct key: we use the objectID, which is the full URL of the page, so it should take locales into account. But on the GET, I believe we retrieve all of the saved searches 🤔 We could introduce some logic here: https://github.com/algolia/docsearch/blob/main/packages/docsearch-react/src/stored-searches.ts#L79
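
A rough sketch of what such read-time logic could look like, assuming (as stated above) that objectID holds the full page URL. StoredHit is a simplified stand-in for the real stored type, and hitsForSite is a hypothetical helper that is not part of DocSearch.

```typescript
// Simplified shape of a stored search; the real type has more fields.
interface StoredHit {
  objectID: string; // assumed to be the full URL of the page
}

// Hypothetical filter: keep hits that live under the current site's base URL
// and not under a more specific base URL of another localized site.
function hitsForSite(
  hits: StoredHit[],
  currentBaseUrl: string,
  otherBaseUrls: string[],
): StoredHit[] {
  const isUnder = (url: string, base: string): boolean =>
    url.indexOf(base) === 0;
  return hits.filter(
    (hit) =>
      isUnder(hit.objectID, currentBaseUrl) &&
      !otherBaseUrls.some(
        (base) =>
          base.length > currentBaseUrl.length && isUnder(hit.objectID, base),
      ),
  );
}
```

The extra check on longer base URLs is there because the default locale usually sits at the site root, which is a prefix of every other locale's base URL.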

Also curious: what are favorites? is there a way to add a page as favorite now? 🤔

Yes, a user can add recent searches to favorites; it's also a local storage thing

Screenshot 2023-04-28 at 12 18 26

@slorber
Collaborator

slorber commented May 4, 2023

Oh, forgot about this favorite feature :D


@shortcuts I'm not sure I understand what you mean here

What I see in practice is that a single local storage key is used for all the Docusaurus localized sites:

CleanShot 2023-05-04 at 12 10 39@2x

I don't really understand what the objectID is and how it could be used to filter the results that we get from the localStorage

What I'd like is the ability to provide my own storage key, so that we have more than 1 storage value:

  • __DOCSEARCH_RECENT_SEARCHES__docusaurus-2-fr
  • __DOCSEARCH_RECENT_SEARCHES__docusaurus-2-en
  • __DOCSEARCH_FAVORITE_SEARCHES__docusaurus-2-fr
  • __DOCSEARCH_FAVORITE_SEARCHES__docusaurus-2-en

Does it make sense?

That also looks simpler, easier to reason about, and more performant, because you are working with smaller recent-search storage objects.
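
The key scheme proposed above could be derived as in the sketch below. The storageKey helper and the StoredKind type are hypothetical. Per this thread, DocSearch does not currently expose a way to pass custom storage keys.

```typescript
// Hypothetical helper: one storage key per (kind, index, locale) triple,
// following the naming in the list above.
type StoredKind = "RECENT" | "FAVORITE";

function storageKey(
  kind: StoredKind,
  indexName: string,
  locale: string,
): string {
  return `__DOCSEARCH_${kind}_SEARCHES__${indexName}-${locale}`;
}
```

For example, storageKey("RECENT", "docusaurus-2", "fr") gives __DOCSEARCH_RECENT_SEARCHES__docusaurus-2-fr, so each localized site reads and writes only its own entries.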


The alternative for Docusaurus could be to not sandbox locales, but make it possible for Docusaurus to know whether a search hit is from another localized site, so that we can decide if we should navigate SPA (/fr/doc1 => /fr/doc2) or MPA (/fr/doc1 => /en/doc2).

Technically we could probably do that already, but it's more complicated, and there are possible fancy edge cases if we build this by inspecting the search hit URL language prefix (convoluted: /fr => page about France on English website VS /fr/ root of the French site). It would be easier and more reliable if we could assign an i18n locale to each stored search hit.
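
To make the prefix-inspection idea and its ambiguity concrete, here is a small sketch. It assumes the default locale is served at the site root and every other locale under /<locale>/. The guessLocale helper and the I18nConfig shape are hypothetical, not Docusaurus APIs.

```typescript
interface I18nConfig {
  defaultLocale: string;
  locales: string[];
}

// Hypothetical helper: guess which localized site a path belongs to from its
// first segment. This is a heuristic: a path such as "/fr" is reported as the
// French site even if it is really a page about France on the default site.
function guessLocale(pathname: string, i18n: I18nConfig): string {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const first = segments[0] ?? "";
  if (first !== i18n.defaultLocale && i18n.locales.indexOf(first) !== -1) {
    return first;
  }
  return i18n.defaultLocale;
}
```

With locales en (default) and fr, /fr/doc2 is attributed to fr and /docs/intro to en. The /fr case shows why storing an explicit locale with each search hit would be more reliable than guessing from the URL.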


To me, the ability to pass custom storage keys is simpler and probably good enough

7 participants