Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional search views for the frontend #410

Closed
6 tasks done
Tracked by #2251
obulat opened this issue Feb 16, 2023 · 37 comments
Closed
6 tasks done
Tracked by #2251

Additional search views for the frontend #410

obulat opened this issue Feb 16, 2023 · 37 comments
Assignees
Labels
🕹 aspect: interface Concerns end-users' experience with the software 🌟 goal: addition Addition of new feature 🧭 project: thread An issue used to track a project and its progress 🧱 stack: api Related to the Django API 🧱 stack: frontend Related to the Nuxt frontend

Comments

@obulat
Copy link
Contributor

obulat commented Feb 16, 2023

Start Date ETA Project Lead Actual Ship Date
2023-04-01 2023-11-30 @fcoveram, @obulat TBD

Description

To improve the search experience on openverse.org, we want to add more ways of displaying search results:

  • add a landing page for providers. These pages would only display media from the selected provider and will also serve as a landing page for the provider on Openverse.
  • add pages for tags, making them clickable. These pages would probably look similar to provider landing pages and would show only the media with the selected tag.
  • creator landing pages would be similar to the previous pages and would show all the media in Openverse by a specific creator.

As a start, we can create and implement a single template for all three types of pages.

Documents

Issues

Prior Art

@obulat obulat added the 🧭 project: thread An issue used to track a project and its progress label Feb 16, 2023
@obulat obulat moved this to Not Started in Openverse Project Tracker Feb 16, 2023
@zackkrida
Copy link
Member

@WordPress/openverse-maintainers, @panchovm has expressed interest in leading this project and is going to start the kickoff post next week. He'll be AFK some days at the start of April which was the original estimated start date for this project.

@zackkrida zackkrida moved this from Not Started to In Kickoff in Openverse Project Tracker Apr 10, 2023
@fcoveram
Copy link
Contributor

fcoveram commented May 15, 2023

Project Proposal was merged (#1890) and project is ready to continue with Implementation Plans.

@openverse-bot
Copy link
Collaborator

Hi @fcoveram, this project has not received an update comment in 14 days. Please leave an update comment as soon as you can. See the documentation on project updates for more information.

@openverse-bot
Copy link
Collaborator

Hi @fcoveram, this project has not received an update comment in 14 days. Please leave an update comment as soon as you can. See the documentation on project updates for more information.

@fcoveram
Copy link
Contributor

A new design (i2) was shared in #2113. Waiting for responses to continue with the design discussion

@openverse-bot
Copy link
Collaborator

Hi @fcoveram, this project has not received an update comment in 14 days. Please leave an update comment as soon as you can. See the documentation on project updates for more information.

@fcoveram
Copy link
Contributor

fcoveram commented Jul 6, 2023

As noted here, designs were approved due to no discrepant comments. I will prepare the assets for their hands-off and comment again once done.

As discussed during the prioritization meeting, @obulat will lead the implementation and will write the involved plan.

@fcoveram
Copy link
Contributor

@openverse-bot
Copy link
Collaborator

Hi @fcoveram, this project has not received an update comment in 14 days. Please leave an update comment as soon as you can. See the documentation on project updates for more information.

@fcoveram
Copy link
Contributor

fcoveram commented Aug 2, 2023

Implementation plan started a few days ago and has been discussed in #2676

@zackkrida zackkrida moved this from In RFC to In Progress in Openverse Project Tracker Aug 2, 2023
@obulat
Copy link
Contributor Author

obulat commented Nov 18, 2023

Update 2023-11-18

Done

The API changes have been merged 🎉

Some endpoints you can check in staging (with only the subset of data):
SMK images: https://api-staging.openverse.engineering/v1/images/source/smk/
Images that are attributed to "Anonymous, Italian, 18th century artist" at the Metropolitan Museum of Arts: https://api-staging.openverse.engineering/v1/images/source/met/creator/Anonymous,%20Italian,%2018th%20century%20artist
Audio tagged with "istanbul": https://api-staging.openverse.engineering/v1/audio/tag/istanbul/

In progress

There are two open PRs adding the changes to the Nuxt frontend.

  1. Updates to the single result page: Add source, creator and tag links to the single result page media info #3338
  2. Pages for the collections Add additional search view pages to the Nuxt app #3140

To do

#3365 updated the result labels for the collections (e.g, "Over 10000 results found with this tag"), but I will also open a discussion to finalize the wording in line with the comments in the original VCollectionHeader PR

After the discussion about tag normalization in #3342, it is also necessary to remove lower-casing the image tags in the ingestion server:

lower_tag = tag["name"].lower()

The tag is lower-cased here only to check if it is in the denylist, the saved tag name keeps the original case. So, no change is needed here.

@AetherUnbound AetherUnbound moved this from 🚧 In Progress to ⏸ On Hold in Openverse Project Tracker Jan 3, 2024
@AetherUnbound AetherUnbound moved this from ⏸ On Hold to 🚧 In Progress in Openverse Project Tracker Feb 7, 2024
@obulat
Copy link
Contributor Author

obulat commented Feb 8, 2024

Update 2024-02-07

Done

In progress

To do

Updates to the tags display, creator view results label, analytics events and the scrolling behavior.

Questions

Analytics event

#3619 raises a question about analytics events that were not planned for in the Implementation Plan.

Currently, we have two events for the search view (REACH_RESULT_END and LOAD_MORE_RESULTS) that send the following properties:

{ /** The media type being searched */
    searchType: SearchType
    /** The search term */
    query: string
    /** The current page of results the user is on,
     * *before* loading more results.. */
    resultPage: number
}

We need to update these events to use both on the regular search and on the additional search views.

I think adding something like strategy (used in the search store in Nuxt, with values of default for regular search, and tag | creator | source for the additional search views. This way, we could re-use the query property to send the value of tag, source or creator (as a source/creator string) with the event.
Does this shape make sense, @WordPress/openverse-frontend ?
Would adding a new property to the event be okay and allow us to compare the new events with the old ones? Or would we need to back-fill the old events with the strategy property?

Deep pagination

We currently limit search pagination depth to 5 pages. This was done to ensure stability of the API because deep pagination together with the dead link removal was making some searches make a very high number of ES/db requests.

Additional search views are always sorted by the newly-added to Openverse, which means that they are safer as there should be much fewer links on the top pages.

In the current implementation of the additional search views, we show the "10000 results for this source" result label, but the users can only browse the first 100 (5 pages). I think we should increase the page depth for the additional search views, but not remove it entirely (because otherwise it would lead to scraping, and would cause the same problems as the search page had before the 5 page limit was set). However, we need to somehow convey it to the users that we are only showing the first N results.

@WordPress/openverse-api, what do you think about the depth of pagination: should we increase it for the additional search views, and if so, to what number?
@WordPress/openverse-frontend, @fcoveram, do you agree with somehow displaying how many results the user can actually browse (out of the more than 10000 results), and if so, how?

@fcoveram
Copy link
Contributor

fcoveram commented Feb 8, 2024

In the current implementation of the additional search views, we show the "10000 results for this source" result label, but the users can only browse the first 100 (5 pages).

I don't fully understand this. As a user, I can continuously load more results without any problem.

do you agree with somehow displaying how many results the user can actually browse (out of the more than 10000 results), and if so, how?

I don't think is necessary to convey exactly how many results are. Edge cases will reach out the 10,000 items. The label intends to communicate the results dimension; let's say, whether it surfaced 10, 100, or 1,000 results. The key word is over X results and I'm drawn to keep it as an indicator of massiveness.

@obulat
Copy link
Contributor Author

obulat commented Feb 9, 2024

I don't fully understand this. As a user, I can continuously load more results without any problem.

Try going to https://staging.openverse.org/image/tag/cat and clicking the "Load more" button 20 times. After the twentieth click, the button disappears. This is caused by the API's MAX_PAGINATION_DEPTH setting:

MAX_PAGINATION_DEPTH = 20

Admittedly, 20 pages is a lot, and not many users would hit this limit. However, I experienced one of the collection views only being five pages deep, which was a much more confusing experience. It was probably caused by dead links in the results.

So, if we leave the 20 page limit, we only expose 400 results for any collection (out of the 10 000 in the label).

@fcoveram
Copy link
Contributor

I understand. My UX approach is allowing users to continue seeing results, but I would like to hear what other folks think over what options we have and what is a good practice.

@sarayourfriend
Copy link
Collaborator

sarayourfriend commented Feb 12, 2024

I don't think we should change page depth rules. They're already complex (auth vs non-auth) and adding more exceptions doesn't feel necessary. Viewing tags and sources isn't about browsing the entire thing. We have no way to support that technically, particularly for large sources (otherwise deep pagination would cause performance issues for all users).

In the current implementation of the additional search views, we show the "10000 results for this source" result label, but the users can only browse the first 100 (5 pages). I think we should increase the page depth for the additional search views, but not remove it entirely (because otherwise it would lead to scraping, and would cause the same problems as the search page had before the 5 page limit was set). However, we need to somehow convey it to the users that we are only showing the first N results.

Can we change the copy to "showing the top results from this source" or something like that. I do not think we need to communicate the precise number of theoretically available results in non-search view, as it's only vaguely relevant if you can filter those down further. The precise number we actually surface is likewise unnecessary, for the same reason (you can't filter them on those pages). We can't get a precise number anyway (due to dead link issues) and it isn't useful information on those pages, because again, it's only useful if you can further filter them, which you cannot. These pages are about browsing, not searching. As such, we don't need to give any other information. If we want to display the stats for a source when it's available, we can do that. "Showing the top results of works from this source". Only relevant for sources though. Even that is overkill in my opinion, for the reasons related to search. I wouldn't block the completion of this project on that even if we decided to add that. It's also not relevant to the other two types of views.

I could see an argument that it's useful statistical data for research, but we don't make accurate data available for a variety of reasons, and there are much better ways of deriving that data if we did want to anyway.

Once we're able to implement in-house popularity sorting, we can clarify "top results from this source" (or whatever) to mean "most popular results from this source" or "most popular results with the tag " and so forth. Knowing that that will be a future possibility makes me even more resistent to the idea of trying to work out a way of accurately representing the precise extent of data available on those pages (in addition to the issues I shared above regarding the impossibility of accuracy and the general uselessness of the data compared to a regular search where it gives useful context).

Anyway, to summarize, my preference is to make no changes on the API side, and just to clarify the copy by removing unnecessary and inaccurate information.

@fcoveram
Copy link
Contributor

Can we change the copy to "showing the top results from this source" or something like that. I do not think we need to communicate the precise number of theoretically available results in non-search view, as it's only vaguely relevant if you can filter those down further.

I like this approach and the copy.

Even that is overkill in my opinion, for the reasons related to search. I wouldn't block the completion of this project on that even if we decided to add that

+1 to this

@zackkrida
Copy link
Member

While participating in the discussion at #3825 it occurred to me that one of the largest benefits of these additional search views, outside of ease of sharing and bookmarking by users, is SEO. Unfortunately, we've missed discussing SEO considerations in the project proposal (myself included) and the implementation plan.

Are we setting good meta titles and descriptions for these pages? Will we allow search engines to crawl them?

@sarayourfriend
Copy link
Collaborator

Is there a way to support crawling on pages with query parameters? I'm not familiar with the SEO considerations for those kinds of pages. My understanding generally was that dynamically generated pages, like search results, wouldn't be suitable for SEO. But there are some parts of it that could work for SEO, if crawlers skipped the results section.

@zackkrida
Copy link
Member

@sarayourfriend I think I remember that you've already discussed this with @obulat elsewhere, but yes, search engines don't treat pages with query params or paths differently. They basically consider URL structure an implementation detail.

A good example of this is the Google search results for "stock photos dogs" which shows Adobe's stock photography site:

image

This result uses the URL https://stock.adobe.com/search?k=dog.

Also something to consider: Many of these stock sites store popular searches in dynamically-generated sitemaps. Unsplash has extensive sitemaps with thousands of entries for popular searches, landing pages, collections, and so on, for example.

In the future, we could query analytics at application build time or dynamically with caching to generate sitemaps containing the most popular search terms, collections, and so on. That would be a nice follow-up project, distinct from this one.

@obulat
Copy link
Contributor Author

obulat commented Mar 19, 2024

Update 2024-03-19

Done and in progress

The Implementation Plan for Additional search views was updated with the conclusions from discussion in #3825.

The frontend has converted collection views to use query parameters instead of path parameters. This required additional changes in how media is fetched (#3835 PR under review). The API changes are also under review in #3887.

Two changes to the frontend were also merged:

To Do

  • Analytics events updates.
  • Collection pages SEO (og meta data)

@obulat
Copy link
Contributor Author

obulat commented Mar 27, 2024

Possible deployment/launch plan (from #3887 comment)

  1. Switch the frontend additional_search_views flag on, without fully removing the flag, and set SHOW_COLLECTION_DOCS env variable in the API to True in prod. This means that the users on the frontend can see the additional search views as fully-launched, and the API users can start using them with the unstable__ parameters.
  2. Test how well both API and frontend work in prod for ~ 1 week. If needed, we can revert using the frontend flag and the API env variable.
  3. If everything is successful, add the stable parameters to the frontend.
  4. Stabilize the API parameters (remove the conditionals from the additional search views documentation and the unstable__ prefix from tag and collection parameters)
  5. Remove the additional frontend parameters and the feature flag.

@sarayourfriend
Copy link
Collaborator

sarayourfriend commented Mar 27, 2024

Stabilize the API parameters (remove the conditionals from the additional search views documentation and the unstable__ prefix from tag and collection parameters)

FWIW, I don't think we necessarily need to immediately stabilise new API parameters after launch and verification of the basic feature in production, and it's only to our benefit to move slowly on doing so. If you're okay with holding off a month or so on it to give time for things to settle and see how the new parameters work in practice, I'd recommend it. I'm definitely glad we've kept the sensitivity parameters and return value unstable, as it's given us flexibility to make changes to the behaviour without worrying about API version commitments (for which we still don't have concrete definitions).

Just a suggestion, anyway.

@obulat
Copy link
Contributor Author

obulat commented Apr 8, 2024

Update 2024-04-08

Done

The last PRs of the project implementation was merged:

#3979 fixed the bug in collection fetching that was found during testing of other changes.

In progress

The project will be launched once the following two PRs are merged:

To do

After the issues from "In progress" section are deployed and the new views are thus launched, we will need to test the views in prod for a while.
Then, after we determine that the views work well, we will can start fixing the rest of the issues in the milestone to clean up the temporary changes (feature flag in the frontend and the API unstable__ parameters).

@zackkrida
Copy link
Member

@WordPress/openverse-frontend I am going to take on this PR (#4084) and the launching of this project while Olga is AFK.

@zackkrida zackkrida moved this from 🚧 In Progress to 🚢 Shipped in Openverse Project Tracker Apr 15, 2024
@fcoveram
Copy link
Contributor

The feature is already online, but there are two pending tasks listed in the initial comment. Are we ready to mark them as done @obulat or are some work still in progress?

@obulat
Copy link
Contributor Author

obulat commented Apr 18, 2024

@fcoveram, I created a separate milestone for the cleanup tasks, and moved all the pending tasks there. The Additional search views milestone is completed 🎉

@zackkrida zackkrida moved this from 🚢 Shipped to ✅ Success in Openverse Project Tracker Apr 24, 2024
@github-project-automation github-project-automation bot moved this from ✅ Success to 🚢 Shipped in Openverse Project Tracker Apr 30, 2024
@obulat obulat moved this from 🚢 Shipped to ✅ Success in Openverse Project Tracker May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🕹 aspect: interface Concerns end-users' experience with the software 🌟 goal: addition Addition of new feature 🧭 project: thread An issue used to track a project and its progress 🧱 stack: api Related to the Django API 🧱 stack: frontend Related to the Nuxt frontend
Projects
Status: ✅ Success
Development

No branches or pull requests

5 participants