
Add support for Embeddings setup for sites with high numbers of terms #633

Closed
1 task done
jeffpaul opened this issue Dec 6, 2023 · 2 comments

@jeffpaul
Member

jeffpaul commented Dec 6, 2023

Is your enhancement related to a problem? Please describe.

Coming out of UAT of #622, one thing that was identified was that on sites with thousands of terms, getting those processed into vector space by Embeddings could hit an API limit, time out, or run long. We should look into options to support an exceedingly high number of terms, such as a retry mechanism, batching, etc.
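
To illustrate the batching/retry idea, here's a minimal sketch of splitting terms into batches and retrying failed batches with exponential backoff. This is not ClassifAI code; `request_embeddings`, the batch size, and the retry counts are all hypothetical placeholders that would need to be tuned against the provider's actual limits.

```python
import time

BATCH_SIZE = 100       # hypothetical; tune against the provider's request limits
MAX_RETRIES = 3
BASE_DELAY = 2         # seconds, doubled on each retry

def embed_terms_in_batches(terms, request_embeddings):
    """Embed a large list of terms in batches, retrying failed batches.

    `request_embeddings` is a placeholder for whatever client call the
    feature ends up using; it takes a list of strings and returns a list
    of vectors.
    """
    all_vectors = []
    for start in range(0, len(terms), BATCH_SIZE):
        batch = terms[start:start + BATCH_SIZE]
        for attempt in range(MAX_RETRIES):
            try:
                all_vectors.extend(request_embeddings(batch))
                break
            except Exception:
                if attempt == MAX_RETRIES - 1:
                    raise  # give up after the final retry
                time.sleep(BASE_DELAY * (2 ** attempt))  # exponential backoff
    return all_vectors
```

Splitting the work this way would also give natural checkpoints for any background/queue handling.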

Designs

n/a

Describe alternatives you've considered

n/a

Code of Conduct

  • I agree to follow this project's Code of Conduct
@jeffpaul jeffpaul added the type:enhancement New feature or request. label Dec 6, 2023
@jeffpaul jeffpaul added this to the Future Release milestone Dec 6, 2023
@jeffpaul jeffpaul moved this from Incoming to Backlog in Open Source Practice Dec 6, 2023
@dkotter
Collaborator

dkotter commented Dec 7, 2023

There's also the issue of comparing each term to a post once you reach thousands of items. We'll need some sort of background processing / queue management system to handle both scenarios.
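
For the comparison side, once the term embeddings exist, the per-post work is essentially a similarity search over all term vectors, which can be done in a single vectorized pass. A rough sketch, assuming the term vectors are already computed and L2-normalized (all names here are hypothetical):

```python
import numpy as np

def top_matching_terms(post_vector, term_vectors, term_ids, k=5):
    """Return the IDs of the k terms most similar to a post embedding.

    `term_vectors` is an (n_terms, dim) array and `post_vector` a (dim,)
    array; both are assumed to be L2-normalized so the dot product is
    the cosine similarity.
    """
    scores = term_vectors @ post_vector          # cosine similarity per term
    top_idx = np.argsort(scores)[::-1][:k]       # highest scores first
    return [(term_ids[i], float(scores[i])) for i in top_idx]
```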

For assigning terms to a post, I think we'll also need to consider what the UI looks like. We don't want someone to have to just sit and wait until all the comparisons are done, but we also need to show the progress. Here are some of my initial thoughts on that (a rough sketch of the background/progress side follows the list):

  1. When the Embeddings feature is active for a taxonomy that has lots of terms (what constitutes lots? We'd have to do some testing to figure that out, but I'm thinking 500+), we alert users, either when saving or when clicking the Classify button (which I think only works right now for the NLU feature, so that's another thing we'll need to add), that processing will happen in the background
  2. Probably within the ClassifAI panel we show a progress bar that automatically updates
  3. This processing happens in the background, so a user can leave the post and go to other parts of the site (or close the browser altogether), and when they come back to the post it will update with the latest progress
  4. Alert a user once the process is done and then they can go and review the assigned terms (do we auto assign terms? Do we show them in our new modal UI and allow them to pick the next time they load the post? Something else?)
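
To make points 2 and 3 concrete, here's a purely illustrative sketch of the background side: a queue of term batches worked through off the main request, with a progress value the UI could poll. None of these names exist in ClassifAI; in WordPress the progress would more likely live in an option or transient than an in-memory dict.

```python
import queue
import threading

def process_in_background(batches, handle_batch, progress):
    """Work through queued batches on a worker thread, recording progress.

    `handle_batch` does the actual embedding/comparison work for one batch;
    `progress` is a dict the UI layer could poll for its progress bar.
    """
    work = queue.Queue()
    for batch in batches:
        work.put(batch)
    total = max(len(batches), 1)

    def worker():
        done = 0
        while not work.empty():
            handle_batch(work.get())
            done += 1
            progress["percent"] = round(100 * done / total)  # UI polls this

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```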

@dkotter
Collaborator

dkotter commented Jul 29, 2024

Closing this out as this was handled in #779

@dkotter dkotter closed this as completed Jul 29, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in Open Source Practice Jul 29, 2024
@jeffpaul jeffpaul modified the milestones: Future Release, 3.1.0 Aug 20, 2024