Fix failures to compute embeddings due to too much context & Shorten RAG Results #279
Fix failures to compute embeddings caused by including too much context.
Prior to this PR we were computing the embeddings using the entire notebook.
This would lead to context-exceeded errors on longer documents.
This had two negative impacts: the Learner could fail to compute embeddings for examples, and the Agent could fail to compute the query embedding for the current document.
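A minimal sketch of the kind of bounded text preparation this implies, walking backwards from the current cell until a character budget is exhausted. `Cell`, `textForEmbedding`, and `maxChars` are illustrative names for this sketch, not the repo's actual API:

```go
package embeddings

import "strings"

// Cell is a stand-in for a notebook cell; the real type lives elsewhere in the repo.
type Cell struct {
	Source string
}

// textForEmbedding walks backwards from the end of the notebook, keeping the
// most recent cells until the character budget is exhausted. This bounds the
// text handed to the embedding model, so long notebooks no longer blow past
// the model's context limit.
func textForEmbedding(cells []Cell, maxChars int) string {
	var kept []string
	total := 0
	for i := len(cells) - 1; i >= 0; i-- {
		src := cells[i].Source
		if total+len(src) > maxChars {
			break
		}
		// Prepend so the kept cells stay in document order.
		kept = append([]string{src}, kept...)
		total += len(src)
	}
	return strings.Join(kept, "\n")
}
```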
Don't include the full Document in the RAG example
In Example.Query we were including the full document, which would then be injected into the context when generating new suggestions.
This can use a lot of tokens and potentially confuse the agent when it generates new suggestions.
Use a simple algorithm to shorten the example rather than including the full document; a sketch of the idea follows.
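The PR body does not reproduce the algorithm's steps here, so the following is only a plausible sketch, reusing `Cell` from the sketch above; `Example`, `shorten`, and `maxCells` are hypothetical names. The idea shown: keep the answer intact and keep only the trailing cells of the query, since the cells nearest the generation point carry the most signal.

```go
package embeddings

// Example pairs the context the suggestion was generated from (Query) with
// the cells the model should produce (Answer); illustrative stand-ins for
// the repo's actual types.
type Example struct {
	Query  []Cell
	Answer []Cell
}

// shorten trims the example's query to at most maxCells trailing cells.
// The answer is kept whole because it is what the agent is being shown
// how to produce.
func shorten(ex Example, maxCells int) Example {
	q := ex.Query
	if len(q) > maxCells {
		q = q[len(q)-maxCells:]
	}
	return Example{Query: q, Answer: ex.Answer}
}
```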
Results
Other
This PR also refactors the code so that the Learner and the Agent share the code for computing embeddings, minimizing the risk of training/serving skew.
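A sketch of what sharing that code might look like, building on `textForEmbedding` above; `Vectorizer` and `Compute` are assumed names, not the repo's actual interface:

```go
package embeddings

import "context"

// Vectorizer abstracts the embedding backend (e.g. an OpenAI client).
type Vectorizer interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// Compute is the single entry point used both by the Learner when indexing
// examples and by the Agent when embedding the current document. Routing
// both callers through the same text preparation is what keeps the
// representations from drifting apart (training/serving skew).
func Compute(ctx context.Context, v Vectorizer, cells []Cell, maxChars int) ([]float32, error) {
	return v.Embed(ctx, textForEmbedding(cells, maxChars))
}
```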
Treat learner failures as permanent failures rather than retrying. If we see concrete examples of retryable errors, then we can add retries.
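A sketch of that failure handling; `workItem`, `learn`, and `process` are hypothetical stand-ins for the learner's queue types, not the actual code:

```go
package learner

import "log"

// workItem is a hypothetical stand-in for a queued learning task.
type workItem struct{ ID string }

type Learner struct{}

// learn computes embeddings for the example and writes it to the index.
func (l *Learner) learn(item workItem) error { return nil }

// process treats every failure as permanent: log and drop the item instead
// of requeueing it. If concrete retryable errors show up in practice, a
// retry branch can be added here.
func (l *Learner) process(item workItem) {
	if err := l.learn(item); err != nil {
		log.Printf("learner: dropping example %s (permanent failure): %v", item.ID, err)
		return
	}
}
```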
Fixes Error Learning From Examples: Can't compute embedding because of too much context (#260).