Migrate RAG pipeline to async processing. #2345

austintlee · 2024-04-21T23:48:44Z

Description

Use the async version of the search pipeline process to avoid blocking (remote) calls.

Original bug - opensearch-project/OpenSearch#10248.

Issues Resolved

[BUG] Conversational search random timeout exception #2334

Check List

- [x] New functionality includes testing.
  - [x] All tests pass
- [ ] New functionality has been documented.
  - [ ] New functionality has javadoc added
- [x] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Austin Lee <austin@aryn.ai>

austintlee · 2024-04-21T23:49:21Z

cc: @reta @msfroh

austintlee · 2024-04-21T23:53:14Z

@msfroh Can you take a look at my code changes? I have a really basic question about exception handling in processResponseAsync - if I want to throw things like IllegalArgumentException/InvalidInputException, do I just throw it and let it propagate out of the method or is there some contract I need to follow in this async world?

ylwu-amzn · 2024-04-22T06:01:55Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

        }
+        final int timeout = t;
+        log.info("Timeout for this request: {} seconds.", timeout);


Seems unnecessary to print this on info level. Move to debug level?

Changed to debug.

ylwu-amzn · 2024-04-22T06:02:31Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

+        if (conversationId != null && !Strings.hasText(conversationId)) {
+            throw new IllegalArgumentException("Empty conversation_id is not allowed.");
+        }
+        // log.info("LLM question: {}, LLM model {}, conversation id: {}", llmQuestion, llmModel, conversationId);


Remove this line?

ylwu-amzn · 2024-04-22T06:03:16Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

-        List<Interaction> chatHistory = (conversationId == null)
-            ? Collections.emptyList()
-            : memoryClient.getInteractions(conversationId, interactionSize);
+        log.info("Using interaction size of {}", interactionSize);


Move to debug level?

ylwu-amzn · 2024-04-22T06:03:26Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

-        try {
-            ChatCompletionOutput output = llm
-                .doChatCompletion(
+        // log.info("system_prompt: {}", systemPrompt);


Remove these two lines?

ylwu-amzn · 2024-04-22T06:05:22Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

+        } else {
+            final Instant memoryStart = Instant.now();
+            memoryClient.getInteractions(conversationId, interactionSize, ActionListener.wrap(r -> {
+                log.info("getInteractions complete. ({})", getDuration(memoryStart));


Remove this line or move to debug level?

Changed to debug.

ylwu-amzn · 2024-04-22T06:06:08Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

+        llm.doChatCompletion(input, new ActionListener<>() {
+            @Override
+            public void onResponse(ChatCompletionOutput output) {
+                log.info("doChatCompletion complete. ({})", getDuration(chatStart));


Remove this line or move to debug level?

Changed to debug.

ylwu-amzn · 2024-04-22T06:10:52Z

...ain/java/org/opensearch/searchpipelines/questionanswering/generative/llm/DefaultLlmImpl.java

+                    .getMlModelTensors()
+                    .get(0)
+                    .getDataAsMap();
+                // log.info("dataAsMap: {}", dataAsMap.toString());


remove this line?

Zhangxunmt · 2024-04-22T17:59:48Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

+        SearchResponse response,
+        PipelineProcessingContext requestContext,
+        ActionListener<SearchResponse> responseListener
+    ) {
        log.info("Entering processResponse.");


This log can be removed, right?

Changed to debug.

Zhangxunmt

A lot of "onFailure(Exception e)" implementations in the ActionListeners do not include log.error(). Should we add more logs for errors? This was brought up in the earlier security review too.
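
One way to address this without touching every listener by hand is a small decorator that logs the failure and then passes it on. The sketch below is illustrative only: `Listener` is a simplified stand-in for the real `ActionListener`, `withErrorLogging` is a hypothetical helper, and `java.util.logging` is used only to keep the example self-contained (the plugin's own `log` object would be used in practice).

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingListenerSketch {
    // Hypothetical stand-in for an async callback interface; not the OpenSearch ActionListener.
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    private static final Logger log = Logger.getLogger(LoggingListenerSketch.class.getName());

    // Returns a listener that logs every failure before handing it to the delegate.
    static <T> Listener<T> withErrorLogging(String operation, Listener<T> delegate) {
        return new Listener<T>() {
            @Override
            public void onResponse(T value) {
                delegate.onResponse(value);
            }

            @Override
            public void onFailure(Exception e) {
                // Log first so the error is recorded even if the delegate misbehaves.
                log.log(Level.SEVERE, operation + " failed", e);
                delegate.onFailure(e);
            }
        };
    }
}
```

A call site would then wrap its listener once, for example `withErrorLogging("getInteractions", listener)`, rather than repeating the logging call in each `onFailure` body.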

ylwu-amzn

Thanks for the quick fix

* Migrate RAG pipeline to async processing.
  Signed-off-by: Austin Lee <austin@aryn.ai>
* Address reviewer comments.
  Signed-off-by: Austin Lee <austin@aryn.ai>
---------
Signed-off-by: Austin Lee <austin@aryn.ai>
(cherry picked from commit 4b26ebf)

msfroh · 2024-04-23T13:48:40Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

@@ -128,14 +141,15 @@ public SearchResponse processResponse(SearchRequest request, SearchResponse resp
        }
        String conversationId = params.getConversationId();

+        if (conversationId != null && !Strings.hasText(conversationId)) {
+            throw new IllegalArgumentException("Empty conversation_id is not allowed.");


Have you managed to test this?

You should probably invoke responseListener.onFailure(). Otherwise, the current thread may throw and the listener would sit there waiting for a response.

Yes, I have a test for this, but it does not go through the REST layer. I may need an integration test.
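
A minimal sketch of the suggestion above, under stated assumptions: `Listener` is a simplified stand-in for the real `ActionListener`, and `validateAsync` is a hypothetical method, not the processor's actual code. The validation failure is handed to the listener and the method returns, instead of throwing on the calling thread.

```java
final class ConversationIdCheck {
    // Hypothetical stand-in for an async callback interface; not the OpenSearch ActionListener.
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    // Reports an invalid conversation id through the listener instead of throwing,
    // so whoever is waiting on the listener is always notified.
    static void validateAsync(String conversationId, Listener<String> listener) {
        if (conversationId != null && conversationId.isBlank()) {
            listener.onFailure(new IllegalArgumentException("Empty conversation_id is not allowed."));
            return; // the listener has been notified; do not continue processing
        }
        // Placeholder for the real work: echo the id, or a marker when none was given.
        listener.onResponse(conversationId == null ? "no conversation" : conversationId);
    }
}
```

The early `return` matters as much as the `onFailure` call: without it, the method would go on to notify the listener a second time.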

msfroh · 2024-04-23T13:51:41Z

...g/opensearch/searchpipelines/questionanswering/generative/GenerativeQAResponseProcessor.java

-                    log.error("Context " + contextField + " not found in search hit " + hits[i]);
-                    // TODO throw a more meaningful error here?
-                    throw new RuntimeException();
+                    throw new RuntimeException("Context " + contextField + " not found in search hit " + hits[i]);


Similarly, you need to make sure that this exception gets propagated to the listener. (I don't remember if that's covered by ActionListener.wrap(). Maybe?)

I'll check and also test.
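
For reference while checking, the sketch below shows what a `wrap`-style helper has to do for the exception to reach the listener. It is a self-contained stand-in, not the OpenSearch implementation: `Listener`, `ThrowingConsumer` and this `wrap` are defined here for illustration. To the best of my knowledge `ActionListener.wrap` handles exceptions from the response handler in a similar way, but that should be confirmed against the OpenSearch version in use.

```java
import java.util.function.Consumer;

final class WrapSketch {
    // Hypothetical stand-in for an async callback interface.
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    // A response handler that is allowed to throw.
    interface ThrowingConsumer<T> {
        void accept(T value) throws Exception;
    }

    // Builds a listener whose onResponse forwards anything thrown by the handler to the failure handler.
    static <T> Listener<T> wrap(ThrowingConsumer<T> handler, Consumer<Exception> failureHandler) {
        return new Listener<T>() {
            @Override
            public void onResponse(T value) {
                try {
                    handler.accept(value);
                } catch (Exception e) {
                    onFailure(e);
                }
            }

            @Override
            public void onFailure(Exception e) {
                failureHandler.accept(e);
            }
        };
    }
}
```

Note the limit of this pattern: only exceptions thrown inside the response handler are caught. Anything thrown before the asynchronous call is made, on the calling thread, still has to be passed to the listener explicitly.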

msfroh · 2024-04-23T15:45:16Z

@msfroh Can you take a look at my code changes? I have a really basic question about exception handling in processResponseAsync - if I want to throw things like IllegalArgumentException/InvalidInputException, do I just throw it and let it propagate out of the method or is there some contract I need to follow in this async world?

Shoot -- I didn't see your question before this got merged. (I spent most of yesterday traveling to a conference.)

The general contract in the async world is that every possible code path needs to notify the listener exactly once or else it will wait indefinitely.

If you have a response, you must give it to the listener.
If there's a failure, you must notify the listener (probably via onFailure).
If there's no response, you must notify the listener that there's no response.
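
The "exactly once" half of this contract can be enforced mechanically; the "at least once" half cannot, and still depends on every code path calling the listener. As an illustration, the guard below drops any notification after the first. `Listener` and `notifyOnce` are defined here as hypothetical stand-ins and are not taken from the PR.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class NotifyOnceSketch {
    // Hypothetical stand-in for an async callback interface.
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    // Wraps a delegate so that only the first notification, response or failure, is delivered.
    static <T> Listener<T> notifyOnce(Listener<T> delegate) {
        AtomicBoolean notified = new AtomicBoolean(false);
        return new Listener<T>() {
            @Override
            public void onResponse(T value) {
                if (notified.compareAndSet(false, true)) {
                    delegate.onResponse(value);
                }
            }

            @Override
            public void onFailure(Exception e) {
                if (notified.compareAndSet(false, true)) {
                    delegate.onFailure(e);
                }
            }
        };
    }
}
```

A guard like this only hides double notifications. A path that never calls the listener at all still leaves the caller waiting, which is the case raised in the review comments above.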

Migrate RAG pipeline to async processing.

59193e9

Signed-off-by: Austin Lee <austin@aryn.ai>

austintlee requested review from b4sjoo, dhrubo-os, jngz-es, model-collapse, rbhavna, ylwu-amzn, zane-neo, Zhangxunmt, HenryL27, samuel-oci and xinyual as code owners April 21, 2024 23:48

austintlee had a problem deploying to ml-commons-cicd-env April 21, 2024 23:48 — with GitHub Actions Error

austintlee had a problem deploying to ml-commons-cicd-env April 21, 2024 23:48 — with GitHub Actions Failure

austintlee had a problem deploying to ml-commons-cicd-env April 21, 2024 23:48 — with GitHub Actions Error

austintlee had a problem deploying to ml-commons-cicd-env April 21, 2024 23:49 — with GitHub Actions Failure

austintlee had a problem deploying to ml-commons-cicd-env April 21, 2024 23:49 — with GitHub Actions Error

austintlee mentioned this pull request Apr 21, 2024

Switch to async search pipeline processing. #1445

Closed

2 tasks

ylwu-amzn reviewed Apr 22, 2024

Zhangxunmt reviewed Apr 22, 2024

austintlee had a problem deploying to ml-commons-cicd-env April 22, 2024 18:39 — with GitHub Actions Error

austintlee had a problem deploying to ml-commons-cicd-env April 22, 2024 18:39 — with GitHub Actions Failure

austintlee had a problem deploying to ml-commons-cicd-env April 22, 2024 18:39 — with GitHub Actions Error

austintlee had a problem deploying to ml-commons-cicd-env April 23, 2024 00:40 — with GitHub Actions Failure

austintlee had a problem deploying to ml-commons-cicd-env April 23, 2024 00:40 — with GitHub Actions Error

austintlee had a problem deploying to ml-commons-cicd-env April 23, 2024 00:40 — with GitHub Actions Failure

austintlee had a problem deploying to ml-commons-cicd-env April 23, 2024 00:40 — with GitHub Actions Error

ylwu-amzn approved these changes Apr 23, 2024

austintlee temporarily deployed to ml-commons-cicd-env April 23, 2024 01:42 — with GitHub Actions Inactive

Zhangxunmt approved these changes Apr 23, 2024

austintlee temporarily deployed to ml-commons-cicd-env April 23, 2024 02:32 — with GitHub Actions Inactive

ylwu-amzn merged commit 4b26ebf into opensearch-project:main Apr 23, 2024
13 checks passed

ylwu-amzn added backport 2.x backport 2.13 labels Apr 23, 2024

opensearch-trigger-bot bot mentioned this pull request Apr 23, 2024

[Backport 2.x] Migrate RAG pipeline to async processing. #2349

Merged

opensearch-trigger-bot bot mentioned this pull request Apr 23, 2024

[Backport 2.13] Migrate RAG pipeline to async processing. #2350

Merged

b4sjoo pushed a commit that referenced this pull request Apr 23, 2024

Migrate RAG pipeline to async processing. (#2345) (#2349)

afb9279

b4sjoo pushed a commit that referenced this pull request Apr 23, 2024

Migrate RAG pipeline to async processing. (#2345) (#2350)

83acad3

msfroh reviewed Apr 23, 2024

mingshl added v2.14.0 enhancement New feature or request labels Apr 30, 2024

dhrubo-os pushed a commit to dhrubo-os/ml-commons that referenced this pull request May 17, 2024

Migrate RAG pipeline to async processing. (opensearch-project#2345) (o…

ff883b5

…pensearch-project#2349)
