API: move to api-version=2 for faster streaming LLM responses (#5446)
Previously, the Cody IDE client used `api-version=1` for LLM chat
responses, which resent the full LLM response accumulated so far on every
streaming update. This produced a lot of redundant traffic between the
IDE client and the remote Sourcegraph instance, which resulted in a
slower user experience.

Now, we use `api-version=2`, which only sends the delta between chunks.
This significantly reduces the number of characters we're sending over
the wire. For example, for a chat response with 3k output tokens, we
were previously processing up to 1.8m tokens(!!), while now we only
process 3k tokens.
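
To illustrate why the delta encoding helps, here is a minimal sketch that compares the two styles on a toy response. All function names are hypothetical and this is not the actual Cody client or server code; it only assumes that `api-version=1` events carry the whole response so far while `api-version=2` events carry just the new text.

```typescript
// Hypothetical sketch, not the Cody implementation.

// api-version=1 style (assumed): each event carries the full response so far.
function fullSnapshots(chunks: string[]): string[] {
    const events: string[] = []
    let soFar = ''
    for (const chunk of chunks) {
        soFar += chunk
        events.push(soFar)
    }
    return events
}

// api-version=2 style (assumed): each event carries only the new text.
function deltas(chunks: string[]): string[] {
    return [...chunks]
}

// With snapshots, the client keeps the latest event.
function applySnapshots(events: string[]): string {
    return events.length > 0 ? events[events.length - 1] : ''
}

// With deltas, the client appends each event to what it already has.
function applyDeltas(events: string[]): string {
    return events.join('')
}

// Total number of characters sent over the wire.
function totalChars(events: string[]): number {
    return events.reduce((sum, event) => sum + event.length, 0)
}

const chunks = ['Hello', ', ', 'world', '!']
const v1 = fullSnapshots(chunks)
const v2 = deltas(chunks)

console.log(applySnapshots(v1) === applyDeltas(v2)) // true
console.log(totalChars(v1), totalChars(v2)) // 37 13
```

Both styles reconstruct the same final text, but the snapshot style grows roughly quadratically with the number of chunks, while the delta style grows linearly with the length of the response.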

Don't merge until sourcegraph/sourcegraph#293
goes live.

## Test plan

Tested locally and confirmed that both api-version=1 and api-version=2
work as expected.

- [x] Update all HTTP recordings to reflect the new API. This should
give us good test coverage.
- [x] Manually confirm the web extension is still working. Cody Web has
no automated tests, but I ran the demo locally and took this screenshot
of the delta encoding in action:
![CleanShot 2024-09-12 at 10 25
52@2x](https://github.com/user-attachments/assets/4189fad2-f23a-4d83-ac1c-96aa462099a2)



## Changelog

* Cody now uses a new LLM API that offers faster performance, especially
for long chat responses. This improvement is only enabled for Claude
models at this point.
olafurpg authored Sep 12, 2024
1 parent e2f1841 commit 26b0a64
Showing 23 changed files with 906 additions and 996 deletions.
362 changes: 181 additions & 181 deletions agent/recordings/customCommandsClient_509552979/recording.har.yaml


479 changes: 85 additions & 394 deletions agent/recordings/defaultClient_631904893/recording.har.yaml


90 changes: 45 additions & 45 deletions agent/recordings/document-code_965949506/recording.har.yaml
