feat(chat): filter namespace messages from history if it exists in metadata VSCODE-611 #866

gagik · 2024-11-05T16:58:38Z

Scrub the namespace messages from the history once we have identified the namespace and are building the messages to send to the AI. We already include the namespace in the query assistant message. We might want to move these to be additions to the user's message instead (in both query and schema).
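
For illustration only, here is a minimal sketch of the kind of filtering described above. It is not the code from this PR's diff: the `HistoryTurn` shape, the `askForNamespace` intent value, and the function name are hypothetical. It also assumes that the user's reply directly after a namespace question should be dropped along with the question.

```typescript
// Hypothetical, simplified stand-ins for chat history turns.
type HistoryTurn =
  | { kind: "request"; prompt: string }
  | { kind: "response"; text: string; metadata?: { intent?: string } };

interface Namespace {
  databaseName?: string;
  collectionName?: string;
}

// Drops namespace questions (and the user reply that follows each one)
// once both the database and collection are known.
function filterNamespaceTurns(
  history: HistoryTurn[],
  namespace: Namespace
): HistoryTurn[] {
  const namespaceKnown = Boolean(
    namespace.databaseName && namespace.collectionName
  );
  if (!namespaceKnown) {
    // Nothing identified yet, so the model still needs the full exchange.
    return history;
  }

  const filtered: HistoryTurn[] = [];
  let skipNextRequest = false;
  for (const turn of history) {
    if (
      turn.kind === "response" &&
      turn.metadata?.intent === "askForNamespace" // hypothetical intent tag
    ) {
      skipNextRequest = true;
      continue;
    }
    if (turn.kind === "request" && skipNextRequest) {
      skipNextRequest = false;
      continue;
    }
    skipNextRequest = false;
    filtered.push(turn);
  }
  return filtered;
}
```

If the namespace is not yet known, the history is returned untouched, so the namespace exchange is only hidden once it has served its purpose.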

Description

Checklist

New tests and/or benchmarks are included
Documentation is changed or added
I have signed the MongoDB Contributor License Agreement (https://www.mongodb.com/legal/contributor-agreement)

Motivation and Context

Bugfix
New feature
Dependency update
Misc

Open Questions

Dependents

Types of changes

Backport Needed
Patch (non-breaking change which fixes an issue)
Minor (non-breaking change which adds functionality)
Major (fix or feature that would cause existing functionality to change)

Anemy

Nice, left a couple code quality suggestions.

Anemy · 2024-11-11T16:01:03Z

src/participant/prompts/promptBase.ts

@@ -163,16 +165,27 @@ export abstract class PromptBase<TArgs extends PromptArgsBase> {
  protected getHistoryMessages({


This function is starting to get a bit long, and it is hard to follow everything happening in it.
It already has the // eslint-disable-next-line complexity comment, which is usually an indicator that we should break it into a few functions, even if that comes with a slight hit to performance (like running through the messages multiple times).
Should we do that now? We could break this function into multiple parts where each does one thing. That would also make it easier to unit test.
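
As a rough sketch of that idea (not the refactoring that was eventually done), the history handling could be written as a list of small passes, each testable on its own. The `Turn` type, the pass names, and the length limit below are all hypothetical.

```typescript
// Hypothetical, simplified history turn.
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// A pass takes the turns and returns a new, filtered list.
type HistoryPass = (turns: Turn[]) => Turn[];

// Removes turns that contain only whitespace.
const dropEmptyTurns: HistoryPass = (turns) =>
  turns.filter((turn) => turn.content.trim().length > 0);

// Hypothetical limit, chosen only for this example.
const MAX_CONTENT_LENGTH = 5000;

// Removes turns whose content exceeds the limit.
const dropOverlongTurns: HistoryPass = (turns) =>
  turns.filter((turn) => turn.content.length <= MAX_CONTENT_LENGTH);

// Runs each pass in order. This walks the list once per pass, which is
// the small performance cost mentioned above.
function applyPasses(turns: Turn[], passes: HistoryPass[]): Turn[] {
  return passes.reduce((current, pass) => pass(current), turns);
}
```

Each pass can then be unit tested with a two or three element array, without constructing a full chat context.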

This is more of a question about this area of the code, and maybe a bit of a request :) If you do refactor this code, could we split the history into something like getUserHistoryMessages and getAssistantHistoryMessages? When testing this functionality I found it difficult to print the history content because it is already wrapped in the vscode.LanguageModelChatMessage format when it is returned here: https://github.com/mongodb-js/vscode/blob/main/src/participant/prompts/promptBase.ts#L130

Maybe, it could be something like:

const messages = [
  vscode.LanguageModelChatMessage.Assistant(this.getAssistantPrompt(args)),
  vscode.LanguageModelChatMessage.Assistant(this.getAssistantHistoryMessages()),
  vscode.LanguageModelChatMessage.User(this.getUserHistoryMessages()),
  vscode.LanguageModelChatMessage.User(prompt),
];

Or something like that. The idea here is that we have some unformatted string value to print to see what message we send to the model.

The ordering of the messages here is important. Although the last message will always be the user prompt and the first the assistant prompt, the user history and assistant history messages are interleaved, so they can't be emitted as two sequential groups.
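
To make the ordering point concrete, here is a small sketch that uses plain objects instead of vscode.LanguageModelChatMessage. The `Message` type and `buildMessages` name are hypothetical.

```typescript
// Hypothetical plain-object stand-in for a chat message.
interface Message {
  role: "assistant" | "user";
  content: string;
}

// Keeps history in its original, interleaved order between the
// assistant prompt (first) and the user prompt (last).
function buildMessages(
  assistantPrompt: string,
  history: Message[],
  userPrompt: string
): Message[] {
  return [
    { role: "assistant", content: assistantPrompt },
    ...history,
    { role: "user", content: userPrompt },
  ];
}

const example = buildMessages(
  "You are a MongoDB expert.",
  [
    { role: "user", content: "first question" },
    { role: "assistant", content: "first answer" },
    { role: "user", content: "follow-up" },
  ],
  "latest question"
);
// Roles in order: assistant, user, assistant, user, user
const exampleRoles = example.map((message) => message.role);
```

Grouping the history by role would lose which answer belonged to which question, which is why the history stays a single ordered list here.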
We do currently log information about the messages we are sending to the model:

vscode/src/participant/participant.ts

Line 143 in d5a9345

messages: modelInput.messages.map(

@alenakhineika we can add something there that will log them in their entirety, and not just the metadata, if an environment variable is set. How does that sound?
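
A possible shape for that, as a sketch only: the environment variable name, the `LoggedMessage` fields, and the function name are hypothetical, and the environment is passed in as a plain object (a caller could pass `process.env`) to keep the function easy to test.

```typescript
// Hypothetical plain-object stand-in for a chat message.
interface Message {
  role: "assistant" | "user";
  content: string;
}

// What ends up in the log: metadata always, full content only on request.
interface LoggedMessage {
  role: string;
  contentLength: number;
  content?: string;
}

// Hypothetical variable name, not an existing setting of the extension.
const FULL_LOG_ENV_VAR = "DEBUG_LOG_FULL_CHAT_MESSAGES";

function messagesForLog(
  messages: Message[],
  env: Record<string, string | undefined>
): LoggedMessage[] {
  const includeContent = env[FULL_LOG_ENV_VAR] === "true";
  return messages.map((message) => ({
    role: message.role,
    contentLength: message.content.length,
    // Only attach the full text when explicitly enabled.
    ...(includeContent ? { content: message.content } : {}),
  }));
}
```

By default only the role and length are logged, so full prompts do not end up in logs unless someone opts in.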

I am also dealing with changes to getHistoryMessages in VSCODE-632, so I think it'll be best to handle the refactoring in the PR there.

src/test/suite/participant/participant.test.ts

gagik · 2024-11-12T10:47:02Z

For the sake of keeping this simple to review, I'm going to merge this and do any larger refactoring work in the PR for VSCODE-632.

gagik added 30 commits October 29, 2024 17:41

Filter long / invalid prompts

a54d67d

Try helper include

6ff0d8d

Use content value

c9aa589

Use a cleaner test

698051f

delete old helper

bf05722

Add explainer

2445dc5

WIP

ff0c1cc

Move around dependencies

f9dd536

Remove grep

09c7414

Use firstCall

aada8e1

Add test filtering

eb2c7dc

Update CONTRIBUTING.md

bcd260b

Escape the environment variable

a9beef1

Merge branch 'gagik/add-test-filtering' of github.com:mongodb-js/vsco…

2e4bd70

…de into gagik/add-test-filtering

Fix wording

33c995e

Add schema tests

8582193

align tests and use a stub

a8bc30d

Add saving to metadata

5ddc7fb

Move things

06435e8

Better org

5a58171

Merge branch 'gagik/add-test-filtering' of github.com:mongodb-js/vsco…

fdbabfc

…de into gagik/one-database-handling

simplify tests and picking logic

f23e19f

typos

274fe17

Align with broken test

54885d6

Add error info and tests

cfb9d61

wrap l10n

f080016

Merge branch 'gagik/one-no-collection-handling' of github.com:mongodb…

190f4f6

…-js/vscode into gagik/no-database-or-collection-error

wrap in l10n

182dc15

remove settings change

a341e45

Merge branch 'gagik/no-database-or-collection-error' of github.com:mo…

bec38d8

…ngodb-js/vscode into gagik/filter-namespace

gagik added 3 commits November 7, 2024 13:48

Merge branch 'gagik/one-no-collection-handling' into gagik/no-databas…

c92cdcb

…e-or-collection-error

Move tests to parameterized

985ed49

Merge branch 'main' of github.com:mongodb-js/vscode into gagik/filter…

f012622

…-namespace

gagik changed the base branch from main to gagik/no-database-or-collection-error November 7, 2024 21:08

gagik changed the base branch from gagik/no-database-or-collection-error to gagik/one-no-collection-handling November 7, 2024 21:09

gagik changed the base branch from gagik/one-no-collection-handling to main November 7, 2024 21:34

gagik added 5 commits November 7, 2024 22:37

delete mistaken duplicate

aba650c

cleanup tests

650bd79

fix expected history

0843c64

revert

6cb9ad3

Merge branch 'main' into gagik/filter-namespace

0bbeeb2

gagik marked this pull request as ready for review November 8, 2024 08:51

gagik requested review from Anemy and alenakhineika November 8, 2024 08:53

gagik changed the base branch from main to gagik/one-no-collection-handling November 8, 2024 12:16

gagik changed the base branch from gagik/one-no-collection-handling to main November 8, 2024 12:16

gagik added 2 commits November 8, 2024 14:47

Merge branch 'main' of github.com:mongodb-js/vscode into gagik/filter…

377b6bd

…-namespace

Merge branch 'main' of github.com:mongodb-js/vscode into gagik/no-dat…

853eccf

…abase-or-collection-error

gagik changed the base branch from main to gagik/no-database-or-collection-error November 8, 2024 13:53

gagik added 2 commits November 8, 2024 14:54

Merge branch 'gagik/no-database-or-collection-error' of github.com:mo…

86a33f2

…ngodb-js/vscode into gagik/filter-namespace

small cleanup

4ad4c5f

gagik force-pushed the gagik/filter-namespace branch from 6ed4488 to 4ad4c5f November 8, 2024 15:53

Base automatically changed from gagik/no-database-or-collection-error to main November 8, 2024 17:21

Merge branch 'main' of github.com:mongodb-js/vscode into gagik/filter…

0a8b085

…-namespace

alenakhineika approved these changes Nov 11, 2024

Anemy reviewed Nov 11, 2024

Use helpers for tests

3c63e76

gagik merged commit dd80613 into main Nov 12, 2024
6 checks passed

gagik deleted the gagik/filter-namespace branch November 12, 2024 10:47
