[Obs AI Assistant] Add uuid to knowledge base entries to avoid overwriting accidentally #191043

Merged
52 commits
e627b85
[Obs AI Assistant] Add uuid to knowledge base entries to avoid overwr…
sorenlouv Aug 22, 2024
ede9593
Fix issues
sorenlouv Aug 22, 2024
4f5cb94
Fix user instruction test
sorenlouv Aug 26, 2024
bafa18b
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
neptunian Aug 27, 2024
14854d2
Fix test
sorenlouv Aug 27, 2024
f6b8303
Cleanup
sorenlouv Aug 27, 2024
f88d826
Improve types
sorenlouv Aug 27, 2024
f42f1ec
i18n
sorenlouv Aug 27, 2024
b718ae5
Remove unused imports
sorenlouv Aug 28, 2024
59a9362
Re-add `ShortIdTable`
sorenlouv Aug 28, 2024
51cb688
Change `recall` to not return entries as nested object
sorenlouv Aug 28, 2024
51495d2
Rename `groupId` back to `docId` to reduce diff
sorenlouv Aug 28, 2024
5fdcc4d
Add test for shortIdTable
sorenlouv Aug 28, 2024
d470e39
Revert change to reduce diff
sorenlouv Aug 28, 2024
fb7370b
Keep original id
sorenlouv Aug 28, 2024
c63a8f9
Omit “User” suffix from API client methods
sorenlouv Aug 28, 2024
a6d55ed
Type fix
sorenlouv Aug 28, 2024
b3f7d3a
Fix categorization bug
sorenlouv Aug 28, 2024
b20bd1b
Revert changes to files
sorenlouv Aug 28, 2024
0f531e4
Add functional test
sorenlouv Aug 29, 2024
63b171d
Merge branch 'main' of github.com:elastic/kibana into add-uuid-to-kb-…
sorenlouv Aug 29, 2024
08aa578
i18n
sorenlouv Aug 29, 2024
ad425be
Revert tsconfig changes
sorenlouv Aug 29, 2024
6da17f3
Merge branch 'main' of github.com:elastic/kibana into add-uuid-to-kb-…
sorenlouv Aug 29, 2024
90b5484
Include KB entries when they contain contradicting info
sorenlouv Aug 29, 2024
923681c
[CI] Auto-commit changed files from 'node scripts/build_plugin_list_d…
kibanamachine Aug 29, 2024
03e8fe6
Fix tsc and jest
sorenlouv Aug 30, 2024
9b2cf9b
Improve functional test
sorenlouv Aug 30, 2024
5ea4ac2
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
sorenlouv Aug 31, 2024
91c84d2
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
sorenlouv Sep 2, 2024
b7d2f8e
Merge branch 'main' of github.com:elastic/kibana into add-uuid-to-kb-…
sorenlouv Oct 30, 2024
1d6beef
Merge branch 'main' of github.com:elastic/kibana into add-uuid-to-kb-…
sorenlouv Oct 31, 2024
fc0a970
Fix tsc
sorenlouv Oct 31, 2024
0aaf657
Fix serverless test
sorenlouv Oct 31, 2024
b0e2781
Remove `doc_id`
sorenlouv Nov 5, 2024
035a054
Remove queue logic
sorenlouv Nov 5, 2024
6091e81
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
sorenlouv Nov 5, 2024
996e3d8
Fall back to doc_id
sorenlouv Nov 5, 2024
94996c8
Remove log
sorenlouv Nov 5, 2024
e1ae033
Remove unused knowledge_base/knowledge_base.spec.ts
sorenlouv Nov 5, 2024
993745e
Fix lint issues
sorenlouv Nov 5, 2024
4022dab
Fix serverless test
sorenlouv Nov 5, 2024
3671085
Fix broken api test
sorenlouv Nov 5, 2024
5708359
fix tests
sorenlouv Nov 5, 2024
6e448b8
Fix test
sorenlouv Nov 6, 2024
3114e7d
Improve type
sorenlouv Nov 6, 2024
094f94c
Fix functional test
sorenlouv Nov 6, 2024
744a279
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
sorenlouv Nov 6, 2024
55e5800
Remove queue for task manager types
sorenlouv Nov 6, 2024
3f715be
Merge branch 'main' into add-uuid-to-kb-entries-to-avoid-overwriting
sorenlouv Nov 6, 2024
efa7bf2
editorUser -> editor
sorenlouv Nov 7, 2024
732f4ae
Fix summarisation test
sorenlouv Nov 7, 2024
Expand Up @@ -82,8 +82,8 @@ export type ConversationUpdateRequest = ConversationRequestBase & {
export interface KnowledgeBaseEntry {
'@timestamp': string;
id: string;
title?: string;
text: string;
doc_id: string;
confidence: 'low' | 'medium' | 'high';
is_correction: boolean;
type?: 'user_instruction' | 'contextual';
Expand All @@ -96,12 +96,12 @@ export interface KnowledgeBaseEntry {
}

export interface Instruction {
doc_id: string;
id: string;
text: string;
}

export interface AdHocInstruction {
doc_id?: string;
id?: string;
Review comment from sorenlouv (Member, Author), Aug 27, 2024:
`doc_id` can be used by the LLM to look up entries. I see no reason to extend that concept to instructions. Instructions can still have predetermined ids; they do not have to be UUIDs. See the Lens docs for an example of this.

text: string;
instruction_type: 'user_instruction' | 'application_instruction';
}
Expand Down
Expand Up @@ -7,6 +7,16 @@
import { ShortIdTable } from './short_id_table';

describe('shortIdTable', () => {
it('generates a short id from a uuid', () => {
const table = new ShortIdTable();

const uuid = 'd877f65c-4036-42c4-b105-19e2f1a1c045';
const shortId = table.take(uuid);

expect(shortId.length).toBe(4);
expect(table.lookup(shortId)).toBe(uuid);
});

it('generates at least 10k unique ids consistently', () => {
const ids = new Set();

Expand Down
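The test above only pins down the contract of `ShortIdTable`: `take` maps a long id (here a UUID) to a 4-character short id, and `lookup` maps it back. A minimal sketch of a table that would satisfy that contract is shown below. It is not the Kibana implementation; the class name `ShortIdTableSketch`, the lowercase alphabet, and the counter-based allocation are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not the actual ShortIdTable from the PR.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyz';
const ID_LENGTH = 4;
const CAPACITY = Math.pow(ALPHABET.length, ID_LENGTH); // 26^4 = 456,976 ids

class ShortIdTableSketch {
  private readonly shortToLong = new Map<string, string>();
  private readonly longToShort = new Map<string, string>();
  private counter = 0;

  // Returns the short id for `longId`, allocating a new one on first use.
  take(longId: string): string {
    const existing = this.longToShort.get(longId);
    if (existing !== undefined) {
      return existing;
    }
    const shortId = this.encode(this.counter++);
    this.shortToLong.set(shortId, longId);
    this.longToShort.set(longId, shortId);
    return shortId;
  }

  // Resolves a short id back to the long id it was allocated for.
  lookup(shortId: string): string | undefined {
    return this.shortToLong.get(shortId);
  }

  // Encodes `n` as a fixed-width base-26 string, e.g. 0 -> "aaaa".
  private encode(n: number): string {
    if (n >= CAPACITY) {
      throw new Error('Short id space exhausted');
    }
    let out = '';
    let rest = n;
    for (let i = 0; i < ID_LENGTH; i++) {
      out = ALPHABET.charAt(rest % ALPHABET.length) + out;
      rest = Math.floor(rest / ALPHABET.length);
    }
    return out;
  }
}
```

Short ids like these are cheaper for an LLM to echo back than full UUIDs, which is presumably why the table was re-added in this PR; the sketch trades randomness for a simple sequential scheme.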
Expand Up @@ -52,9 +52,9 @@ const schema: RootSchema<RecallRanking> = {
},
};

export const RecallRankingEventType = 'observability_ai_assistant_recall_ranking';
export const recallRankingEventType = 'observability_ai_assistant_recall_ranking';

export const recallRankingEvent: EventTypeOpts<RecallRanking> = {
eventType: RecallRankingEventType,
eventType: recallRankingEventType,
schema,
};
Expand Up @@ -5,7 +5,7 @@
* 2.0.
*/

import { KnowledgeBaseType } from '../../common/types';
import { v4 } from 'uuid';
import type { FunctionRegistrationParameters } from '.';
import { KnowledgeBaseEntryRole } from '../../common';

Expand All @@ -14,6 +14,7 @@ export const SUMMARIZE_FUNCTION_NAME = 'summarize';
export function registerSummarizationFunction({
client,
functions,
resources,
}: FunctionRegistrationParameters) {
functions.registerFunction(
{
Expand All @@ -28,10 +29,10 @@ export function registerSummarizationFunction({
parameters: {
type: 'object',
properties: {
id: {
title: {
type: 'string',
description:
'An id for the document. This should be a short human-readable keyword field with only alphabetic characters and underscores, that allow you to update it later.',
'A human readable title that can be used to identify the document later. This should be no longer than 255 characters',
},
text: {
type: 'string',
Expand All @@ -54,29 +55,31 @@ export function registerSummarizationFunction({
},
},
required: [
'id' as const,
'title' as const,
'text' as const,
'is_correction' as const,
'confidence' as const,
'public' as const,
],
},
},
(
{ arguments: { id, text, is_correction: isCorrection, confidence, public: isPublic } },
async (
{ arguments: { title, text, is_correction: isCorrection, confidence, public: isPublic } },
signal
) => {
const id = v4();
resources.logger.debug(`Creating new knowledge base entry with id: ${id}`);

return client
.addKnowledgeBaseEntry({
entry: {
doc_id: id,
role: KnowledgeBaseEntryRole.AssistantSummarization,
id,
title,
text,
is_correction: isCorrection,
type: KnowledgeBaseType.Contextual,
confidence,
public: isPublic,
role: KnowledgeBaseEntryRole.AssistantSummarization,
confidence,
is_correction: isCorrection,
labels: {},
},
// signal,
Expand Down
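The hunk above stops asking the model for an `id` and instead generates one with `v4()`, keeping the model-supplied text as a `title`. The sketch below illustrates why that avoids accidental overwrites. It is a simplified model, not the PR's code: `InMemoryKnowledgeBase`, `summarizeWithModelId`, `summarizeWithGeneratedId`, and `createCounterIdGenerator` are hypothetical names, and the counter-based generator is a stand-in for `v4` from the `uuid` package.

```typescript
interface SketchEntry {
  id: string;
  title: string;
  text: string;
}

// Hypothetical stand-in for the knowledge base index:
// saving an entry under an existing id replaces the stored document.
class InMemoryKnowledgeBase {
  private readonly docs = new Map<string, SketchEntry>();

  save(entry: SketchEntry): void {
    this.docs.set(entry.id, entry);
  }

  count(): number {
    return this.docs.size;
  }
}

// Old behaviour: the model chooses the id, so two summaries that reuse
// the same keyword silently collapse into one document.
function summarizeWithModelId(kb: InMemoryKnowledgeBase, id: string, text: string): void {
  kb.save({ id, title: id, text });
}

// New behaviour: the id is generated per call; the model only supplies a title.
function summarizeWithGeneratedId(
  kb: InMemoryKnowledgeBase,
  title: string,
  text: string,
  generateId: () => string
): string {
  const id = generateId();
  kb.save({ id, title, text });
  return id;
}

// Deterministic stand-in for uuid's v4(), used only to keep the sketch dependency-free.
function createCounterIdGenerator(): () => string {
  let n = 0;
  return () => `entry-${++n}`;
}
```

With the first function, two calls using the id `user_preferences` leave one stored document; with the second, two calls using the title `User preferences` leave two documents with distinct ids.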
Expand Up @@ -42,7 +42,7 @@ const chatCompleteBaseRt = t.type({
]),
instructions: t.array(
t.intersection([
t.partial({ doc_id: t.string }),
t.partial({ id: t.string }),
Review comment from sorenlouv (Member, Author), Aug 27, 2024:
It's still possible to overwrite existing instructions by specifying the `id`.

t.type({
text: t.string,
instruction_type: t.union([
Expand Down
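For readers less familiar with io-ts, `t.intersection([t.partial({ id: t.string }), t.type({ ... })])` accepts an object whose `id` is optional and whose remaining fields are required. A rough hand-written equivalent is sketched below. The names `AdHocInstructionInput` and `isAdHocInstructionInput` are hypothetical, and the two `instruction_type` literals are taken from the `AdHocInstruction` interface earlier in this diff, since the route's own union is truncated in this view.

```typescript
type InstructionType = 'user_instruction' | 'application_instruction';

// Assumed shape: optional id, required text and instruction_type.
interface AdHocInstructionInput {
  id?: string;
  text: string;
  instruction_type: InstructionType;
}

// Hand-rolled approximation of what the io-ts codec checks at runtime.
function isAdHocInstructionInput(value: unknown): value is AdHocInstructionInput {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  const id = candidate['id'];
  const instructionType = candidate['instruction_type'];

  // `id` may be missing or undefined, but if present it must be a string.
  const idOk = id === undefined || typeof id === 'string';
  const textOk = typeof candidate['text'] === 'string';
  const typeOk =
    instructionType === 'user_instruction' || instructionType === 'application_instruction';

  return idOk && textOk && typeOk;
}
```

A caller that passes a known `id` can therefore still target an existing instruction, while a caller that omits it leaves id assignment to the server.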
Expand Up @@ -7,6 +7,7 @@
import { notImplemented } from '@hapi/boom';
import { nonEmptyStringRt, toBooleanRt } from '@kbn/io-ts-utils';
import * as t from 'io-ts';
import { v4 } from 'uuid';
import { FunctionDefinition } from '../../../common/functions/types';
import { KnowledgeBaseEntryRole } from '../../../common/types';
import type { RecalledEntry } from '../../service/knowledge_base_service';
Expand Down Expand Up @@ -114,19 +115,19 @@ const functionRecallRoute = createObservabilityAIAssistantServerRoute({
throw notImplemented();
}

return client.recall({ queries, categories });
const entries = await client.recall({ queries, categories });
return { entries };
},
});

const functionSummariseRoute = createObservabilityAIAssistantServerRoute({
endpoint: 'POST /internal/observability_ai_assistant/functions/summarize',
params: t.type({
body: t.type({
id: t.string,
title: t.string,
text: nonEmptyStringRt,
confidence: t.union([t.literal('low'), t.literal('medium'), t.literal('high')]),
is_correction: toBooleanRt,
type: t.union([t.literal('user_instruction'), t.literal('contextual')]),
public: toBooleanRt,
labels: t.record(t.string, t.string),
}),
Expand All @@ -142,22 +143,20 @@ const functionSummariseRoute = createObservabilityAIAssistantServerRoute({
}

const {
title,
confidence,
id,
is_correction: isCorrection,
type,
text,
public: isPublic,
labels,
} = resources.params.body;

return client.addKnowledgeBaseEntry({
entry: {
title,
confidence,
id,
doc_id: id,
id: v4(),
is_correction: isCorrection,
type,
text,
public: isPublic,
labels,
Expand Down
Expand Up @@ -9,16 +9,12 @@ import type {
MlDeploymentAllocationState,
MlDeploymentState,
} from '@elastic/elasticsearch/lib/api/types';
import pLimit from 'p-limit';
import { notImplemented } from '@hapi/boom';
import { nonEmptyStringRt, toBooleanRt } from '@kbn/io-ts-utils';
import * as t from 'io-ts';
import { createObservabilityAIAssistantServerRoute } from '../create_observability_ai_assistant_server_route';
import {
Instruction,
KnowledgeBaseEntry,
KnowledgeBaseEntryRole,
KnowledgeBaseType,
} from '../../../common/types';
import { Instruction, KnowledgeBaseEntry, KnowledgeBaseEntryRole } from '../../../common/types';

const getKnowledgeBaseStatus = createObservabilityAIAssistantServerRoute({
endpoint: 'GET /internal/observability_ai_assistant/kb/status',
Expand Down Expand Up @@ -108,18 +104,8 @@ const saveKnowledgeBaseUserInstruction = createObservabilityAIAssistantServerRou
}

const { id, text, public: isPublic } = resources.params.body;
return client.addKnowledgeBaseEntry({
entry: {
id,
doc_id: id,
text,
public: isPublic,
confidence: 'high',
is_correction: false,
type: KnowledgeBaseType.UserInstruction,
labels: {},
role: KnowledgeBaseEntryRole.UserEntry,
},
return client.addUserInstruction({
entry: { id, text, public: isPublic },
});
},
});
Expand Down Expand Up @@ -153,26 +139,29 @@ const getKnowledgeBaseEntries = createObservabilityAIAssistantServerRoute({
},
});

const knowledgeBaseEntryRt = t.intersection([
t.type({
id: t.string,
title: t.string,
text: nonEmptyStringRt,
}),
t.partial({
confidence: t.union([t.literal('low'), t.literal('medium'), t.literal('high')]),
is_correction: toBooleanRt,
public: toBooleanRt,
labels: t.record(t.string, t.string),
role: t.union([
t.literal(KnowledgeBaseEntryRole.AssistantSummarization),
t.literal(KnowledgeBaseEntryRole.UserEntry),
t.literal(KnowledgeBaseEntryRole.Elastic),
]),
}),
]);

const saveKnowledgeBaseEntry = createObservabilityAIAssistantServerRoute({
endpoint: 'POST /internal/observability_ai_assistant/kb/entries/save',
params: t.type({
body: t.intersection([
t.type({
id: t.string,
text: nonEmptyStringRt,
}),
t.partial({
confidence: t.union([t.literal('low'), t.literal('medium'), t.literal('high')]),
is_correction: toBooleanRt,
public: toBooleanRt,
labels: t.record(t.string, t.string),
role: t.union([
t.literal('assistant_summarization'),
t.literal('user_entry'),
t.literal('elastic'),
]),
}),
]),
body: knowledgeBaseEntryRt,
}),
options: {
tags: ['access:ai_assistant'],
Expand All @@ -184,27 +173,15 @@ const saveKnowledgeBaseEntry = createObservabilityAIAssistantServerRoute({
throw notImplemented();
}

const {
id,
text,
public: isPublic,
confidence,
is_correction: isCorrection,
labels,
role,
} = resources.params.body;

const entry = resources.params.body;
return client.addKnowledgeBaseEntry({
entry: {
id,
text,
doc_id: id,
confidence: confidence ?? 'high',
is_correction: isCorrection ?? false,
type: 'contextual',
public: isPublic ?? true,
labels: labels ?? {},
role: (role as KnowledgeBaseEntryRole) ?? KnowledgeBaseEntryRole.UserEntry,
confidence: 'high',
is_correction: false,
public: true,
labels: {},
role: KnowledgeBaseEntryRole.UserEntry,
...entry,
},
});
},
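The save handler above replaces per-field `??` fallbacks with a defaults-then-spread pattern: defaults are listed first and `...entry` overrides whatever the request supplied. A minimal sketch of that pattern follows. The names `EntryInput`, `CompleteEntry`, and `withDefaults` are hypothetical, and the role literals are assumed to match the string values that appear in the removed `t.literal(...)` lines.

```typescript
type Confidence = 'low' | 'medium' | 'high';
type Role = 'assistant_summarization' | 'user_entry' | 'elastic';

// Mirrors the shape of knowledgeBaseEntryRt: three required fields, the rest optional.
interface EntryInput {
  id: string;
  title: string;
  text: string;
  confidence?: Confidence;
  is_correction?: boolean;
  public?: boolean;
  labels?: Record<string, string>;
  role?: Role;
}

type CompleteEntry = Required<EntryInput>;

function withDefaults(entry: EntryInput): CompleteEntry {
  // Built per call so the `labels` object is never shared between entries.
  const defaults: Omit<CompleteEntry, 'id' | 'title' | 'text'> = {
    confidence: 'high',
    is_correction: false,
    public: true,
    labels: {},
    role: 'user_entry',
  };
  // Later spreads win: any field present on `entry` replaces its default.
  return { ...defaults, ...entry };
}
```

One difference from `??` worth keeping in mind: object spread copies a key that is present with the value `undefined`, so it would replace the default. That should not arise for JSON request bodies, which cannot carry `undefined`, but it matters if the same pattern is reused on objects built in code.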
Expand Down Expand Up @@ -235,12 +212,7 @@ const importKnowledgeBaseEntries = createObservabilityAIAssistantServerRoute({
endpoint: 'POST /internal/observability_ai_assistant/kb/entries/import',
params: t.type({
body: t.type({
entries: t.array(
t.type({
id: t.string,
text: nonEmptyStringRt,
})
),
entries: t.array(knowledgeBaseEntryRt),
}),
}),
options: {
Expand All @@ -253,18 +225,29 @@ const importKnowledgeBaseEntries = createObservabilityAIAssistantServerRoute({
throw notImplemented();
}

const entries = resources.params.body.entries.map((entry) => ({
doc_id: entry.id,
confidence: 'high' as KnowledgeBaseEntry['confidence'],
is_correction: false,
type: 'contextual' as const,
public: true,
labels: {},
role: KnowledgeBaseEntryRole.UserEntry,
...entry,
}));

return await client.importKnowledgeBaseEntries({ entries });
const status = await client.getKnowledgeBaseStatus();
if (!status.ready) {
throw new Error('Knowledge base is not ready');
}

const limiter = pLimit(5);

const promises = resources.params.body.entries.map(async (entry) => {
return limiter(async () => {
return client.addKnowledgeBaseEntry({
entry: {
confidence: 'high',
is_correction: false,
public: true,
labels: {},
role: KnowledgeBaseEntryRole.UserEntry,
...entry,
},
});
});
});

await Promise.all(promises);
},
});
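The import handler above uses `pLimit(5)` so that at most five `addKnowledgeBaseEntry` calls are in flight at once. The sketch below shows the idea behind such a limiter without the dependency. It is an illustration of the technique, not the `p-limit` source; `createLimiter` is a hypothetical name.

```typescript
type Task<T> = () => Promise<T>;

// Returns a `limit` function that runs at most `concurrency` tasks at a time
// and queues the rest in FIFO order.
function createLimiter(concurrency: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  // Called when a task settles: free the slot and start the next queued task.
  const next = (): void => {
    active--;
    const run = queue.shift();
    if (run) {
      run();
    }
  };

  return function limit<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = (): void => {
        active++;
        let result: Promise<T>;
        try {
          result = task();
        } catch (error) {
          // Treat a synchronous throw like a rejected task so the slot is released.
          result = Promise.reject(error);
        }
        // The handlers settle the outer promise; the chained `then` always runs afterwards.
        result.then(resolve, reject).then(next);
      };

      if (active < concurrency) {
        run();
      } else {
        queue.push(run);
      }
    });
  };
}
```

As in the handler, the limited promises can be collected with `Promise.all`. Note that `Promise.all` rejects as soon as one task fails, while tasks that are already running or queued still execute, so a failed import may leave some entries written.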

Expand All @@ -273,8 +256,8 @@ export const knowledgeBaseRoutes = {
...getKnowledgeBaseStatus,
...getKnowledgeBaseEntries,
...saveKnowledgeBaseUserInstruction,
...getKnowledgeBaseUserInstructions,
...importKnowledgeBaseEntries,
...getKnowledgeBaseUserInstructions,
...saveKnowledgeBaseEntry,
...deleteKnowledgeBaseEntry,
};