Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Use existing Knowledge Base #638

Open
wants to merge 17 commits into
base: v2
Choose a base branch
from

Conversation

fsatsuki
Copy link
Contributor

Issue #, if available:
None

Description of changes:
You can also use a manually created knowledge base or another chatbot knowledge base.
This PR depends on #607

Screenshot 2024-12-10 at 10 45 58

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@axelpina
Copy link

Hi,

There's a typo in has_exist_knowlednge_base_id -> has_exist_knowledge_base_id.

Copy link
Contributor

@statefb statefb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

今回もコメントが多く申し訳ありません。ご確認お願いします!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

このファイルのdiffにはどのような意図がありますか?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CDKの方でCfnOutputの出力順序を変更しています。
新規にKBを作成するルートと、既存のKBを使うルートで処理の分岐が発生し、CfnOutputの順番が変動するためです。
そのため、常に配列の0番目を取得する stack_output[0] では目的の値が取れなくなったためコードを変更しています。

self.assertEqual(
content[1].body,
base64.b64encode(content[1].body).decode(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

このdiffにはどのような意図がありますか?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

この処理を入れることでエラーが解消されます。

[error log]

bash-5.2$ cd /projects/bedrock-claude-chat/backend && poetry run python tests/test_repositories/test_conversation.py
..F.
======================================================================
FAIL: test_store_and_find_conversation (__main__.TestConversationRepository.test_store_and_find_conversation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/projects/bedrock-claude-chat/backend/tests/test_repositories/test_conversation.py", line 302, in test_store_and_find_conversation
    self.assertEqual(
AssertionError: b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00[155 chars]\x82" != 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1[48 chars]YII='

----------------------------------------------------------------------
Ran 4 tests in 0.024s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internet toolおよびconverse apiをモックされている理由を教えていただけますか?スロットリング制限回避のためでしょうか。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

すいません、これ削除します。

@@ -26,6 +33,22 @@ def create_test_private_bot(
bedrock_knowledge_base=None,
bedrock_guardrails=None,
):
if bedrock_knowledge_base is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BotModel.bedrock_knowledge_baseはNone許容だと思いますが、ここでNoneを回避し作成している意図を教えていただけますか?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

テストコードのデバッグ中に試行錯誤した時のものでした。
テストが通るようになっているため、元に戻します。

startIngestionJob.next(getIngestionJob).next(checkIngestionJobStatus)
new sfn.Choice(this, "CheckDataSourceId")
.when(
sfn.Condition.not(sfn.Condition.stringEquals("$.DataSourceId", "")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataSourceIDの取得により分岐をなくせないか検討ください!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sfnの中で流通させるデータ構造を変更することで実装できました。

@@ -548,6 +556,7 @@ def fetch_all_bots_by_user_id(
sync_status=bot.sync_status,
has_knowledge=bot.has_knowledge(),
has_agent=bot.is_agent_enabled(),
has_exist_knowledge_base_id=bot.has_exist_knowledge_base_id(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alias変更不要?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_exist_knowledge_base_idを削除

@@ -641,6 +650,7 @@ def fetch_bot_summary(user_id: str, bot_id: str) -> BotSummaryOutput:
owned=True,
sync_status=bot.sync_status,
has_knowledge=bot.has_knowledge(),
has_exist_knowledge_base_id=bot.has_exist_knowledge_base_id(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BotSummaryにhas_exist_knowledge_base_idは不要、has_knowledgeで十分かと思いました

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_exist_knowledge_base_idを削除

@@ -266,7 +267,7 @@ def chat(
if display_citation:
instructions.append(PROMPT_TO_CITE_TOOL_RESULTS)

elif bot.has_knowledge():
elif bot.has_knowledge() or bot.has_exist_knowledge_base_id():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ここで良いのかわかりませんが)has_knowledgeはメソッドの意味合い的にkb_idを保持するかどうかも包含すべきかなと思います。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_exist_knowledge_base_idを削除
def has_knowledge() を変更して、既存のKBを使う時もTrueを返すように変更します。

@@ -38,12 +39,12 @@ def handler(event, context):
elif output["OutputKey"] == "GuardrailVersion":
guardrail_version = output["OutputValue"]

result = []
for data_source_id in data_source_ids:
# If no data sources exist, create a result with just the KnowledgeBaseId
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataSource分岐不要?(BotStack側のコメント参照)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

調査中

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

削除できました。

kb = KnowledgeBase.fromKnowledgeBaseAttributes(this, 'MyKnowledgeBase', {
knowledgeBaseId: props.existKnowledgeBaseId,
executionRoleArn: executionRoleArn
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataSourceを取得しStackOutuputするようにすれば、他の分岐がなくなり可読性が高くなるかと思いますが、いかがでしょううか?

(動作確認していませんが)たとえば下記のように取得し、fromDataSourceIDなどのメソッドを利用し実装可能かどうかご検討ください

const customResource = new AwsCustomResource(stack, 'ListDataSources', {
    onCreate: {
        service: 'Bedrock',
        action: 'listDataSources',
        parameters: {
            knowledgeBaseId: 'your-knowledge-base-id',
        },
        physicalResourceId: AwsCustomResource.PhysicalResourceId.of('ListDataSources'),
    },
    policy: AwsCustomResourcePolicy.fromSdkCalls({ resources: AwsCustomResourcePolicy.ANY_RESOURCE }),
});

const dataSourceIds = customResource.getResponseField('dataSources');

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sfnの中で流通させるデータ構造を変更することで条件分岐をなくすことができました。

@statefb statefb self-assigned this Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants