Feature/bedrockagent knowledge base #36783

Merged

@markoskandylis markoskandylis commented Apr 8, 2024

Description

New resource for Bedrock Agent Knowledge Base.

https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateKnowledgeBase.html#bedrock-agent_CreateKnowledgeBase-request-storageConfiguration

Relations

Relates #35543

References

Output from Acceptance Testing

make testacc TESTARGS='-run=TestAccBedrockAgent_serial/KnowledgeBase' PKG=bedrockagent  ACCTEST_PARALLELISM=1
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.21.8 test ./internal/service/bedrockagent/... -v -count 1 -parallel 1  -run=TestAccBedrockAgent_serial/KnowledgeBase -timeout 360m
=== RUN   TestAccBedrockAgent_serial
=== PAUSE TestAccBedrockAgent_serial
=== CONT  TestAccBedrockAgent_serial
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/basic
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/disappears
=== PAUSE TestAccBedrockAgent_serial/KnowledgeBase/disappears
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/update
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/tags
=== CONT  TestAccBedrockAgent_serial/KnowledgeBase/disappears
--- PASS: TestAccBedrockAgent_serial (201.13s)
    --- PASS: TestAccBedrockAgent_serial/KnowledgeBase (164.01s)
        --- PASS: TestAccBedrockAgent_serial/KnowledgeBase/basic (40.48s)
        --- PASS: TestAccBedrockAgent_serial/KnowledgeBase/update (57.05s)
        --- PASS: TestAccBedrockAgent_serial/KnowledgeBase/tags (66.48s)
        --- PASS: TestAccBedrockAgent_serial/KnowledgeBase/disappears (37.12s)
PASS
ok  	github.com/hashicorp/terraform-provider-aws/internal/service/bedrockagent	207.281s

...

github-actions bot commented Apr 8, 2024

Community Note

Voting for Prioritization

  • Please vote on this pull request by adding a 👍 reaction to the original post to help the community and maintainers prioritize this pull request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

For Submitters

  • Review the contribution guide relating to the type of change you are making to ensure all of the necessary steps have been taken.
  • For new resources and data sources, use skaff to generate scaffolding with comments detailing common expectations.
  • Whether or not the branch has been rebased will not impact prioritization, but doing so is always a welcome surprise.

@github-actions github-actions bot added size/XL, documentation, tests, tags, repository, generators, github_actions, service/bedrockagent labels Apr 8, 2024
@terraform-aws-provider terraform-aws-provider bot added needs-triage, partner labels Apr 8, 2024
@markoskandylis markoskandylis marked this pull request as ready for review April 8, 2024 14:23
@justinretzolk justinretzolk added new-resource and removed needs-triage labels Apr 8, 2024
@ewbankkit ewbankkit self-assigned this Apr 12, 2024
@terraform-aws-provider terraform-aws-provider bot added the prioritized label Apr 12, 2024
@ewbankkit
Contributor

@markoskandylis I keep getting 403 errors:

% make testacc TESTARGS='-run=TestAccBedrockAgent_serial/KnowledgeBase/basic' PKG=bedrockagent
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go1.21.8 test ./internal/service/bedrockagent/... -v -count 1 -parallel 20  -run=TestAccBedrockAgent_serial/KnowledgeBase/basic -timeout 360m
=== RUN   TestAccBedrockAgent_serial
=== PAUSE TestAccBedrockAgent_serial
=== CONT  TestAccBedrockAgent_serial
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/basic
    knowledge_base_test.go:29: Step 1/2 error: Error running apply: exit status 1
        
        Error: creating Amazon Agents for Amazon Bedrock Knowledge Base ("tf-acc-test-24179290447777774"): operation error Bedrock Agent: CreateKnowledgeBase, https response error StatusCode: 400, RequestID: b3bad187-e8e2-40fc-8797-5096938c831d, ValidationException: The knowledge base storage configuration provided is invalid... Request failed: [security_exception] 403 Forbidden
        
          with aws_bedrockagent_knowledge_base.test,
          on terraform_plugin_test.tf line 94, in resource "aws_bedrockagent_knowledge_base" "test":
          94: resource "aws_bedrockagent_knowledge_base" "test" {
        
        operation error Bedrock Agent: CreateKnowledgeBase, https response error
        StatusCode: 400, RequestID: b3bad187-e8e2-40fc-8797-5096938c831d,
        ValidationException: The knowledge base storage configuration provided is
        invalid... Request failed: [security_exception] 403 Forbidden
--- FAIL: TestAccBedrockAgent_serial (374.97s)
    --- FAIL: TestAccBedrockAgent_serial/KnowledgeBase (374.97s)
        --- FAIL: TestAccBedrockAgent_serial/KnowledgeBase/basic (374.97s)
FAIL
FAIL	github.com/hashicorp/terraform-provider-aws/internal/service/bedrockagent	385.489s
FAIL
make: *** [testacc] Error 1

@markoskandylis
Contributor Author

markoskandylis commented Apr 12, 2024

Hi @ewbankkit, did you create the indexes in OpenSearch? Sadly, they are not created automatically through the API.

@ewbankkit
Contributor

@markoskandylis No, trying to do everything hermetically.
We'll need to use the OpenSearch Provider: https://registry.terraform.io/providers/opensearch-project/opensearch/latest/docs.
Next week...

@ewbankkit
Contributor

@markoskandylis I am having problems creating any OpenSearch index using the Terraform provider (seems OK via the AWS Console):

provider "aws" {}

data "aws_caller_identity" "current" {}
data "aws_partition" "current" {}

resource "aws_opensearchserverless_security_policy" "test_encryption" {
  name = "ewbankkit-test"
  type = "encryption"

  policy = jsonencode({
    "Rules" = [
      {
        "Resource" = [
          "collection/ewbankkit-test"
        ],
        "ResourceType" = "collection"
      }
    ],
    "AWSOwnedKey" = true
  })
}

resource "aws_opensearchserverless_security_policy" "test_network" {
  name = "ewbankkit-test"
  type = "network"

  policy = jsonencode([{
    "Rules" = [
      {
        "Resource" = [
          "collection/ewbankkit-test"
        ],
        "ResourceType" = "collection"
      }
    ],
    "AllowFromPublic" = true
  }])
}

resource "aws_opensearchserverless_access_policy" "test" {
  name = "ewbankkit-test"
  type = "data"

  policy = jsonencode([
    {
      "Rules" = [
        {
          "ResourceType" = "index",
          "Resource" = [
            "index/ewbankkit-test/*"
          ],
          "Permission" = [
            "aoss:*"
          ]
        }
      ],
      "Principal" = [
        "${data.aws_caller_identity.current.arn}"
      ]
    },
    {
      "Rules" = [
        {
          "Resource" = [
            "collection/ewbankkit-test"
          ],
          "ResourceType" = "collection"
          "Permission" = [
            "aoss:*"
          ]
        }
      ],
      "Principal" = [
        "${data.aws_caller_identity.current.arn}"
      ]
    }
  ])
}

resource "aws_opensearchserverless_collection" "test" {
  name = "ewbankkit-test"
  type = "VECTORSEARCH"

  depends_on = [aws_opensearchserverless_security_policy.test_encryption, aws_opensearchserverless_security_policy.test_network, aws_opensearchserverless_access_policy.test]
}

provider "opensearch" {
  url = aws_opensearchserverless_collection.test.collection_endpoint
}

resource "opensearch_index" "test" {
  count = 1

  name               = "ewbankkit-test"
  number_of_shards   = "1"
  number_of_replicas = "1"
  mappings           = <<EOF
{
  "properties": {
    "name": {
      "type": "text"
    }
  }
}
EOF
}

gives

opensearch_index.test[0]: Creating...
╷
│ Error: HEAD healthcheck failed: This is usually due to network or permission issues. The underlying error isn't accessible, please debug by disabling healthchecks.
│ 
│   with opensearch_index.test[0],
│   on main.tf line 100, in resource "opensearch_index" "test":
│  100: resource "opensearch_index" "test" {

during apply (or something similar if I set healthcheck = false -- see https://registry.terraform.io/providers/opensearch-project/opensearch/latest/docs).

Have you had any luck?
Given the number of upvotes for this service we would like to have an end-to-end test of creating the OpenSearch Serverless collection, vector indexes and Bedrock KB.
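
One thing that may be worth checking for the `HEAD healthcheck failed` error above: OpenSearch Serverless endpoints expect SigV4-signed requests, and the `provider "opensearch"` block in the config sets only `url`. The following is a minimal, untested sketch. It assumes the `sign_aws_requests`, `aws_region`, and `aws_signature_service` provider arguments are available in the provider version in use (check the provider documentation), and the region value is a placeholder.

```hcl
provider "opensearch" {
  url         = aws_opensearchserverless_collection.test.collection_endpoint
  healthcheck = false

  # Assumption: sign requests with SigV4 using the OpenSearch Serverless
  # service name rather than the provider's default.
  sign_aws_requests     = true
  aws_region            = "us-west-2" # placeholder region
  aws_signature_service = "aoss"
}
```

Whether this resolves the error depends on the provider version, so treat it as a debugging aid rather than a confirmed fix.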

@markoskandylis
Contributor Author

Hi @ewbankkit, I haven't had time to check on that at all. I will have a look tomorrow morning.

@ewbankkit ewbankkit mentioned this pull request Apr 19, 2024
6 tasks
ewbankkit and others added 2 commits April 19, 2024 12:58
# Conflicts:
#	internal/service/bedrockagent/exports_test.go
#	internal/service/bedrockagent/service_package_gen.go
@markoskandylis
Contributor Author

Hi @ewbankkit, I added an RDS test with local-exec, but I didn't have time to properly test it end to end. I will try to find some time before Thursday. The only caveat is that you need the psql client installed locally, plus VPC access or a public RDS instance, to create the tables.
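
For readers who want to try the same approach, here is a rough sketch of what a `local-exec` step running `psql` can look like. It is not the test code from this PR: the variable names, resource name, schema, and table layout are all assumptions. The table is a generic pgvector layout with id, embedding, text, and metadata columns, sized for a 1536-dimension embedding model. It needs a `psql` client on the machine running Terraform and network access to the database.

```hcl
# Hypothetical inputs; none of these names come from the PR.
variable "db_host" {
  type = string
}

variable "db_name" {
  type = string
}

variable "db_user" {
  type = string
}

variable "db_password" {
  type      = string
  sensitive = true
}

# Runs psql locally to prepare a pgvector table before the knowledge base
# is created. Requires the hashicorp/null provider.
resource "null_resource" "kb_table" {
  provisioner "local-exec" {
    # ON_ERROR_STOP makes psql exit non-zero on the first failed statement,
    # which in turn fails the Terraform apply.
    command = <<-EOT
      psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 \
        -c "CREATE EXTENSION IF NOT EXISTS vector;" \
        -c "CREATE SCHEMA IF NOT EXISTS bedrock_integration;" \
        -c "CREATE TABLE IF NOT EXISTS bedrock_integration.bedrock_kb (id uuid PRIMARY KEY, embedding vector(1536), chunks text, metadata json);"
    EOT

    # psql reads the password from PGPASSWORD, which keeps it off the command line.
    environment = {
      DB_HOST    = var.db_host
      DB_NAME    = var.db_name
      DB_USER    = var.db_user
      PGPASSWORD = var.db_password
    }
  }
}
```

A knowledge base resource backed by this table would then list `null_resource.kb_table` in its `depends_on` so the table exists before `CreateKnowledgeBase` is called.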

@markoskandylis
Contributor Author

End-to-end RDS test:

Admin:~/environment/terraform-provider-aws (feature/bedrockagent-knowledge-base) $ make testacc TESTARGS='-run=TestAccBedrockAgent_serial/KnowledgeBase/rds' PKG=bedrockagent  ACCTEST_PARALLELISM=1
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/bedrockagent/... -v -count 1 -parallel 1  -run=TestAccBedrockAgent_serial/KnowledgeBase/rds -timeout 360m
=== RUN   TestAccBedrockAgent_serial
=== PAUSE TestAccBedrockAgent_serial
=== CONT  TestAccBedrockAgent_serial
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase
=== RUN   TestAccBedrockAgent_serial/KnowledgeBase/rds
--- PASS: TestAccBedrockAgent_serial (1767.28s)
    --- PASS: TestAccBedrockAgent_serial/KnowledgeBase (1767.28s)
        --- PASS: TestAccBedrockAgent_serial/KnowledgeBase/rds (1767.28s)
PASS
ok      github.com/hashicorp/terraform-provider-aws/internal/service/bedrockagent       1767.442s

Contributor

@ewbankkit ewbankkit left a comment

LGTM 🚀.

@ewbankkit
Contributor

@markoskandylis Thanks for the contribution 🎉 👏.

@ewbankkit ewbankkit merged commit 2b339c6 into hashicorp:main Apr 24, 2024
43 checks passed
@github-actions github-actions bot added this to the v5.47.0 milestone Apr 24, 2024
@github-actions github-actions bot removed the prioritized label Apr 26, 2024

This functionality has been released in v5.47.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@acwwat
Contributor

acwwat commented May 12, 2024

@ewbankkit I was able to get the opensearch provider to create the vector index. The tricky part was that the latest version of the provider, v2.2.1, is not working because of this issue. After downgrading to v2.2.0, the index is created successfully. The index spec is the same as what I observed in the network trace when I created the KB with the quick start AOSS vector store in the Bedrock Console.

Here's the Terraform config that should cover the end-to-end KB setup (please excuse the resource names and variables, as I am writing this for my own use case). I hope it makes a good reference for creating the minimal config for acceptance tests.

variable "kb_s3_bucket_name_prefix" {
  description = "The name prefix of the S3 bucket for the data source of the knowledge base."
  type        = string
  default     = "forex-kb"
}

variable "kb_oss_collection_name" {
  description = "The name of the OSS collection for the knowledge base."
  type        = string
  default     = "bedrock-knowledge-base-forex-kb"
}

variable "kb_model_id" {
  description = "The ID of the foundational model used by the knowledge base."
  type        = string
  default     = "amazon.titan-embed-text-v1"
}

variable "kb_name" {
  description = "The knowledge base name."
  type        = string
  default     = "ForexKB"
}

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.47"
    }
    opensearch = {
      source  = "opensearch-project/opensearch"
      version = "= 2.2.0"
    }
  }
  required_version = "~> 1.5"
}

# Use data sources to get common information about the environment
data "aws_caller_identity" "this" {}
data "aws_partition" "this" {}
data "aws_region" "this" {}

data "aws_bedrock_foundation_model" "kb" {
  model_id = var.kb_model_id
}

locals {
  account_id            = data.aws_caller_identity.this.account_id
  partition             = data.aws_partition.this.partition
  region                = data.aws_region.this.name
  region_name_tokenized = split("-", local.region)
  region_short          = "${substr(local.region_name_tokenized[0], 0, 2)}${substr(local.region_name_tokenized[1], 0, 1)}${local.region_name_tokenized[2]}"
}

resource "aws_opensearchserverless_access_policy" "forex_kb" {
  name = var.kb_oss_collection_name
  type = "data"
  policy = jsonencode([
    {
      Rules = [
        {
          ResourceType = "index"
          Resource = [
            "index/${var.kb_oss_collection_name}/*"
          ]
          Permission = [
            "aoss:CreateIndex",
            "aoss:DeleteIndex",
            "aoss:DescribeIndex",
            "aoss:ReadDocument",
            "aoss:UpdateIndex",
            "aoss:WriteDocument"
          ]
        },
        {
          ResourceType = "collection"
          Resource = [
            "collection/${var.kb_oss_collection_name}"
          ]
          Permission = [
            "aoss:CreateCollectionItems",
            "aoss:DescribeCollectionItems",
            "aoss:UpdateCollectionItems"
          ]
        }
      ],
      Principal = [
        aws_iam_role.bedrock_kb_forex_kb.arn,
        data.aws_caller_identity.this.arn
      ]
    }
  ])
}

resource "aws_opensearchserverless_security_policy" "forex_kb_encryption" {
  name = var.kb_oss_collection_name
  type = "encryption"
  policy = jsonencode({
    Rules = [
      {
        Resource = [
          "collection/${var.kb_oss_collection_name}"
        ]
        ResourceType = "collection"
      }
    ],
    AWSOwnedKey = true
  })
}

resource "aws_opensearchserverless_security_policy" "forex_kb_network" {
  name = var.kb_oss_collection_name
  type = "network"
  policy = jsonencode([
    {
      Rules = [
        {
          ResourceType = "collection"
          Resource = [
            "collection/${var.kb_oss_collection_name}"
          ]
        },
        {
          ResourceType = "dashboard"
          Resource = [
            "collection/${var.kb_oss_collection_name}"
          ]
        }
      ]
      AllowFromPublic = true
    }
  ])
}

resource "aws_opensearchserverless_collection" "forex_kb" {
  name = var.kb_oss_collection_name
  type = "VECTORSEARCH"
  depends_on = [
    aws_opensearchserverless_access_policy.forex_kb,
    aws_opensearchserverless_security_policy.forex_kb_encryption,
    aws_opensearchserverless_security_policy.forex_kb_network
  ]
}

provider "opensearch" {
  url         = aws_opensearchserverless_collection.forex_kb.collection_endpoint
  healthcheck = false
}

resource "opensearch_index" "forex_kb" {
  name                           = "bedrock-knowledge-base-default-index"
  number_of_shards               = "2"
  number_of_replicas             = "0"
  index_knn                      = true
  index_knn_algo_param_ef_search = "512"
  mappings                       = <<-EOF
    {
      "properties": {
        "bedrock-knowledge-base-default-vector": {
          "type": "knn_vector",
          "dimension": 1536,
          "method": {
            "name": "hnsw",
            "engine": "faiss",
            "parameters": {
              "m": 16,
              "ef_construction": 512
            },
            "space_type": "l2"
          }
        },
        "AMAZON_BEDROCK_METADATA": {
          "type": "text",
          "index": "false"
        },
        "AMAZON_BEDROCK_TEXT_CHUNK": {
          "type": "text",
          "index": "true"
        }
      }
    }
  EOF
  depends_on                     = [aws_opensearchserverless_collection.forex_kb]
}

# S3 bucket for the knowledge base
resource "aws_s3_bucket" "forex_kb" {
  bucket        = "${var.kb_s3_bucket_name_prefix}-${local.region_short}-${local.account_id}"
  force_destroy = true
}

# Knowledge base resource role
resource "aws_iam_role" "bedrock_kb_forex_kb" {
  name = "AmazonBedrockExecutionRoleForKnowledgeBase_${var.kb_name}"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "bedrock.amazonaws.com"
        }
        Condition = {
          StringEquals = {
            "aws:SourceAccount" = local.account_id
          }
          ArnLike = {
            "aws:SourceArn" = "arn:${local.partition}:bedrock:${local.region}:${local.account_id}:knowledge-base/*"
          }
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "bedrock_kb_forex_kb_model" {
  name = "AmazonBedrockFoundationModelPolicyForKnowledgeBase_${var.kb_name}"
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = "bedrock:InvokeModel"
        Effect   = "Allow"
        Resource = data.aws_bedrock_foundation_model.kb.model_arn
      }
    ]
  })
}

resource "aws_iam_role_policy" "bedrock_kb_forex_kb_oss" {
  name = "AmazonBedrockOSSPolicyForKnowledgeBase_${var.kb_name}"
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = "aoss:APIAccessAll"
        Effect   = "Allow"
        Resource = aws_opensearchserverless_collection.forex_kb.arn
      }
    ]
  })
}

resource "time_sleep" "aws_iam_role_policy_bedrock_kb_forex_kb_oss" {
  create_duration = "10s"
  depends_on      = [aws_iam_role_policy.bedrock_kb_forex_kb_oss]
}

resource "aws_iam_role_policy" "bedrock_kb_forex_kb_s3" {
  name = "AmazonBedrockS3PolicyForKnowledgeBase_${var.kb_name}"
  role = aws_iam_role.bedrock_kb_forex_kb.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid      = "S3ListBucketStatement"
        Action   = "s3:ListBucket"
        Effect   = "Allow"
        Resource = aws_s3_bucket.forex_kb.arn
        Condition = {
          StringEquals = {
            "aws:PrincipalAccount" = local.account_id
          }
        }
      },
      {
        Sid      = "S3GetObjectStatement"
        Action   = "s3:GetObject"
        Effect   = "Allow"
        Resource = "${aws_s3_bucket.forex_kb.arn}/*"
        Condition = {
          StringEquals = {
            "aws:PrincipalAccount" = local.account_id
          }
        }
      }
    ]
  })
}

resource "aws_bedrockagent_knowledge_base" "forex_kb" {
  name     = var.kb_name
  role_arn = aws_iam_role.bedrock_kb_forex_kb.arn
  knowledge_base_configuration {
    vector_knowledge_base_configuration {
      embedding_model_arn = data.aws_bedrock_foundation_model.kb.model_arn
    }
    type = "VECTOR"
  }
  storage_configuration {
    type = "OPENSEARCH_SERVERLESS"
    opensearch_serverless_configuration {
      collection_arn    = aws_opensearchserverless_collection.forex_kb.arn
      vector_index_name = "bedrock-knowledge-base-default-index"
      field_mapping {
        vector_field   = "bedrock-knowledge-base-default-vector"
        text_field     = "AMAZON_BEDROCK_TEXT_CHUNK"
        metadata_field = "AMAZON_BEDROCK_METADATA"
      }
    }
  }
  depends_on = [
    aws_iam_role_policy.bedrock_kb_forex_kb_model,
    aws_iam_role_policy.bedrock_kb_forex_kb_s3,
    opensearch_index.forex_kb,
    time_sleep.aws_iam_role_policy_bedrock_kb_forex_kb_oss
  ]
}

resource "aws_bedrockagent_data_source" "forex_kb" {
  knowledge_base_id = aws_bedrockagent_knowledge_base.forex_kb.id
  name              = "${var.kb_name}DataSource"
  data_source_configuration {
    type = "S3"
    s3_configuration {
      bucket_arn              = aws_s3_bucket.forex_kb.arn
      bucket_owner_account_id = local.account_id
    }
  }
}

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 16, 2024