ai models support cross-region inference profile #3076

Open
zxkane opened this issue Dec 15, 2024 · 6 comments
Labels
aikit (Related to Amplify AI kit), documentation (Improvements or additions to documentation), feature-request (New feature or request), Gen 2

Comments

@zxkane

zxkane commented Dec 15, 2024

Environment information

System:
  OS: Linux 6.8 Ubuntu 22.04.5 LTS 22.04.5 LTS (Jammy Jellyfish)
  CPU: (8) x64 Intel(R) Xeon(R) Platinum 8488C
  Memory: 23.05 GB / 30.82 GB
  Shell: /usr/bin/zsh
Binaries:
  Node: 20.18.0 - ~/.nvm/versions/node/v20.18.0/bin/node
  Yarn: 1.22.19 - ~/.linuxbrew/homebrew/bin/yarn
  npm: 10.8.2 - ~/.nvm/versions/node/v20.18.0/bin/npm
  pnpm: 9.6.0 - ~/.nvm/versions/node/v20.18.0/bin/pnpm
NPM Packages:
  @aws-amplify/auth-construct: 1.5.0
  @aws-amplify/backend: 1.8.0
  @aws-amplify/backend-auth: 1.4.1
  @aws-amplify/backend-cli: 1.4.2
  @aws-amplify/backend-data: 1.2.1
  @aws-amplify/backend-deployer: 1.1.9
  @aws-amplify/backend-function: 1.8.0
  @aws-amplify/backend-output-schemas: 1.4.0
  @aws-amplify/backend-output-storage: 1.1.3
  @aws-amplify/backend-secret: 1.1.5
  @aws-amplify/backend-storage: 1.2.3
  @aws-amplify/cli-core: 1.2.0
  @aws-amplify/client-config: 1.5.2
  @aws-amplify/deployed-backend-client: 1.4.2
  @aws-amplify/form-generator: 1.0.3
  @aws-amplify/model-generator: 1.0.9
  @aws-amplify/platform-core: 1.2.2
  @aws-amplify/plugin-types: 1.5.0
  @aws-amplify/sandbox: 1.2.5
  @aws-amplify/schema-generator: 1.2.5
  aws-amplify: 6.10.2
  aws-cdk: 2.167.1
  aws-cdk-lib: 2.167.1
  typescript: 5.6.3
No AWS environment variables
No CDK environment variables

Describe the feature

Cross-region inference enhances the resilience of Bedrock invocation.

https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html

Use case

Invoke the Bedrock LLMs that support cross-region inference.

@ykethan
Member

ykethan commented Dec 16, 2024

Hey,👋 thanks for raising this! I'm going to transfer this over to our API repository for better assistance 🙂
This is related to aws-amplify/docs#8121 (comment), which provides an example.

ykethan transferred this issue from aws-amplify/amplify-backend on Dec 16, 2024
ykethan added the Gen 2 and aikit labels on Dec 16, 2024
@Siqi-Shan
Member

Hey @zxkane, thanks for raising the issue! Regarding the cross-region inference profile, please take a look at the follow-up on a similar issue: AI kit does not support Cross-region inference #8121. It provides an example of how to implement this within an Amplify backend. Feel free to ask if you have more concerns. Thanks.

Siqi-Shan added the documentation, question, and pending-community-response labels and removed the pending-triage label on Dec 16, 2024
@zxkane
Author

zxkane commented Dec 17, 2024

Thanks for sharing the workaround. That said, it's not trivial to grant additional IAM permissions to the role. It also requires additional logic to support environment-agnostic deployment across multiple regions.

So it would be useful as a built-in feature to mitigate the capacity limitation of those models.

github-actions bot added the pending-maintainer-response label and removed the pending-community-response label on Dec 17, 2024
@Siqi-Shan
Member

Hey @zxkane, thanks for sharing your feedback and concerns. We will discuss and review them and evaluate whether this should be categorized as a feature request. We will update this issue once the next step is confirmed.

atierian added the feature-request label and removed the question label on Dec 19, 2024
@atierian
Member

> it's not trivial to grant additional IAM permissions to the role.

Agreed. It also requires working out from the Amazon Bedrock documentation which IAM permissions are needed. We want to offer a more seamless experience for setting up cross-region inference with AI kit.
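For reference, here is a rough sketch of what those permissions might look like as a plain IAM statement. It is an illustration, not the AI kit's implementation: `buildCrossRegionStatement`, its parameters, the account ID, and the list of destination regions are all hypothetical, and the regions a given inference profile actually routes to should be checked in the Bedrock documentation.

```typescript
// Sketch only: the shape of an IAM statement for cross-region inference.
// buildCrossRegionStatement and every value passed to it are hypothetical.
interface PolicyStatementJson {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
}

function buildCrossRegionStatement(
  sourceRegion: string,
  accountId: string,
  modelId: string,
  inferenceProfileId: string,
  destinationRegions: string[],
): PolicyStatementJson {
  return {
    Effect: 'Allow',
    Action: ['bedrock:InvokeModel', 'bedrock:InvokeModelWithResponseStream'],
    Resource: [
      // The inference profile ARN is scoped to the calling account and region.
      `arn:aws:bedrock:${sourceRegion}:${accountId}:inference-profile/${inferenceProfileId}`,
      // Foundation model ARNs have an empty account segment; one per destination region.
      ...destinationRegions.map(
        (region) => `arn:aws:bedrock:${region}::foundation-model/${modelId}`,
      ),
    ],
  };
}

// Example values are placeholders, not a recommendation.
const statement = buildCrossRegionStatement(
  'us-east-1',
  '123456789012',
  'anthropic.claude-3-5-sonnet-20240620-v1:0',
  'us.anthropic.claude-3-5-sonnet-20240620-v1:0',
  ['us-east-1', 'us-west-2'],
);
console.log(JSON.stringify(statement, null, 2));
```

The point of the sketch is that two kinds of resources are involved: the inference profile in the region where the call is made, and the underlying foundation model in each region the profile may route to.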

Thanks for the feature request! We'll update this issue with any news.

atierian removed the pending-maintainer-response label on Dec 19, 2024
@zxkane
Author

zxkane commented Dec 20, 2024

Below is a code snippet showing how I hacked both generation and conversation to use cross-region inference. Note that the stack names and resource names are specific to my app:

import { IFunction } from 'aws-cdk-lib/aws-lambda';
import { IRole, Policy, PolicyStatement } from 'aws-cdk-lib/aws-iam';

// `backend`, CROSS_REGION_INFERENCE, CUSTOM_MODEL_ID, getCurrentRegion, and
// getCrossRegionModelId are defined elsewhere in my app and are not shown here.

// Allows invoking the foundation model in any region, plus the inference
// profile in the current region and account.
function createBedrockPolicyStatement(currentRegion: string, accountId: string, modelId: string, crossRegionModel: string) {
  return new PolicyStatement({
    resources: [
      `arn:aws:bedrock:*::foundation-model/${modelId}`,
      `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegionModel}`,
    ],
    actions: ['bedrock:InvokeModel*'],
  });
}

if (CROSS_REGION_INFERENCE && CUSTOM_MODEL_ID) {
  const currentRegion = getCurrentRegion(backend.stack);
  const crossRegionModel = getCrossRegionModelId(currentRegion, CUSTOM_MODEL_ID);

  // [chat conversation]
  // Find the conversation handler Lambda inside the nested stack and extend its role.
  const chatStack = backend.data.resources.nestedStacks?.['ChatConversationDirectiveLambdaStack'];
  if (chatStack) {
    const conversationFunc = chatStack.node.findAll()
      .find(child => child.node.id === 'conversationHandlerFunction') as IFunction | undefined;

    if (conversationFunc) {
      conversationFunc.addToRolePolicy(
        createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
      );
    }
  }

  // [insights generation]
  // The generation route uses a data source role rather than a Lambda function.
  const insightsStack = backend.data.resources.nestedStacks?.['GenerationBedrockDataSourceGenerateInsightsStack'];
  if (insightsStack) {
    // tryFindChild returns undefined when the child is missing; findChild would throw,
    // which would make the guard below unreachable.
    const dataSourceRole = insightsStack.node
      .tryFindChild('GenerationBedrockDataSourceGenerateInsightsIAMRole') as IRole | undefined;
    if (dataSourceRole) {
      dataSourceRole.attachInlinePolicy(
        new Policy(insightsStack, 'CrossRegionInferencePolicy', {
          statements: [
            createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
          ],
        }),
      );
    }
  }
}

I don't think this is easy for full-stack developers without strong CDK knowledge.
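The snippet above calls `getCrossRegionModelId` without showing it. A minimal sketch of what such a helper might look like follows. It assumes that system-defined inference profile IDs are the model ID prefixed with a geography code such as `us`, `eu`, or `apac`. The mapping table is a hypothetical illustration and not necessarily the helper used in the app: it is incomplete (GovCloud regions, for instance, are not handled correctly) and should be checked against the Bedrock documentation for the model in question.

```typescript
// Hypothetical helper; the mapping below is an assumption, not an official list.
// Each entry maps a region-name prefix to an inference profile geography code.
const GEO_PREFIXES: ReadonlyArray<readonly [string, string]> = [
  ['us-', 'us'],
  ['eu-', 'eu'],
  ['ap-', 'apac'],
];

function getCrossRegionModelId(region: string, modelId: string): string {
  const match = GEO_PREFIXES.find(([regionPrefix]) => region.startsWith(regionPrefix));
  if (!match) {
    // Fail loudly rather than guess a profile ID for an unmapped region.
    throw new Error(`No known cross-region inference geography for region "${region}"`);
  }
  return `${match[1]}.${modelId}`;
}

const usProfile = getCrossRegionModelId('us-west-2', 'anthropic.claude-3-5-sonnet-20240620-v1:0');
const euProfile = getCrossRegionModelId('eu-central-1', 'anthropic.claude-3-5-sonnet-20240620-v1:0');
console.log(usProfile);
console.log(euProfile);
```

One caveat with this approach: in an environment-agnostic CDK stack the region can be an unresolved token at synth time, in which case plain string matching like this would not find a prefix. That is part of the extra logic a built-in feature would need to handle.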


4 participants