ec2: Creating an InterfaceVpcEndpoint fails with VPC Endpoint Service in another account #28851

Open
mmieluch opened this issue Jan 24, 2024 · 3 comments
Labels
@aws-cdk/aws-ec2 Related to Amazon Elastic Compute Cloud bug This issue is a bug. effort/medium Medium work item – several days of effort p2

Comments

@mmieluch

Describe the bug

I have the main account and a sub-account (organization).

In the main account I have:

  • an RDS database with a proxy;
  • a VPC endpoint service exposing a connection to the proxy. The service is visible from the CLI when ran with a profile for this account.

In the sub-account, I have a network stack setting up a VPC. In the same stack, I'm trying to create an interface endpoint for the service exposed in the main account. My sub-account role is added to the service as an allowed principal. I am able to create the endpoint both from the console and from the CLI.

Expected Behavior

The deployment should not fail; instead, it should create a new InterfaceVpcEndpoint pointing at the provided InterfaceVpcEndpointService.

Current Behavior

When trying to create the endpoint using CDK, the deployment fails with (real service name changed):

[Error at /MyStack/RdsAuroraProdProxyEndpoint] The Vpc Endpoint Service 'com.amazonaws.vpce.eu-west-1.vpce-svc-abc123' does not exist

However this CLI command succeeds (service IDs changed):

aws --profile distribution-dev ec2 create-vpc-endpoint --vpc-endpoint-type Interface --vpc-id vpc-abc123 --service-name com.amazonaws.vpce.eu-west-1.vpce-svc-abc123 --subnet-ids subnet-aaa123 subnet-bbb123 --ip-address-type ipv4 --no-private-dns-enabled
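Since both the error and the working CLI call hinge on the exact service name, a quick shape check can rule out a typo or a region mismatch before deploying. The following is only a sketch: `parseEndpointServiceName` is a hypothetical helper (not part of aws-cdk-lib), and it assumes the `com.amazonaws.vpce.<region>.vpce-svc-<id>` format seen in the names quoted in this issue.

```typescript
// Hypothetical helper: splits an endpoint service name into its region and
// service ID. Assumes the "com.amazonaws.vpce.<region>.vpce-svc-<id>" shape
// used by the names in this issue; other naming schemes are not handled.
interface EndpointServiceNameParts {
  readonly region: string;
  readonly serviceId: string;
}

function parseEndpointServiceName(name: string): EndpointServiceNameParts | undefined {
  const match = /^com\.amazonaws\.vpce\.([a-z0-9-]+)\.(vpce-svc-[0-9a-z]+)$/.exec(name);
  if (match === null) {
    return undefined; // Not in the assumed format.
  }
  return { region: match[1], serviceId: match[2] };
}

// Example: compare the region embedded in the name with the stack's region.
const parts = parseEndpointServiceName('com.amazonaws.vpce.eu-west-1.vpce-svc-abc123');
const stackRegion = 'eu-west-1'; // assumed value of props.env.region
if (parts !== undefined && parts.region !== stackRegion) {
  console.warn(`Service is in ${parts.region}, stack is in ${stackRegion}`);
}
```

A check like this only validates the name itself; it says nothing about whether the calling principal is allowed to see the service.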

Reproduction Steps

Failing CDK script:

import { Environment, Stack } from 'aws-cdk-lib';
import { InterfaceVpcEndpoint, InterfaceVpcEndpointService, IVpcEndpoint, SubnetType, Vpc } from 'aws-cdk-lib/aws-ec2';
import { Construct } from 'constructs';
import { VpcNetwork, VpcNetworkProps } from '../constructs/vpc-network';

interface NetworkStackProps {
  readonly env: Environment;
  readonly networkProps: VpcNetworkProps;
  readonly rdsEndpointServiceName: string;
}

export class NetworkStack extends Stack {
  readonly vpc: Vpc;
  readonly rdsEndpoint: IVpcEndpoint;

  constructor(scope: Construct, id: string, props: NetworkStackProps) {
    super(scope, id, {
      description: 'Network infrastructure',
      env: props.env,
    });
    const { vpc } = new VpcNetwork(this, 'Vpc', props.networkProps);
    this.vpc = vpc;

    this.rdsEndpoint = new InterfaceVpcEndpoint(this, 'RdsAuroraProdProxyEndpoint', {
      vpc: this.vpc,
      service: new InterfaceVpcEndpointService('com.amazonaws.vpce.eu-west-1.vpce-svc-0db5283ce0cd76edd'),
      subnets: this.vpc.selectSubnets({
        subnetType: SubnetType.PRIVATE_WITH_EGRESS,
      }),
      lookupSupportedAzs: true,
    });
  }
}

However this CLI call succeeded (I changed the VPC ID and subnet IDs here for safety):

aws --profile dev ec2 create-vpc-endpoint --vpc-endpoint-type Interface --vpc-id vpc-abc123 --service-name com.amazonaws.vpce.eu-west-1.vpce-svc-0db5283ce0cd76edd --subnet-ids subnet-aaa123 subnet-bbb123 --ip-address-type ipv4 --no-private-dns-enabled

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.121.1

Framework Version

No response

Node.js Version

v20.9.0

OS

Linux Manjaro 6.1.69-1-MANJARO x86_64 GNU/Linux

Language

TypeScript

Language Version

TypeScript 5.3.3

Other information

No response

@mmieluch mmieluch added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 24, 2024
@github-actions github-actions bot added the @aws-cdk/aws-ec2 Related to Amazon Elastic Compute Cloud label Jan 24, 2024
@mmieluch
Author

Turns out that CloudFormation performs VPC service name lookups using random role names, created and assigned during stack deployment. My service allowed only specific roles and users, and was hidden from every other role in the sub-account. I had to widen the pool of permitted principals in order for the construct to work for me.

So in the main account, the principals I added to the service had to be constructed like so:

arn:aws:iam::1234567890:root

to allow all roles and users defined within that specific sub-account.
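For anyone scripting this, here is a minimal sketch of building that account-root principal ARN. `accountRootArn` is a hypothetical helper name, not something from aws-cdk-lib, and the 12-digit check is an assumption based on the usual AWS account ID format (the ID shown above is a shortened placeholder).

```typescript
// Hypothetical helper, not part of aws-cdk-lib: builds the account-root
// principal ARN, which covers every role and user in the given account.
function accountRootArn(accountId: string, partition: string = 'aws'): string {
  // Account IDs are treated as 12-digit strings so leading zeros survive.
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error(`Expected a 12-digit account ID, got '${accountId}'`);
  }
  return `arn:${partition}:iam::${accountId}:root`;
}

// Example with a made-up sub-account ID.
const subAccountPrincipal = accountRootArn('123456789012');
console.log(subAccountPrincipal);
```

The resulting string is what would go into the endpoint service's list of allowed principals in the main account.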

So now I don't know whether the construct's behaviour is correct or not. From a practical point of view, I don't think that a service should have to broaden its permissions this much to become visible, but I don't see any way to provide a specific IAM role or user to the construct. Am I missing something here?

@pahud
Contributor

pahud commented Jan 29, 2024

Hi

Turns out that CloudFormation performs VPC service name lookups using random role names, created and assigned during stack deployment.

How did you figure that out? Can you share more details?

@pahud pahud added p2 effort/medium Medium work item – several days of effort response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Jan 29, 2024
@mmieluch
Author

Hi, in all honesty I'm not 100% sure, because I didn't find any specific logs from CF regarding the roles. I do have IDs of some failing requests, but I don't know where to look them up for any additional info. If you could please tell me where I can find some traces, or logs, or whatever else to help debug this, then I'd be happy to look it up.

My conclusion does, however, seem consistent with my findings in some other approaches I experimented with when trying to find my way around this issue. I don't remember which exactly yielded the randomized role name in the logs, but one of them did. I spent pretty much the whole day on this, so it's a bit of a haze...

Things I tried:

  1. Creating an instance of the lower-level CfnVPCEndpoint. It kept failing with the same error message as ec2.InterfaceVpcEndpoint had, so I assumed the error was being returned during a CF lookup call.
  2. Creating an AwsCustomResource and trying to create the endpoint through an API call.
  3. Creating a one-off EC2 nano-sized instance and attaching user data to it, so that it would issue a CLI command performing a service lookup through the AWS API. I think this is where I eventually SSH-ed into the running instance and ran the command manually, and the error pointed out that the role MyStackName-BlahBlahBlahRandomstuffhashetc didn't have sufficient permissions to perform the lookup call.

All roles for all approaches were created automatically by CF during the deployment phase, so that pointed me to a role name issue. Once I determined it may have been the problem, I widened the permissions to allow all users and roles from the specific sub-account, and voila! It finally worked.

Sorry if I'm rambling; as I said earlier, I tried so many random approaches that it's all a blur now...

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 29, 2024