feat(glue): add ExternalTable for use with connections #24753

Merged
merged 54 commits into from
Sep 11, 2023
54 commits
012df48
addition: connection name
Rizxcviii Mar 22, 2023
cfed556
addition: custom location of data
Rizxcviii Mar 22, 2023
13dcaa8
adding connection integ
Rizxcviii Mar 22, 2023
5be4ffc
addition: README
Rizxcviii Mar 23, 2023
2172a27
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Apr 3, 2023
93923a5
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Apr 13, 2023
6f337a5
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Apr 13, 2023
6581bf4
integ
Rizxcviii Apr 14, 2023
37c4965
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Jul 4, 2023
00593b7
updating README to combine 2 examples into one
Rizxcviii Jul 4, 2023
431fbca
grant behaviour incorrect
Rizxcviii Jul 13, 2023
3af0907
adding package by accident
Rizxcviii Jul 13, 2023
3e007ee
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Jul 13, 2023
bac46a6
removing undefined
Rizxcviii Jul 13, 2023
243a0e3
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Aug 1, 2023
501249d
integ test
Rizxcviii Aug 1, 2023
3a43fe8
deprecating table
Rizxcviii Aug 23, 2023
e083cff
deprecating table
Rizxcviii Aug 23, 2023
cc1a180
deprecating
Rizxcviii Aug 23, 2023
6aa651d
deprecating
Rizxcviii Aug 23, 2023
772649d
Table -> S3Table
Rizxcviii Aug 23, 2023
31c0d44
Table -> S3Table
Rizxcviii Aug 23, 2023
5a9b46f
deprecating Table constructu
Rizxcviii Aug 23, 2023
07cb52d
using S3 table
Rizxcviii Aug 23, 2023
d7ea017
new table
Rizxcviii Aug 23, 2023
ea081de
old table deprecated
Rizxcviii Aug 23, 2023
c4317b1
integ test
Rizxcviii Aug 23, 2023
130bd3c
bump
Rizxcviii Aug 24, 2023
261a9a8
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Aug 24, 2023
3fc36cd
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Aug 25, 2023
dc90f58
separating table tests
Rizxcviii Aug 25, 2023
b9ba3f2
moving table encryption to s3 tables only
Rizxcviii Aug 29, 2023
4348400
bug with table description
Rizxcviii Aug 29, 2023
634f717
removing encryption test
Rizxcviii Aug 29, 2023
e259fbd
removing encryption tests
Rizxcviii Aug 30, 2023
8fcc934
adding deprecated table tests
Rizxcviii Aug 30, 2023
eabfa71
updating README
Rizxcviii Aug 31, 2023
d5f762c
incorrect changes
Rizxcviii Sep 1, 2023
dab9f28
english...
Rizxcviii Sep 1, 2023
5abf222
generate -> get
Rizxcviii Sep 1, 2023
7d8a005
moving s3 declaration
Rizxcviii Sep 1, 2023
05cd0ce
using Table.*
Rizxcviii Sep 1, 2023
285de6c
keeping deprecated table the same
Rizxcviii Sep 1, 2023
a8b4948
Making a subclass of S3Table
Rizxcviii Sep 4, 2023
8062297
changing table name
Rizxcviii Sep 4, 2023
da2f878
reverting test file
Rizxcviii Sep 4, 2023
df4d288
integ test
Rizxcviii Sep 4, 2023
8e053a7
removing location prop, keeping this for separate PR
Rizxcviii Sep 6, 2023
4ebc18d
the english language
Rizxcviii Sep 6, 2023
693dae0
Merge branch 'main' into feature/glue-connection-for-tables
mrgrain Sep 11, 2023
f0c455b
splitting external table into own test file
Rizxcviii Sep 11, 2023
45ea77c
compat issues, TableProps is now used for deprecated table
Rizxcviii Sep 11, 2023
447acb9
integ test
Rizxcviii Sep 11, 2023
e12dee6
Merge branch 'main' into feature/glue-connection-for-tables
Rizxcviii Sep 11, 2023
50 changes: 36 additions & 14 deletions packages/@aws-cdk/aws-glue-alpha/README.md
@@ -211,7 +211,7 @@ A Glue table describes a table of data in S3: its structure (column names and ty

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
database: myDatabase,
columns: [{
name: 'col1',
@@ -230,7 +230,7 @@ By default, a S3 bucket will be created to store the table's data but you can ma
```ts
declare const myBucket: s3.Bucket;
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
bucket: myBucket,
s3Prefix: 'my-table/',
// ...
@@ -247,7 +247,7 @@ Glue tables can be configured to contain user-defined properties, to describe th

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
storageParameters: [
glue.StorageParameter.skipHeaderLineCount(1),
glue.StorageParameter.compressionType(glue.CompressionType.GZIP),
@@ -269,7 +269,7 @@ To improve query performance, a table can specify `partitionKeys` on which data

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
database: myDatabase,
columns: [{
name: 'col1',
@@ -300,7 +300,7 @@ property:

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
database: myDatabase,
columns: [{
name: 'col1',
@@ -337,7 +337,7 @@ If you have a table with a large number of partitions that grows over time, cons

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
database: myDatabase,
columns: [{
name: 'col1',
@@ -355,6 +355,28 @@ new glue.Table(this, 'MyTable', {
});
```

### Glue Connections

Glue connections allow external data connections to third-party databases and data warehouses. These connections can also be assigned to Glue tables, allowing you to query external data sources using the Glue Data Catalog.

Whereas `S3Table` will point to (and if needed, create) a bucket to store the table's data, `ExternalTable` will point to an existing table in a data source. For example, to create a table in Glue that points to a table in Redshift:

```ts
declare const myConnection: glue.Connection;
declare const myDatabase: glue.Database;
new glue.ExternalTable(this, 'MyTable', {
connection: myConnection,
externalDataLocation: 'default_db_public_example', // A table in Redshift
// ...
database: myDatabase,
columns: [{
name: 'col1',
type: glue.Schema.STRING,
}],
dataFormat: glue.DataFormat.JSON,
});
```

## [Encryption](https://docs.aws.amazon.com/athena/latest/ug/encryption.html)

You can enable encryption on a Table's data:
@@ -363,7 +385,7 @@ You can enable encryption on a Table's data:

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.S3_MANAGED,
// ...
database: myDatabase,
@@ -380,7 +402,7 @@ new glue.Table(this, 'MyTable', {
```ts
declare const myDatabase: glue.Database;
// KMS key is created automatically
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.KMS,
// ...
database: myDatabase,
@@ -392,7 +414,7 @@ new glue.Table(this, 'MyTable', {
});

// with an explicit KMS key
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.KMS,
encryptionKey: new kms.Key(this, 'MyKey'),
// ...
@@ -409,7 +431,7 @@ new glue.Table(this, 'MyTable', {

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.KMS_MANAGED,
// ...
database: myDatabase,
@@ -426,7 +448,7 @@ new glue.Table(this, 'MyTable', {
```ts
declare const myDatabase: glue.Database;
// KMS key is created automatically
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.CLIENT_SIDE_KMS,
// ...
database: myDatabase,
@@ -438,7 +460,7 @@ new glue.Table(this, 'MyTable', {
});

// with an explicit KMS key
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
encryption: glue.TableEncryption.CLIENT_SIDE_KMS,
encryptionKey: new kms.Key(this, 'MyKey'),
// ...
@@ -451,15 +473,15 @@ new glue.Table(this, 'MyTable', {
});
```

*Note: you cannot provide a `Bucket` when creating the `Table` if you wish to use server-side encryption (`KMS`, `KMS_MANAGED` or `S3_MANAGED`)*.
*Note: you cannot provide a `Bucket` when creating the `S3Table` if you wish to use server-side encryption (`KMS`, `KMS_MANAGED` or `S3_MANAGED`)*.

## Types

A table's schema is a collection of columns, each of which has a `name` and a `type`. Types are recursive structures, consisting of primitive and complex types:

```ts
declare const myDatabase: glue.Database;
new glue.Table(this, 'MyTable', {
new glue.S3Table(this, 'MyTable', {
columns: [{
name: 'primitive_column',
type: glue.Schema.STRING,
171 changes: 171 additions & 0 deletions packages/@aws-cdk/aws-glue-alpha/lib/external-table.ts
@@ -0,0 +1,171 @@
import { CfnTable } from 'aws-cdk-lib/aws-glue';
import * as iam from 'aws-cdk-lib/aws-iam';
import { Construct } from 'constructs';
import { IConnection } from './connection';
import { Column } from './schema';
import { PartitionIndex, TableBase, TableBaseProps } from './table-base';

export interface ExternalTableProps extends TableBaseProps {
  /**
   * The connection the table will use when performing reads and writes.
   *
   * @default - No connection
   */
  readonly connection: IConnection;

  /**
   * The data source location of the Glue table (e.g. `default_db_public_example` for Redshift).
   *
   * If this property is set, it will override both `bucket` and `s3Prefix`.
   *
   * @default - No outsourced data source location
   */
  readonly externalDataLocation: string;
}

/**
 * A Glue table that targets an external data location (e.g. a table in a Redshift cluster).
 */
export class ExternalTable extends TableBase {
  /**
   * Name of this table.
   */
  public readonly tableName: string;

  /**
   * ARN of this table.
   */
  public readonly tableArn: string;

  /**
   * The connection associated with this table.
   */
  public readonly connection: IConnection;

  /**
   * This table's partition indexes.
   */
  public readonly partitionIndexes?: PartitionIndex[];

  protected readonly tableResource: CfnTable;

  constructor(scope: Construct, id: string, props: ExternalTableProps) {
    super(scope, id, props);
    this.connection = props.connection;
    this.tableResource = new CfnTable(this, 'Table', {
      catalogId: props.database.catalogId,

      databaseName: props.database.databaseName,

      tableInput: {
        name: this.physicalName,
        description: props.description || `${this.physicalName} generated by CDK`,

        partitionKeys: renderColumns(props.partitionKeys),

        parameters: {
          'classification': props.dataFormat.classificationString?.value,
          'has_encrypted_data': true,
          'partition_filtering.enabled': props.enablePartitionFiltering,
          'connectionName': props.connection.connectionName,
        },
        storageDescriptor: {
          location: props.externalDataLocation,
          compressed: this.compressed,
          storedAsSubDirectories: props.storedAsSubDirectories ?? false,
          columns: renderColumns(props.columns),
          inputFormat: props.dataFormat.inputFormat.className,
          outputFormat: props.dataFormat.outputFormat.className,
          serdeInfo: {
            serializationLibrary: props.dataFormat.serializationLibrary.className,
          },
          // Fold the storage parameters into a key/value map, rejecting duplicate keys.
          parameters: props.storageParameters ? props.storageParameters.reduce((acc, param) => {
            if (param.key in acc) {
              throw new Error(`Duplicate storage parameter key: ${param.key}`);
            }
            const key = param.key;
            acc[key] = param.value;
            return acc;
          }, {} as { [key: string]: string }) : undefined,
        },

        tableType: 'EXTERNAL_TABLE',
      },
    });

    this.tableName = this.getResourceNameAttribute(this.tableResource.ref);
    this.tableArn = this.stack.formatArn({
      service: 'glue',
      resource: 'table',
      resourceName: `${this.database.databaseName}/${this.tableName}`,
    });
    this.node.defaultChild = this.tableResource;

    // Partition index creation relies on created table.
    if (props.partitionIndexes) {
      this.partitionIndexes = props.partitionIndexes;
      this.partitionIndexes.forEach((index) => this.addPartitionIndex(index));
    }
  }

  /**
   * Grant read permissions to the table
   *
   * @param grantee the principal
   */
  public grantRead(grantee: iam.IGrantable): iam.Grant {
    const ret = this.grant(grantee, readPermissions);
    return ret;
  }

  /**
   * Grant write permissions to the table
   *
   * @param grantee the principal
   */
  public grantWrite(grantee: iam.IGrantable): iam.Grant {
    const ret = this.grant(grantee, writePermissions);
    return ret;
  }

  /**
   * Grant read and write permissions to the table
   *
   * @param grantee the principal
   */
  public grantReadWrite(grantee: iam.IGrantable): iam.Grant {
    const ret = this.grant(grantee, [...readPermissions, ...writePermissions]);
    return ret;
  }
}

const readPermissions = [
  'glue:BatchGetPartition',
  'glue:GetPartition',
  'glue:GetPartitions',
  'glue:GetTable',
  'glue:GetTables',
  'glue:GetTableVersion',
  'glue:GetTableVersions',
];

const writePermissions = [
  'glue:BatchCreatePartition',
  'glue:BatchDeletePartition',
  'glue:CreatePartition',
  'glue:DeletePartition',
  'glue:UpdatePartition',
];

function renderColumns(columns?: Column[]) {
  if (columns === undefined) {
    return undefined;
  }
  return columns.map(column => {
    return {
      name: column.name,
      type: column.type.inputString,
      comment: column.comment,
    };
  });
}
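
The `parameters` entry of `storageDescriptor` in the constructor above folds the optional `storageParameters` list into a plain key/value map and rejects duplicate keys. A minimal standalone sketch of that step follows. It assumes a hypothetical `KeyValueParam` interface in place of `glue.StorageParameter` and a hypothetical helper name `foldStorageParameters`; neither is part of the PR. Unlike the `in` check in the diff, the sketch tests own properties only, so a key such as `constructor` is not mistaken for a duplicate.

```typescript
// Hypothetical stand-in for glue.StorageParameter: only the two fields the
// constructor reads are modelled here.
interface KeyValueParam {
  readonly key: string;
  readonly value: string;
}

// Hypothetical helper (not part of the PR): folds a list of parameters into a
// key/value map and throws on the first duplicate key.
function foldStorageParameters(params?: KeyValueParam[]): Record<string, string> | undefined {
  if (params === undefined) {
    // Mirrors the diff: no storage parameters means no `parameters` entry.
    return undefined;
  }
  const acc: Record<string, string> = {};
  for (const param of params) {
    // Own-property check, so inherited names like `constructor` are accepted.
    if (Object.prototype.hasOwnProperty.call(acc, param.key)) {
      throw new Error(`Duplicate storage parameter key: ${param.key}`);
    }
    acc[param.key] = param.value;
  }
  return acc;
}
```

Failing at synthesis time in this way surfaces a conflicting parameter before any CloudFormation template is produced, rather than letting the later value silently win.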
5 changes: 4 additions & 1 deletion packages/@aws-cdk/aws-glue-alpha/lib/index.ts
@@ -5,9 +5,12 @@ export * from './connection';
export * from './data-format';
export * from './data-quality-ruleset';
export * from './database';
export * from './external-table';
export * from './job';
export * from './job-executable';
export * from './s3-table';
export * from './schema';
export * from './security-configuration';
export * from './storage-parameter';
export * from './table';
export * from './table-base';
export * from './table-deprecated';
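
The three grant methods in `external-table.ts` differ only in the list of Glue actions passed to `this.grant`. The sketch below copies the two action lists from the diff and shows how they combine. The `Access` type and the `actionsFor` helper are hypothetical names introduced for illustration and do not exist in the library.

```typescript
// Action lists copied from external-table.ts in this PR.
const readPermissions: string[] = [
  'glue:BatchGetPartition',
  'glue:GetPartition',
  'glue:GetPartitions',
  'glue:GetTable',
  'glue:GetTables',
  'glue:GetTableVersion',
  'glue:GetTableVersions',
];

const writePermissions: string[] = [
  'glue:BatchCreatePartition',
  'glue:BatchDeletePartition',
  'glue:CreatePartition',
  'glue:DeletePartition',
  'glue:UpdatePartition',
];

// Hypothetical access levels, one per grant method.
type Access = 'read' | 'write' | 'readWrite';

// Hypothetical helper: returns the actions that grantRead, grantWrite, or
// grantReadWrite would pass to this.grant for the given access level.
function actionsFor(access: Access): string[] {
  switch (access) {
    case 'read':
      return [...readPermissions];
    case 'write':
      return [...writePermissions];
    case 'readWrite':
      return [...readPermissions, ...writePermissions];
  }
}
```

With these lists, write access covers partition mutations only and does not include `glue:GetTable`, so a principal that also needs to read table metadata would presumably be given `grantReadWrite` instead of `grantWrite`.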