episoder 🎙️🧠

Create pull requests for podcast websites using provided transcripts and GenAI.

Overview

This is an AWS SAM project that takes podcast transcripts generated by Podwhisperer and performs two steps:

Use an Amazon Bedrock LLM to summarise the episode and create draft YouTube chapters
Create a Pull Request for the podcast's website source code containing the summary and the transcript content

An example of a Pull Request generated by this project can be found here.

GenAI / LLM use

We currently use the Anthropic Claude V2 Large Language Model (LLM) in Amazon Bedrock.

System Overview

(Source for this diagram is in template.drawio in this directory)

Prerequisites

The project assumes that JSON transcripts are created upstream in an S3 Bucket by Podwhisperer.
You will need the following build tooling installed.

Node.js 18.x and NPM 8.x
AWS SAM, used to build and deploy most of the application
The AWS CLI
esbuild

By default, the target AWS account should have the SLIC Watch SAR Application installed. It can be installed by going to this page in the AWS Console. SLIC Watch is used to create alarms and dashboards for our transcription application. If you want to skip this option, just remove the single line referring to the SlicWatch-v2 macro from template.yaml.
You will need to go to the Amazon Bedrock Console and enable the Anthropic Claude V2 model in "Model Settings".
Enable access to the website repository for your podcast with an SSM SecureString parameter in your AWS account:

Parameter	Description	Example Value
`/episoder/gitHubUserCredentials`	Personal Access Token (PAT) for the GitHub repository	`username:github_pat_123AB...xyz`
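
A `username:token` value like the one above can be used for HTTPS Basic authentication when cloning and pushing. The following is a minimal sketch of that idea, assuming the parameter value has already been read and decrypted. `buildAuthenticatedRepoUrl` is a hypothetical helper for illustration and is not taken from this project's source.

```typescript
// Hypothetical helper: embeds `username:token` credentials in an HTTPS Git URL.
// Illustrative sketch only; the project's own Git handling may work differently.
function buildAuthenticatedRepoUrl(repoUrl: string, credentials: string): string {
  const prefix = "https://";
  if (repoUrl.indexOf(prefix) !== 0) {
    throw new Error("Expected an HTTPS repository URL");
  }
  // Split on the first colon only, in case the token itself contains a colon.
  const separator = credentials.indexOf(":");
  if (separator <= 0 || separator === credentials.length - 1) {
    throw new Error("Expected credentials in the form username:token");
  }
  // Percent-encode both parts so special characters cannot break the URL.
  const username = encodeURIComponent(credentials.slice(0, separator));
  const token = encodeURIComponent(credentials.slice(separator + 1));
  return `${prefix}${username}:${token}@${repoUrl.slice(prefix.length)}`;
}
```

A URL built this way contains a secret, so it should not be written to logs.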

Prompt Engineering

To test changes to the LLM prompt, you don't have to deploy. You can run summarise.ts with a path to a JSON transcript file; a sample transcript is provided. This script calls Bedrock, so you must have AWS credentials configured for an account where the model is enabled.

./bin/summarise.ts ./sample-transcripts/aws-bites-101.json

To tweak the prompt, edit lib/prompt-template.ts.
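
For orientation, here is a rough sketch of what a prompt template function for Claude V2 could look like. It is not the contents of lib/prompt-template.ts. The `Segment` shape, the `createPrompt` and `formatTimestamp` names, and the instruction wording are all assumptions made for illustration, and the real Podwhisperer JSON may be structured differently. The one fixed point is that Claude's text completion format expects the prompt to start with `\n\nHuman:` and end with `\n\nAssistant:`.

```typescript
// Assumed transcript segment shape (hypothetical; check the real transcript JSON).
interface Segment {
  speaker: string; // hypothetical field: speaker label
  start: number;   // hypothetical field: start time in seconds
  text: string;    // hypothetical field: spoken text
}

// Formats seconds as mm:ss, or h:mm:ss once the episode passes one hour.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return hours > 0
    ? `${hours}:${pad(minutes)}:${pad(seconds)}`
    : `${pad(minutes)}:${pad(seconds)}`;
}

// Builds a Claude text-completion prompt from transcript segments.
function createPrompt(segments: Segment[]): string {
  const lines = segments.map(
    (s) => `[${formatTimestamp(s.start)}] ${s.speaker}: ${s.text}`
  );
  const instruction =
    "Summarise the podcast episode transcript below and suggest YouTube chapters with timestamps.";
  return `\n\nHuman: ${instruction}\n<transcript>\n${lines.join("\n")}\n</transcript>\n\nAssistant:`;
}
```

Including timestamps on each line gives the model something concrete to anchor chapter markers to.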

Deployment

Using AWS SAM:

sam build --parallel
sam deploy --guided

You will be prompted for:

The S3 Bucket where transcripts are expected to arrive
The region to use for Bedrock, since Bedrock is currently only available in a limited number of regions
The email address and name to use for Git commits
The HTTPS URL of your website GitHub repository, e.g., https://github.com/awsbites/aws-bites-site.git

Once deployment has completed, you can check the Step Function that orchestrates the whole process in the AWS Console. This state machine is automatically executed when transcripts are placed in the processed-transcripts/ prefix.

Price Monitoring

Bedrock pricing can be difficult to estimate. This repo comes with a CloudWatch pricing dashboard that shows the cost for a given period and the relationship between invocations, input tokens and output tokens. The cost is calculated from the published on-demand pricing for the Claude V2 model as of 28 October 2023. A CloudWatch alarm is also created for the total cost per hour; by default it goes into alarm when the cost exceeds $1 per hour for three consecutive hours.
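
The arithmetic behind this kind of estimate can be sketched as follows. This is a simplified illustration and not the dashboard's actual metric math. The type and function names are hypothetical, and the per-1,000-token prices are passed in as parameters, so take real values from the current Bedrock pricing page.

```typescript
// USD price per 1,000 tokens. Values are supplied by the caller.
interface TokenPricing {
  inputPer1k: number;
  outputPer1k: number;
}

// Token counts for one period, e.g. one hour.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// On-demand cost: input and output tokens are priced separately.
function estimateCost(usage: TokenUsage, pricing: TokenPricing): number {
  return (
    (usage.inputTokens / 1000) * pricing.inputPer1k +
    (usage.outputTokens / 1000) * pricing.outputPer1k
  );
}

// Simplified model of the alarm: true when each of the most recent
// `periods` hourly costs exceeds the threshold.
function alarmBreached(
  hourlyCosts: number[],
  thresholdUsd: number = 1,
  periods: number = 3
): boolean {
  if (hourlyCosts.length < periods) {
    return false;
  }
  return hourlyCosts.slice(-periods).every((cost) => cost > thresholdUsd);
}
```

Because output tokens are typically priced higher than input tokens, a long summary can cost more than its length alone would suggest.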

The pricing dashboard can be deployed with CDK:

cd price-monitor
npm install
npx cdk deploy -c bedrockRegion=us-east-1
