GitHub - rabbidave/Denzel-Crocker-Hunting-For-Fairly-Odd-Prompts: A serverless set of functions for evaluating whether incoming messages to an LLM system seem to contain instances of prompt injection; uses cascading cosine similarity and ROUGLE-L calculation against known good and bad prompts

♫ The Dream of the 90's ♫ is alive in Portland "a weird suite of Enterprise LLM tools" named after Nicktoons

by some dude in his 30s

Utility 3) Denzel Crocker: Serverless & Event Driven inspection of messages for Prompt Injection; for use with Language Models

Description:

A set of serverless functions designed to assist in the monitoring of inputs to language models, specifically inspection of messages for prompt injection and subsequent routing of messages to the appropriate SQS bus

Rationale:

Large Language Models are subject to various forms of prompt injection (indirect or otherwise); lightweight and step-wise alerting of similar prompts compared to a baseline help your application stay secure
User experience, instrumentation, and metadata capture are crucial to the adoption of LLMs for orchestration of multi-modal agentic systems; a high cosine similarity (with known bad prompts) paired with a low rouge-L (for known good prompts) allows for appropriate routing of messages

Intent:

The intent of this FAIRIES.py is to efficiently spin up, calculate needed values for evaluation, and inspect each message for prompt injection attacks; thereafter routing messages to the appropriate SQS bus (e.g. for building a master prompt, alerting, further inspection, etc)

The goal being to detect if the message has high similarity with known bad prompts and low similarity with known good prompts; via cascading cosine similarity and ROUGE-L calculation.

The ROUGE-L value is calculated intially from the baseline, and stored in memory. ROUGE-L is calculated for incoming messages only after comparing the cosine similarity of new messages in the dataframe to known bad prompts; when complete the function spins down appropriately.

The cosine similarity is used as a heuristic to detect similarity of inputs from incoming dataframes with known bad prompts (ostensibly to identify prompt injection), and the ROUGE-L score is used to more precisely compare the inputs with a baseline dataset of known good prompts; as a means of validating the assumption of the first function.

Based on the resultant calculations messages are routed to the appropriate SQS bus. Makes use of an open-source good/bad prompt dataset available on huggingface

Note: Needs logging and additional error-handling; this is mostly conceptual and assumes the use of environment variables rather than hard-coded values for cosine similarity & ROUGE-L

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
ConvertCSVtoParquet.py		ConvertCSVtoParquet.py
FAIRIES.py		FAIRIES.py
LICENSE		LICENSE
README.md		README.md
acceptable_prompts.csv		acceptable_prompts.csv
acceptable_prompts.parquet		acceptable_prompts.parquet
bad_prompts.csv		bad_prompts.csv
bad_prompts.parquet		bad_prompts.parquet

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

♫ The Dream of the 90's ♫ is alive in Portland "a weird suite of Enterprise LLM tools" named after Nicktoons

by some dude in his 30s

Utility 3) Denzel Crocker: Serverless & Event Driven inspection of messages for Prompt Injection; for use with Language Models

Description:

Rationale:

Intent:

Note: Needs logging and additional error-handling; this is mostly conceptual and assumes the use of environment variables rather than hard-coded values for cosine similarity & ROUGE-L

About

Releases

Packages

Languages

License

rabbidave/Denzel-Crocker-Hunting-For-Fairly-Odd-Prompts

Folders and files

Latest commit

History

Repository files navigation

♫ The Dream of the 90's ♫ is alive in Portland "a weird suite of Enterprise LLM tools" named after Nicktoons

by some dude in his 30s

Utility 3) Denzel Crocker: Serverless & Event Driven inspection of messages for Prompt Injection; for use with Language Models

Description:

Rationale:

Intent:

Note: Needs logging and additional error-handling; this is mostly conceptual and assumes the use of environment variables rather than hard-coded values for cosine similarity & ROUGE-L

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages