This repository demonstrates how to build real-time generative AI applications using Azure Event Hubs, Azure OpenAI, Pathway’s LLM App, and Streamlit.
- Data streaming for real-time enterprise AI apps
A real-time AI app needs real-time data to answer user queries with the most up-to-date information or to perform quick actions autonomously. To reduce cost and infrastructure complexity, you can build a real-time data pipeline with Azure Event Hubs, Pathway, and Azure OpenAI. This integrated system combines Pathway for robust data processing, LLMs like GPT for advanced text analytics, and Streamlit for user-friendly data visualization.
This combination empowers businesses to build and deploy enterprise AI applications that provide the freshest contextual visual data.
For example, a multinational corporation wants to improve its customer support by analyzing customer feedback and inquiries in real-time. They aim to understand common issues, track customer sentiment, and identify areas for improvement in their products and services. To achieve this, they need a system that can process large data streams, analyze text for insights, and present these insights in an accessible way.
- Azure Event Hubs & Kafka: Real-Time Data Streaming and Processing
Azure Event Hubs collects real-time data from various sources, such as customer feedback forms, support chat logs, and social media mentions. This data is then streamed into a Kafka cluster for further processing.
- Large Language Models (LLMs) like GPT from Azure OpenAI: Text Analysis and Sentiment Detection
The text data from Kafka is fed into an LLM for natural language processing using Pathway. This model performs sentiment analysis, key phrase extraction, and feedback categorization (e.g., identifying common issues or topics).
- Pathway to enable real-time data pipeline
Pathway connects to the data streams from Azure Event Hubs and preprocesses, transforms, or joins them. The LLM App then brings real-time context to the AI app with real-time vector indexing, semantic search, and retrieval capabilities. The text content of the events is sent to the Azure OpenAI embedding APIs via the LLM App to compute the embeddings, and the resulting vector representations are indexed.
Using the LLM App, the company can gain deep insights from unstructured text data, understanding the sentiment and nuances of customer feedback.
- Streamlit: Interactive Dashboard for Visualization
Streamlit is used to create an interactive web dashboard that visualizes the insights derived from customer feedback. This dashboard can show real-time metrics such as overall sentiment trends and common topics in customer feedback, and can even alert the team to emerging issues (see the example implementation of alerting to enhance this project).
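Azure Event Hubs exposes a Kafka-compatible endpoint on port 9093, which is one way a Kafka-style client can read the event stream described above. The following is a minimal sketch, not code taken from this repository: the helper name `eventhubs_kafka_settings` and the `group_id` default are hypothetical, and the function only derives librdkafka-style client settings from a namespace connection string.

```python
def eventhubs_kafka_settings(connection_string: str,
                             group_id: str = "feedback-dashboard") -> dict:
    """Build librdkafka-style settings for the Event Hubs Kafka endpoint.

    Hypothetical helper for illustration only; the group_id default is a
    placeholder, not a value used by this project.
    """
    connection_string = connection_string.strip()
    # A namespace connection string is a ';'-separated list of key=value pairs,
    # starting with "Endpoint=sb://<namespace>.servicebus.windows.net/".
    parts = dict(
        item.split("=", 1) for item in connection_string.split(";") if item
    )
    # Drop the "sb://" scheme and the trailing slash to get the host name.
    host = parts["Endpoint"].split("://", 1)[1].strip("/")
    return {
        "bootstrap.servers": f"{host}:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # Event Hubs expects this literal user name; the password is the
        # full connection string.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
        "group.id": group_id,
    }
```

A dictionary like this could be passed to a Kafka consumer or connector; whether the backend in this repository configures its connection this way is an assumption.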
Service | Purpose |
---|---|
Azure AI Services | To use Azure OpenAI GPT model and embeddings. |
Azure Event Hubs | To stream real-time events from various data sources. |
Azure Container Apps | Hosts our containerized applications (backend and frontend) with features like auto-scaling and load balancing. |
Azure Container Registry | Stores our Docker container images in a managed, private registry. |
Azure Log Analytics | Collects and analyzes telemetry and logs for insights into application performance and diagnostics. |
Azure Monitor | Provides comprehensive monitoring of our applications, infrastructure, and network. |
Follow the link to see the running UI app in Azure:
Customer support and sentiment analysis dashboard
It builds a real-time dashboard based on an example prompt we provided to analyze the data.
To set up the project, follow these steps:
- Make sure you have an Azure account with the required settings specified in the Prerequisites section.
- Choose one of these environments to open the project:
- Follow the deploy from scratch or deploy with existing Azure resources guide.
Azure account requirements: To run and deploy the example project, you'll need:
- Azure account. If you're new to Azure, get an Azure account for free and you'll get some free Azure credits to get started.
- Azure subscription with access enabled for the Azure OpenAI service. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access.
Follow these steps to open the project in a Codespace:
- Click here to open in GitHub Codespaces
- Click here to open in Dev Container
First, install the required tools:
- Azure Developer CLI
- Python 3.9, 3.10, or 3.11
  - Important: Ensure you can run `python --version` from the console. On Ubuntu, you might need to run `sudo apt install python-is-python3` to link `python` to `python3`.
- Git
- Install WSL - For Windows users only.
- Powershell 7+ (pwsh) - For Windows users only.
  - Important: Ensure you can run `pwsh.exe` from a PowerShell terminal. If this fails, you likely need to upgrade PowerShell.
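If you want to confirm that the interpreter on your path matches the Python versions listed above, a small check like the following may help. It is a sketch rather than part of the project; `is_supported` and `SUPPORTED_MINORS` are names introduced here for illustration.

```python
import sys

# Minor versions listed in the tool requirements above: 3.9, 3.10, 3.11.
SUPPORTED_MINORS = {9, 10, 11}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) tuple is a listed version."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    status = "supported" if is_supported() else "not in the supported list"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```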
Then bring down the project code:
- Open a terminal.
- Run `azd auth login` and log in using your Azure account credentials.
- Run `azd init -t https://github.com/pathway-labs/azure-openai-real-time-data-app`. This command initializes a git repository, so you do not need to clone this repository.
- When the project starts, the system prompts you to enter a new environment name: `AZURE_ENV_NAME`. Read more about managing environment variables. You can use any name, for example `pathway`. Outputs from infrastructure provisioning are automatically stored as environment variables in an `.env` file, located under `.azure/pathway/.env` in the project folder.
- Then, follow the deploying from scratch guide.
- Open the project in GitHub Codespaces, VS Code Dev Containers, or Local environment.
- Deploy from scratch or deploy with existing Azure resources.
- Copy the `.env` file located at `.azure/<environment name>/.env` to a new `.env` file in the project root folder, where the `README.md` file is.
- Install the required packages: `pip install --upgrade -r requirements_dev.txt`
- Navigate to the `/app/frontend` folder: `cd /app/frontend`.
- Run the UI app with the `streamlit run app.py` command. The frontend app automatically uses the backend API deployed in Azure.
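The copy step above can also be scripted. The sketch below assumes the layout described in this guide (`.azure/<environment name>/.env` inside the project folder); `copy_azd_env` is a hypothetical helper name and is not shipped with the project.

```python
import shutil
from pathlib import Path


def copy_azd_env(project_root: str, env_name: str) -> Path:
    """Copy .azure/<env_name>/.env to <project_root>/.env and return the target.

    Hypothetical helper for illustration; it overwrites an existing root .env.
    """
    root = Path(project_root)
    source = root / ".azure" / env_name / ".env"
    if not source.is_file():
        raise FileNotFoundError(f"No azd environment file at {source}")
    target = root / ".env"
    shutil.copyfile(source, target)
    return target
```

For example, `copy_azd_env(".", "pathway")` run from the project root would copy the file for an environment named `pathway`.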
If you don't have any pre-existing Azure services and want to start from a fresh deployment, execute the following commands.
- Open a terminal.
- Run `azd up`. This will provision Azure resources and deploy the sample project to those resources. We're using Bicep, a language that simplifies defining ARM templates and configuring Azure resources.
- Keep `EVENT_HUBS_NAMESPACE_CONNECTION_STRING` and `AZURE_OPENAI_API_KEY` empty. You will assign them after the first successful deployment.
After the application has been successfully deployed you will see URLs for both backend and frontend apps printed to the console.
NOTE: It may take 5-10 minutes for the application to be fully deployed.
- After the first deployment, set the environment variable values for `EVENT_HUBS_NAMESPACE_CONNECTION_STRING` and `AZURE_OPENAI_API_KEY` by running the commands below. See how to retrieve Azure OpenAI API Key and Event Hubs connection string. You can also manually set these values from the Azure portal and skip Step 7.
azd env set AZURE_OPENAI_API_KEY {Azure OpenAI API Key}
azd env set EVENT_HUBS_NAMESPACE_CONNECTION_STRING {Azure Event Hubs Namespace Connection String}
- Run `azd deploy` to update these values in the Azure Container App. The Pathway LLM App backend uses these environment variables. Other variables will be filled automatically.
- Follow the link generated in the terminal for the frontend app in Azure Container Apps and start using the app. The app ingests data from Azure Event Hubs. Learn how to send events using Azure Event Hubs Data Generator.
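Because the backend depends on the two variables set in the steps above, it can be useful to verify that they are non-empty before redeploying or running anything locally. This is a minimal sketch under that assumption; `missing_settings` and `REQUIRED_SETTINGS` are illustrative names, not part of the project.

```python
import os

# The two values this guide asks you to set after the first deployment.
REQUIRED_SETTINGS = (
    "EVENT_HUBS_NAMESPACE_CONNECTION_STRING",
    "AZURE_OPENAI_API_KEY",
)


def missing_settings(env=None) -> list:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```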
If you already have existing Azure resources, you can re-use them by setting `azd` environment values.
- Run `azd env set AZURE_RESOURCE_GROUP {Name of existing resource group}`
- Run `azd env set AZURE_LOCATION {Location of existing resource group}`
- Run `azd env set AZURE_OPENAI_SERVICE {Name of existing OpenAI service}`
- Run `azd env set AZURE_OPENAI_RESOURCE_GROUP {Name of existing resource group that OpenAI service is provisioned to}`
- Run `azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT {Name of existing ChatGPT deployment}`. Only needed if your ChatGPT deployment is not the default 'chat'.
- Run `azd env set AZURE_OPENAI_EMB_DEPLOYMENT {Name of existing GPT embedding deployment}`. Only needed if your embedding deployment is not the default 'embedding'.
When you run `azd up` afterwards and are prompted to select a value for `openAiResourceGroupLocation`, make sure to select the same location as the existing OpenAI resource group.
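If you re-use existing resources often, you could generate the `azd env set` commands listed above from a single mapping. The sketch below only builds the command strings and does not run `azd`; `azd_env_set_commands` is a hypothetical helper, the resource names in the usage example are placeholders, and the quoting assumes a POSIX shell.

```python
import shlex


def azd_env_set_commands(values: dict) -> list:
    """Return one `azd env set NAME VALUE` command per non-empty value.

    Hypothetical helper for illustration; empty values are skipped so that
    optional settings keep their defaults.
    """
    return [
        f"azd env set {name} {shlex.quote(value)}"
        for name, value in values.items()
        if value
    ]


if __name__ == "__main__":
    # Placeholder names; replace them with your own resources.
    for command in azd_env_set_commands({
        "AZURE_RESOURCE_GROUP": "my-resource-group",
        "AZURE_LOCATION": "eastus",
        "AZURE_OPENAI_CHATGPT_DEPLOYMENT": "",  # left empty: default is used
    }):
        print(command)
```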