Pubsub2Inbox

Pubsub2Inbox is a generic tool that takes input from Pub/Sub messages and turns it into emails, webhooks or GCS objects. It is based on an extensible framework consisting of input and output processors. Input processors can enrich the incoming messages with details (for example, fetching budget details from the Cloud Billing Budgets API). Multiple output processors can be chained together.

Pubsub2Inbox is written in Python 3.8+ and can easily be deployed as a Cloud Function. To guard credentials and other sensitive information, the tool can fetch its YAML configuration from Google Cloud Secret Manager.

The tool also supports templating emails, messages and other parameters through Jinja2.

Please note: you cannot connect to SMTP port 25 from GCP. Use the alternative ports 465 or 587, or connect via a Serverless VPC Connector to your own mail servers.

Out of the box

Out of the box, you'll have the following functionality:

Input processors

Available input processors are:

  • budget.py: retrieves details from the Cloud Billing Budgets API and presents them to output processors.
  • scc.py: enriches Cloud Security Command Center findings notifications.
  • bigquery.py: queries BigQuery datasets.
  • genericjson.py: parses message data as JSON and presents it to output processors.
  • recommendations.py: retrieves recommendations and insights from the Recommender API.
  • groups.py: retrieves Cloud Identity groups.
  • directory.py: retrieves users, groups, group members and group settings.
  • monitoring.py: retrieves time series data from Cloud Monitoring.
  • projects.py: searches or gets GCP project details.
  • cai.py: fetches assets from Cloud Asset Inventory.

For full documentation of permissions, processor input and output parameters, see PROCESSORS.md.

Please note that the input processors have some IAM requirements to be able to pull information from GCP:

  • Resend mechanism (see below)
    • Storage Object Admin (roles/storage.objectAdmin) on the resend state bucket (an example grant follows this list)
  • Signed URL generation (see filters/strings.py:generate_signed_url)
    • Storage Admin on the bucket (roles/storage.admin)
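
For example, the Storage Object Admin grant on the resend state bucket could be made as follows; this is a sketch only, and the bucket and service account names are placeholders for your own:

gsutil iam ch \
    "serviceAccount:pubsub2inbox@your-project.iam.gserviceaccount.com:roles/storage.objectAdmin" \
    gs://your-resend-bucket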

Output processors

Available output processors are:

  • mail.py: sends HTML and/or text emails via SMTP gateways, SendGrid or the MS Graph API (the Graph API implementation lacks attachment support).
  • gcs.py: creates objects on GCS from any inputs.
  • webhook.py: sends arbitrary HTTP requests, optionally with an added OAuth2 bearer token from GCP.
  • gcscopy.py: copies files between buckets.
  • logger.py: logs messages in Cloud Logging.
  • pubsub.py: sends one or more Pub/Sub messages.
  • bigquery.py: sends output to a BigQuery table via a load job.
  • scc.py: sends findings to Cloud Security Command Center.
  • twilio.py: sends SMS messages via the Twilio API.

Please note that the output processors also have IAM requirements to be able to write to GCP services; see PROCESSORS.md for details.

Configuring Pubsub2Inbox

Pubsub2Inbox is configured through a YAML file (for examples, see the examples/ directory). Input processors are configured under the processors key and outputs under the outputs key.

Features of the specific processors are explained in the corresponding examples.
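
As a rough illustration of the file's overall shape only (the actual fields for each processor and output are documented in PROCESSORS.md and the examples/ directory; the processor and output names below are placeholders), a configuration file could be created like this:

cat > config.yaml <<'EOF'
# Illustrative sketch only; see examples/ for working configurations.
processors:
  - genericjson
outputs:
  - type: logger
EOF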

Retry and resend mechanism

Pubsub2Inbox has two mechanisms to prevent excessive retries and the resending of messages.

The retry mechanism acknowledges and discards any messages that are older than a configured period (retryPeriod in configuration, default 2 days).

The resend mechanism prevents recurring notifications from being sent. It relies on a Cloud Storage bucket where it stores zero-length files that are named by hashing the resendKey (if it is omitted, all template parameters are used). The resend period is configurable through resendPeriod. To prevent the resend bucket from accumulating an unlimited number of files, set an Object Lifecycle Management policy on the bucket.
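
A minimal sketch of such a lifecycle policy, assuming a 30-day expiry and a placeholder bucket name:

cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 30}
    }
  ]
}
EOF

gsutil lifecycle set lifecycle.json gs://your-resend-bucket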

Deploying as Cloud Function

Deploying via Terraform

A sample Terraform module is provided in main.tf, variables.tf and outputs.tf. Pass the following parameters when using it as a module (an example invocation follows the list):

  • project_id (string): where to deploy the function
  • organization_id (number): organization ID (for organization level permissions)
  • function_name (string): name for the Cloud Function
  • function_roles (list(string)): list of curated permission roles for the function (e.g. scc, budgets, bigquery_reader, bigquery_writer, cai, recommender, monitoring)
  • pubsub_topic (string): Pub/Sub topic in the format of projects/project-id/topics/topic-id which the Cloud Function should be triggered on
  • region (string, optional): region where to deploy the function
  • secret_id (string, optional): name for the Cloud Secrets Manager secrets (defaults to function_name)
  • config_file (string, optional): function configuration YAML file location (defaults to config.yaml)
  • service_account (string, optional): service account name for the function (defaults to function_name)
  • bucket_name (string, optional): bucket where to host the Cloud Function archive (defaults to cf-pubsub2inbox)
  • bucket_location (string, optional): location of the bucket for Cloud Function archive (defaults to EU)
  • helper_bucket_name (string, optional): specify an additional Cloud Storage bucket where the service account is granted storage.objectAdmin on
  • function_timeout (number, optional): a timeout for the Cloud Function (defaults to 240 seconds)
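
If you apply the configuration in this repository directly rather than wrapping it in a module block, the variables can also be passed on the command line. A minimal sketch, assuming Terraform is run from the repository root and using placeholder values:

terraform init

terraform apply \
    -var "project_id=your-project" \
    -var "organization_id=123456789012" \
    -var "function_name=pubsub2inbox" \
    -var 'function_roles=["budgets"]' \
    -var "pubsub_topic=projects/your-project/topics/billing-alerts" \
    -var "region=europe-west1"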

Deploying manually

First, we have the configuration in config.yaml, and we are going to store it as a Secret Manager secret for the function.

Let's define some variables first:

export PROJECT_ID=your-project # Project ID where function will be deployed
export REGION=europe-west1 # Where to deploy the functions
export SECRET_ID=pubsub2inbox # Secret Manager secret name
export SA_NAME=pubsub2inbox # Service account name
export SECRET_URL="projects/$PROJECT_ID/secrets/$SECRET_ID/versions/latest"
export FUNCTION_NAME="pubsub2inbox"
export PUBSUB_TOPIC="billing-alerts" # projects/$PROJECT_ID/topics/billing-alerts
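
If the Pub/Sub topic does not exist yet, create it first:

gcloud pubsub topics create $PUBSUB_TOPIC \
    --project $PROJECT_ID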

Then we'll create the secrets in Secret Manager:

gcloud secrets create $SECRET_ID \
    --replication-policy="automatic" \
    --project $PROJECT_ID

gcloud secrets versions add $SECRET_ID \
    --data-file=config.yaml \
    --project $PROJECT_ID
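
Optionally, verify that the stored secret matches your configuration file:

gcloud secrets versions access latest \
    --secret $SECRET_ID \
    --project $PROJECT_ID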

We will also create a service account for the Cloud Function, grant it access to the configuration secret, and allow it to create tokens as itself (needed e.g. for generating signed URLs):

gcloud iam service-accounts create $SA_NAME \
    --project $PROJECT_ID

gcloud secrets add-iam-policy-binding $SECRET_ID \
    --member "serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
    --role "roles/secretmanager.secretAccessor" \
    --project $PROJECT_ID

gcloud iam service-accounts add-iam-policy-binding $SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
    --member "serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
    --role "roles/iam.serviceAccountTokenCreator" \
    --project $PROJECT_ID

Now we can deploy the Cloud Function:

gcloud functions deploy $FUNCTION_NAME \
    --entry-point process_pubsub \
    --runtime python38 \
    --trigger-topic $PUBSUB_TOPIC \
    --service-account "$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
    --set-env-vars "CONFIG=$SECRET_URL" \
    --region $REGION \
    --project $PROJECT_ID
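
Once deployed, you can exercise the function by publishing a test message to the topic and reading the function's logs. The message payload below is only an illustration; use whatever your configured input processor expects:

gcloud pubsub topics publish $PUBSUB_TOPIC \
    --message '{"test": true}' \
    --project $PROJECT_ID

gcloud functions logs read $FUNCTION_NAME \
    --region $REGION \
    --project $PROJECT_ID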

Running tests

Run the command:

# python3 -m unittest discover

To run the tests against a real Cloud project, set the PROJECT_ID environment variable.
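
For example, with a placeholder project ID:

# PROJECT_ID=your-project python3 -m unittest discover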