This repository contains the code to build a pipeline that scans objects uploaded to GCS for malware, moving the documents to a clean or quarantined bucket depending on the malware scan status.
This project was forked from GoogleCloudPlatform/docker-clamav-malware-scanner in order to use Cloud Pub/Sub in place of Eventarc, as well as add Nuvalence-specific configurations.
- 3 GCS buckets for `unscanned`, `scanned`, and `quarantined` files.
- Pub/Sub topic + pull subscription for notifying workers of uploaded files
- GCS upload notification for publishing messages to Pub/Sub:

      gsutil notification create -t my-upload-topic -e OBJECT_FINALIZE -f json gs://my-unscanned-bucket

- GKE Cluster
- Google Artifact Registry (Docker) repository, named `application` in the example below
- Build and push the images in the `images/` directory:

      PROJECT=my-project
      REGION=us-east4
      GAR_REPO=application
      # Build and push one image per subdirectory of images/
      for dir in images/*/; do
        name=$(basename "$dir")
        image="$REGION-docker.pkg.dev/$PROJECT/$GAR_REPO/$name"
        docker build -t "$image" -f "${dir}Dockerfile" "$dir"
        docker push "$image"
      done

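
The infrastructure listed among the prerequisites (three buckets, a topic, a pull subscription, and the upload notification) could be created along the following lines. This is a minimal sketch, not the project's own tooling: the bucket prefix, topic name, and subscription name are hypothetical placeholders, and the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: bucket, topic, and subscription names below are hypothetical.
set -eu

PROJECT=${PROJECT:-my-project}
REGION=${REGION:-us-east4}
BUCKET_PREFIX=${BUCKET_PREFIX:-my-bucket}
TOPIC=${TOPIC:-my-upload-topic}
SUBSCRIPTION=${SUBSCRIPTION:-my-upload-subscription}

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

provision() {
  # One bucket per scan state.
  for suffix in unscanned scanned quarantined; do
    run gsutil mb -p "$PROJECT" -l "$REGION" "gs://$BUCKET_PREFIX-$suffix"
  done
  # Topic and pull subscription that the workers read from.
  run gcloud pubsub topics create "$TOPIC" --project "$PROJECT"
  run gcloud pubsub subscriptions create "$SUBSCRIPTION" \
    --topic "$TOPIC" --project "$PROJECT"
  # Publish a message for every object finalized in the unscanned bucket.
  run gsutil notification create -t "$TOPIC" -e OBJECT_FINALIZE -f json \
    "gs://$BUCKET_PREFIX-unscanned"
}

provision
```

Review the printed commands first, then re-run the script with `DRY_RUN=0` to apply them.
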
- Deploy to GKE with Helm. Create a `my-values.yaml` with appropriate values:

      name: clamav-malware-scanner
      environment: "dev"
      replicaCount: 1
      gcp:
        project: my-project
        region: us-east4
        gar:
          repo: application
      worker:
        env:
          UNSCANNED_BUCKET: my-bucket-unscanned
          CLEAN_BUCKET: my-bucket-scanned
          QUARANTINED_BUCKET: my-bucket-quarantined
          UPLOAD_TOPIC: projects/my-project/topics/my-topic
          CLAMD_HOST: 127.0.0.1
          CLAMD_PORT: 3310

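  Then fetch the cluster credentials and install the chart:

      PROJECT=my-project
      REGION=us-east4
      # The cluster is named after the project in this example
      gcloud container clusters get-credentials "$PROJECT" --region "$REGION"
      helm upgrade -f my-values.yaml --install clamav-scanner \
        k8s/charts/clamav-scanner -n [namespace]
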
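
Individual values can also be overridden on the command line instead of editing the values file, since Helm applies `--set` on top of files passed with `-f`. The sketch below assumes a hypothetical namespace and a hypothetical staging override of the `environment` and `replicaCount` keys, then checks the release with `helm status` and `kubectl get pods`. The `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: the namespace and override values are hypothetical.
set -eu

NAMESPACE=${NAMESPACE:-clamav}
ENVIRONMENT=${ENVIRONMENT:-staging}
REPLICAS=${REPLICAS:-2}

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

deploy() {
  # --set takes precedence over the same keys in my-values.yaml.
  run helm upgrade -f my-values.yaml --install clamav-scanner \
    k8s/charts/clamav-scanner -n "$NAMESPACE" \
    --set "environment=$ENVIRONMENT" \
    --set "replicaCount=$REPLICAS"
  # Confirm the release exists and list the pods in the namespace.
  run helm status clamav-scanner -n "$NAMESPACE"
  run kubectl get pods -n "$NAMESPACE"
}

deploy
```
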
There are 2 files in `static/samples/`: a CLEAN file and an INFECTED file with a fake malware signature. Upload them to the unscanned bucket while observing the `node-worker` logs. `eicar-anti-malware-testfile.txt` should report INFECTED and the other should not.
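
The upload and a follow-up check of the destination buckets can be scripted roughly as follows. This is a sketch that assumes the bucket names from the example values file; the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set. Scanning is asynchronous, so the listings may need to be repeated after a short wait.

```shell
#!/bin/sh
# Sketch only: bucket names are taken from the example values file.
set -eu

UNSCANNED_BUCKET=${UNSCANNED_BUCKET:-my-bucket-unscanned}
CLEAN_BUCKET=${CLEAN_BUCKET:-my-bucket-scanned}
QUARANTINED_BUCKET=${QUARANTINED_BUCKET:-my-bucket-quarantined}

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

upload_samples() {
  # The pattern is quoted so that it reaches gsutil, which expands
  # wildcards in local paths itself.
  run gsutil cp "static/samples/*" "gs://$UNSCANNED_BUCKET/"
}

list_results() {
  # The clean sample should end up in the first bucket and
  # eicar-anti-malware-testfile.txt in the second.
  run gsutil ls "gs://$CLEAN_BUCKET/"
  run gsutil ls "gs://$QUARANTINED_BUCKET/"
}

upload_samples
list_results
```
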
- 2019-09-01 Initial version
- 2020-10-05 Fixes for ClamAV OOM
- 2021-10-14 Use Cloud Run and Eventarc instead of Cloud Functions/App Engine
- 2021-10-22 Improve resiliency, use streaming reads (no temp disk required), improve logging, and handle files in subdirectories
- 2021-11-08 Add support for scanning multiple buckets, improve error handling to prevent infinite retries
- 2021-11-22 Remove requirement for Project Viewer permissions.
- 2022-02-22 Fix node-forge vulnerability.
- 2022-03-01 Support larger file sizes (up to 500MiB)
- 2022-06-07 Fix issue where clamav cannot update itself on container start
- 2022-07-09 Refactor into separate containers for GKE, add clamd env vars
Copyright 2021 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.