
GCP package billing datastream tests #2312

Closed
wants to merge 18 commits into from
Conversation

endorama
Member

@endorama endorama commented Dec 3, 2021

What does this PR do?

Adds system tests for the billing data stream in the gcp package.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • [ ]

How to test this PR locally

Run elastic-package test system --data-streams billing -v from within the gcp package.
Note: this does not work yet, as the Terraform Docker image used for running tests does not include Go.
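For convenience, a minimal sketch of the invocation, assuming a local checkout of the integrations repository (the packages/gcp path is an assumption; adjust to your checkout):

```sh
# Run the billing system tests from inside the gcp package directory.
cd packages/gcp
elastic-package test system --data-streams billing -v
```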

Related issues

Screenshots

@elasticmachine

elasticmachine commented Dec 3, 2021

💔 Build Failed

The badges below are clickable and redirect to their specific view in the CI or docs:
Pipeline View · Test View · Changes · Artifacts · preview


Build stats

  • Start Time: 2021-12-20T14:34:53.635+0000

  • Duration: 80 min 57 sec

  • Commit: 2cf85e7

Test stats 🧪

Test Results
  • Failed: 0
  • Passed: 3511
  • Skipped: 5
  • Total: 3516

Steps errors 1


Check integration: gcp
  • Took 0 min 1 sec. View more details here
  • Description: ../../build/elastic-package check -v

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@@ -0,0 +1,45 @@
import csv
Member Author

I will remove this script, as the bq CLI tool makes it easy to export in JSON format (which is not possible from the GCP console).
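For reference, a hedged sketch of such an export with bq; the project, dataset, and table names are hypothetical:

```sh
# Export query results as JSON with the bq CLI; names below are hypothetical.
bq query --use_legacy_sql=false --format=prettyjson \
  'SELECT * FROM `my-project.billing.gcp_billing_export_v1` LIMIT 100' \
  > billing_sample.json
```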

Contributor

Let me ask a few questions:

  1. What is missing in the Terraform Docker image, so that you wouldn't have to add any Go source code here?
  2. Do you prefer to add the bq tool to the image, or to include a special script here that installs the bq tool at runtime? For example: an init.sh with yum install bq (probably not so simple); a rough sketch of this option follows below.

Both approaches require modifications to elastic-package, but I think it is worth it. Let me know what you think.
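For illustration only, a runtime-install init.sh might look roughly like this; the installer URL is Google's documented SDK installer, but nothing here reflects what elastic-package actually ships:

```sh
#!/bin/sh
# Hypothetical init.sh for the Terraform deployer container: installs the
# Google Cloud SDK (which bundles bq) at container start. Assumes an
# Alpine base; the installer script itself requires bash and python.
set -e
apk add --no-cache bash curl python3
curl -sSL https://sdk.cloud.google.com | bash -s -- --disable-prompts
export PATH="$PATH:$HOME/google-cloud-sdk/bin"
bq version   # sanity check that bq is now on the PATH
```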

Member Author

  1. Nothing. The issue is with the Terraform Google provider, which does not support loading data into BigQuery. The proposed solution is to load it either through the bq tool or through the BigQuery SDK, so the image needs either a programming language runtime or the GCP Cloud SDK CLI (a sketch of the bq route follows this list).
  2. I would prefer it in the image, so that every test uses the same base image, similar to how GitHub Actions runners guarantee a tooling baseline through preinstalled software.
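Concretely, the bq route for loading test data might look like this sketch; the dataset, table, and file names are hypothetical:

```sh
# Create a throwaway dataset and load newline-delimited JSON test fixtures,
# letting bq infer the schema.
bq mk --dataset my-project:billing_test
bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON \
  my-project:billing_test.gcp_billing_export_v1 \
  ./billing_sample.json
```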

About point 2 there are some considerations to make:

  • tests should be able to use different base image versions, to support different toolchains; for example, a toolchain for some database or cloud provider with breaking changes (think AWS CLI v1 vs v2); this version must be selectable by test developers
  • security-wise, it is easier to scan for vulnerabilities if the baseline image does not change, and developers/contributors cannot insert vulnerable software as easily
  • the disadvantage is flexibility, as developers cannot just add some init code but have to discuss it first; I think this limitation is actually a positive forcing function, preventing each integration from setting up CI differently, which would result in an exponential combination of CI configurations

Contributor

Yes, the security constraint is important here, so I think we could add the bq tool to the Terraform image.

Regarding different toolchains, let's focus on solving your problem for the moment; we don't need to invent a super-flexible option on day 1.

If you're happy to work on this, feel free to open a pull request against elastic-package or report an issue there. As you wish :)

Member Author

I will work on it. Just one question: should I simply change Dockerfile.terraform_deployer?

Contributor

Well, that's the goal of the hashicorp/terraform:light image: not to bring in heavy dependencies or the entire SDK. Could you please try to extend terraform:light with a Python distribution? It shouldn't be a problem.
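A minimal sketch of that direction, assuming terraform:light is Alpine-based; this is not the actual Dockerfile.terraform_deployer change:

```sh
# Written as shell for consistency with the other examples; the Dockerfile
# content is emitted via a heredoc.
cat > Dockerfile <<'EOF'
FROM hashicorp/terraform:light
# terraform:light is Alpine-based, so apk is available.
RUN apk add --no-cache python3 py3-pip
EOF
docker build -t terraform-deployer-python .
```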

Member Author

I tried adding Python to the Alpine image, but I went down the system-dependencies rabbit hole. In the end, the "light" Terraform image is not light anymore.

I tried approaching this the other way round, starting from google/cloud-sdk:latest and adding Terraform, which is easier. Could this be a viable route?

Contributor

I wouldn't recommend depending on the Google image, as we may end up with a similar request for other SDKs (AWS, Azure).

Did you try using python:3-alpine, or alpine with apk add --update python3 py3-pip?
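For comparison, a hedged sketch of the python:3-alpine direction; the pinned Terraform version and the choice of installing the BigQuery client library (rather than the full Cloud SDK) are assumptions:

```sh
cat > Dockerfile <<'EOF'
FROM python:3-alpine
# Terraform version pinned arbitrarily for this sketch.
ARG TF_VERSION=1.1.9
RUN apk add --no-cache curl unzip \
 && curl -sSLo /tmp/terraform.zip \
      "https://releases.hashicorp.com/terraform/${TF_VERSION}/terraform_${TF_VERSION}_linux_amd64.zip" \
 && unzip /tmp/terraform.zip -d /usr/local/bin \
 && rm /tmp/terraform.zip \
 && pip install --no-cache-dir google-cloud-bigquery
EOF
docker build -t terraform-deployer-gcp .
```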

Member Author

I tried the latter. I will try again, starting from python:3-alpine and adding things on top of that. The AWS and Azure CLIs are Python-based too!

Member Author

The proper PR adding gcloud SDK support is available at elastic/elastic-package#638.

@endorama
Member Author

While testing, I discovered that the gcp.billing metricset in Metricbeat does not support the credentials_json parameter. As that is the one I've seen relied upon to pass credentials, I'm focusing on adding that functionality.

Project ID should be specified via the proper CLI flag instead of being part of the table name.
The test file should contain data in a specified time range so that it is collected by Beats and is useful for testing.
@endorama endorama changed the title Gcp billing tests GCP package billing datastream tests Jan 20, 2022
@endorama endorama force-pushed the gcp-metrics branch 3 times, most recently from 8cf80ec to fe54af3 Compare March 22, 2022 15:14
@endorama endorama force-pushed the gcp-metrics branch 2 times, most recently from edc3905 to 826b985 Compare July 13, 2022 07:54
@endorama endorama deleted the branch elastic:gcp-metrics July 14, 2022 08:49
@endorama endorama closed this Jul 14, 2022
Labels
Integration:gcp Google Cloud Platform

5 participants