Updated iOS onboarding funnel config to fix the submission_date error #6328

Open
wants to merge 46 commits into
base: rhythmofrain63-patch-1
46 commits
deae84e
Skip generating org_mozilla_ios_focus_derived/baseline_clients_last_s…
alekhyamoz Oct 9, 2024
a329ce9
Bump google-cloud-bigquery from 3.25.0 to 3.26.0 (#6318)
dependabot[bot] Oct 9, 2024
75f4ae5
Bump pre-commit from 3.8.0 to 4.0.1 (#6319)
dependabot[bot] Oct 9, 2024
f4d313c
Bump types-pytz from 2024.2.0.20240913 to 2024.2.0.20241003 (#6317)
dependabot[bot] Oct 9, 2024
1920e18
Fix typo in Bigconfig (#6321)
scholtzan Oct 9, 2024
c0114f4
Generate BigConfig files for views (#6312)
scholtzan Oct 10, 2024
4322b46
Bump exceptiongroup from 1.2.0 to 1.2.2 (#6324)
dependabot[bot] Oct 10, 2024
aba76a0
Bump sqlglot from 25.21.3 to 25.24.5 (#6323)
dependabot[bot] Oct 10, 2024
c657841
Bump types-python-dateutil from 2.9.0.20240906 to 2.9.0.20241003 (#6322)
dependabot[bot] Oct 10, 2024
e4ff38b
replace CAST with SAFE_CAST (#6327)
m-d-bowerman Oct 10, 2024
221cd9b
fix: only generate Airflow task for BigEye if monitoring enabled in t…
kik-kik Oct 10, 2024
102669a
Extend timeout for no output for private-generate-sql step (#6336)
curtismorales Oct 11, 2024
6f53a9a
Bug 1923976 - Change fxci_derived queries to run at 1800 (#6335)
ahal Oct 11, 2024
8950757
Bump pytest-pydocstyle from 2.3.2 to 2.4.0 (#6334)
dependabot[bot] Oct 11, 2024
0f108bb
Bump pandas from 2.2.2 to 2.2.3 (#6333)
dependabot[bot] Oct 11, 2024
8f3e440
Bump bigeye-sdk from 0.4.88 to 0.4.90 (#6332)
dependabot[bot] Oct 11, 2024
dec443d
Add vpnsession and daemonsession pings to vpn events unnested (#6338)
BenWu Oct 11, 2024
5427f88
Bug 1922987 - create syndication file for bugzilla_metrics (#6330)
dklawren Oct 11, 2024
14c2f18
Create shredder_targets_joined table with additional table metadata (…
BenWu Oct 11, 2024
e0f5873
Fix PR checklist link (#6340)
whd Oct 11, 2024
530e987
Fix schema for sumo ga4_events (#6341)
scholtzan Oct 11, 2024
5f7ce6c
Add more memory for private-generate-sql task (#6342)
curtismorales Oct 11, 2024
fe03467
DSRE-1755 switch to v2 workgroup for stripe (#6290)
whd Oct 11, 2024
5e08071
[MC-1458] Add newtab_merino_priors DAG (#6303)
mmiermans Oct 14, 2024
dd2f02b
adding dependecies on upstream tables (#6345)
chelseybeck Oct 14, 2024
5dfb2ef
Set analysis dataset default retention to 180 days (#6346)
whd Oct 14, 2024
405402e
feat: tweak glean_usage generator to include install_source in derive…
kik-kik Oct 15, 2024
da84f23
feat: update fenix and ios feature_usage tables to use the new attrib…
kik-kik Oct 15, 2024
36e4f6c
Bump sqlglot from 25.24.5 to 25.25.0 (#6348)
dependabot[bot] Oct 15, 2024
21be705
Bump mkdocs-material from 9.5.39 to 9.5.40 (#6344)
dependabot[bot] Oct 15, 2024
f60fbfc
Bump black from 24.8.0 to 24.10.0 (#6343)
dependabot[bot] Oct 15, 2024
433b954
feat: add bigconfig.yml to all mobile_kpi_metric tables only (#6337)
kik-kik Oct 15, 2024
c3c5cad
Deploy all BigConfig files at once (#6351)
scholtzan Oct 15, 2024
68ed290
adding referenced tables for upstream dependencies (#6347)
chelseybeck Oct 15, 2024
12677b6
feat(DENG-4577): add monitoring bigeye_usage view to allow us to get …
kik-kik Oct 15, 2024
e886d4c
fix: add missing project id prefix in the bigconfig templates in the …
kik-kik Oct 15, 2024
484b8a9
Bug 1923976 - Create a backfill for 'moz-fx-data-shared-prod.fxci_der…
ahal Oct 15, 2024
0e5b7c3
feat: add install_source to new_profiles selection as baseline ping n…
kik-kik Oct 16, 2024
d066ba0
Bump google-cloud-bigquery-storage[fastavro] from 2.26.0 to 2.27.0 (#…
dependabot[bot] Oct 16, 2024
571f923
Bump mkdocs-material from 9.5.40 to 9.5.41 (#6353)
dependabot[bot] Oct 16, 2024
96b3446
Bump types-requests from 2.32.0.20240914 to 2.32.0.20241016 (#6355)
dependabot[bot] Oct 16, 2024
8b81394
Add table to record GPU hours used (#6295)
chelseatroy Oct 16, 2024
3e5528f
Bug 1923976 - Mark fxci worker_costs_v1 backfill complete (#6359)
ahal Oct 16, 2024
bf391d7
Search Mobile - Fix engine coding from baseline tables (#6357)
alekhyamoz Oct 16, 2024
8d17916
feat(DSRE-1774) Grant Google Ads access to desktop conv events view (…
kwindau Oct 16, 2024
d7c8a12
Bug 1905938 Support events with no metrics in glean_usage generator (…
BenWu Oct 16, 2024
2 changes: 2 additions & 0 deletions .circleci/workflows.yml
@@ -743,6 +743,7 @@ jobs:
-d "{\"dagrun_note\": \"${DAGRUN_NOTE}\", \"dag_id\": \"bqetl_artifact_deployment\"}"
private-generate-sql:
docker: *docker
resource_class: large
steps:
- when:
condition: *deploy
@@ -772,6 +773,7 @@ jobs:
rsync --archive ~/private-bigquery-etl/dags.yaml dags.yaml
- run:
name: Generate SQL content
no_output_timeout: 30m
command: |
mkdir -p /tmp/workspace/private-generated-sql

2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -15,4 +15,4 @@ configured to automatically insert hyperlinks for DSRE and DENG tickets.
See https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/configuring-autolinks-to-reference-external-resources
-->

-**Reviewer, please follow [this checklist](https://github.com/mozilla/bigquery-etl/.github/reviewer_checklist.md)**
+**Reviewer, please follow [this checklist](https://github.com/mozilla/bigquery-etl/blob/main/.github/reviewer_checklist.md)**
213 changes: 155 additions & 58 deletions bigquery_etl/cli/monitoring.py
@@ -12,7 +12,15 @@
from bigeye_sdk.client.enum import Method
from bigeye_sdk.controller.metric_suite_controller import MetricSuiteController
from bigeye_sdk.exceptions.exceptions import FileLoadException
-from bigeye_sdk.model.big_config import BigConfig, TableDeployment, TableDeploymentSuite
+from bigeye_sdk.model.big_config import (
+    BigConfig,
+    ColumnSelector,
+    RowCreationTimes,
+    TableDeployment,
+    TableDeploymentSuite,
+    TagDeployment,
+    TagDeploymentSuite,
+)
from bigeye_sdk.model.protobuf_message_facade import (
SimpleCollection,
SimpleMetricDefinition,
@@ -107,14 +115,6 @@ def deploy(
sql_dir=sql_dir,
project_id=project_id,
)
-mc.execute_bigconfig(
-input_path=[metadata_file.parent / BIGCONFIG_FILE],
-output_path=Path(sql_dir).parent if sql_dir else None,
-apply=True,
-recursive=False,
-strict_mode=True,
-auto_approve=True,
-)

if (metadata_file.parent / VIEW_FILE).exists():
# monitoring to be deployed on a view
@@ -129,6 +129,136 @@ def deploy(
except FileNotFoundError:
print("No metadata file for: {}.{}.{}".format(project, dataset, table))

# Deploy BigConfig files at once.
# Deploying BigConfig files separately can lead to previously deployed metrics being removed.
mc.execute_bigconfig(
input_path=[
metadata_file.parent / BIGCONFIG_FILE
for metadata_file in list(set(metadata_files))
],
output_path=Path(sql_dir).parent if sql_dir else None,
apply=True,
recursive=False,
strict_mode=True,
auto_approve=True,
)
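
The batching in the call above can be sketched in isolation. This is a minimal illustration, not the PR's code: `collect_bigconfig_paths` is a hypothetical helper, the value of `BIGCONFIG_FILE` is assumed, and the paths are made up. It also sorts the result for a stable order, which the `list(set(...))` in the diff does not guarantee.

```python
from pathlib import Path

# Assumed file name, for illustration only; the real constant lives in the repo.
BIGCONFIG_FILE = "bigconfig.yml"


def collect_bigconfig_paths(metadata_files):
    """Return one de-duplicated, sorted list of BigConfig paths.

    Hypothetical helper: the result is meant to be handed to a single
    deploy call instead of issuing one call per metadata file.
    """
    return sorted({Path(m).parent / BIGCONFIG_FILE for m in metadata_files})


# Made-up example inputs, including one duplicate entry.
metadata_files = [
    Path("sql/proj/dataset_a/table_v1/metadata.yaml"),
    Path("sql/proj/dataset_a/table_v1/metadata.yaml"),
    Path("sql/proj/dataset_b/some_view/metadata.yaml"),
]

paths = collect_bigconfig_paths(metadata_files)
print([p.as_posix() for p in paths])
```

Passing the whole list in one call matches the comment in the diff: deploying files one by one can remove metrics that an earlier deploy created.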


def _update_table_bigconfig(
bigconfig,
metadata,
project,
dataset,
table,
):
"""Update the BigConfig file to monitor a table."""
default_metrics = [
SimplePredefinedMetricName.FRESHNESS,
SimplePredefinedMetricName.VOLUME,
]

for collection in bigconfig.table_deployments:
for deployment in collection.deployments:
for metric in deployment.table_metrics:
if metric.metric_type.predefined_metric in default_metrics:
default_metrics.remove(metric.metric_type.predefined_metric)

if metadata.monitoring.collection and collection.collection is None:
collection.collection = SimpleCollection(
name=metadata.monitoring.collection
)

if len(default_metrics) > 0:
deployments = [
TableDeployment(
fq_table_name=f"{project}.{project}.{dataset}.{table}",
table_metrics=[
SimpleMetricDefinition(
metric_type=SimplePredefinedMetric(
type="PREDEFINED", predefined_metric=metric
),
metric_schedule=SimpleMetricSchedule(
named_schedule=SimpleNamedSchedule(
name="Default Schedule - 13:00 UTC"
)
),
)
for metric in default_metrics
],
)
]

collection = None
if metadata.monitoring.collection:
collection = SimpleCollection(name=metadata.monitoring.collection)

bigconfig.table_deployments += [
TableDeploymentSuite(deployments=deployments, collection=collection)
]
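
The core of `_update_table_bigconfig` is a "fill in what is missing" step: defaults that are already configured are dropped, and a deployment is added only for the rest. A minimal sketch of that step follows, using plain strings as stand-ins for `SimplePredefinedMetricName` members; `missing_default_metrics` is a hypothetical name and not part of the PR.

```python
# Plain strings stand in for SimplePredefinedMetricName members (assumption).
DEFAULT_TABLE_METRICS = ("FRESHNESS", "VOLUME")


def missing_default_metrics(configured, defaults=DEFAULT_TABLE_METRICS):
    """Return the default metrics that are not configured yet, in default order.

    Hypothetical helper mirroring the loop in the diff, which removes each
    already-present metric from the list of defaults.
    """
    already_configured = set(configured)
    return [metric for metric in defaults if metric not in already_configured]


print(missing_default_metrics(["FRESHNESS"]))
print(missing_default_metrics([]))
print(missing_default_metrics(["VOLUME", "FRESHNESS", "CUSTOM"]))
```

Building a new list avoids mutating `default_metrics` while iterating over the existing deployments, and an empty result corresponds to the `len(default_metrics) > 0` guard being false.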


def _update_view_bigconfig(
bigconfig,
metadata,
project,
dataset,
table,
):
"""Update the BigConfig file to monitor a view."""
default_metrics = [
SimplePredefinedMetricName.FRESHNESS_DATA,
SimplePredefinedMetricName.VOLUME_DATA,
]

for collection in bigconfig.tag_deployments:
for deployment in collection.deployments:
for metric in deployment.metrics:
if metric.metric_type.predefined_metric in default_metrics:
default_metrics.remove(metric.metric_type.predefined_metric)

if metadata.monitoring.collection and collection.collection is None:
collection.collection = SimpleCollection(
name=metadata.monitoring.collection
)

if len(default_metrics) > 0:
deployments = [
TagDeployment(
column_selectors=[
ColumnSelector(name=f"{project}.{project}.{dataset}.{table}.*")
],
metrics=[
SimpleMetricDefinition(
metric_type=SimplePredefinedMetric(
type="PREDEFINED", predefined_metric=metric
),
metric_schedule=SimpleMetricSchedule(
named_schedule=SimpleNamedSchedule(
name="Default Schedule - 13:00 UTC"
)
),
)
for metric in default_metrics
],
)
]

collection = None
if metadata.monitoring.collection:
collection = SimpleCollection(name=metadata.monitoring.collection)

bigconfig.tag_deployments += [
TagDeploymentSuite(deployments=deployments, collection=collection)
]

bigconfig.row_creation_times = RowCreationTimes(
column_selectors=[
ColumnSelector(
name=f"{project}.{project}.{dataset}.{table}.{metadata.monitoring.partition_column}"
)
]
)
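
The view variant differs from the table variant mainly in how targets are named: a wildcard column selector for the metrics and a single column selector for row creation times. The sketch below only mirrors the f-strings in the diff, including the repeated `project` component, without asserting why it is repeated. The helper names and the example values are hypothetical.

```python
def table_name(project, dataset, table):
    """Fully qualified name in the form the diff uses (project appears twice)."""
    return f"{project}.{project}.{dataset}.{table}"


def all_columns_selector(project, dataset, table):
    """Selector string matching every column of the view."""
    return f"{table_name(project, dataset, table)}.*"


def column_selector(project, dataset, table, column):
    """Selector string for one column, e.g. the partition column."""
    return f"{table_name(project, dataset, table)}.{column}"


# Made-up identifiers for illustration.
print(all_columns_selector("my-project", "my_dataset", "my_view"))
print(column_selector("my-project", "my_dataset", "my_view", "submission_date"))
```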


@monitoring.command(
help="""
@@ -158,55 +288,22 @@ def update(name: str, sql_dir: Optional[str], project_id: Optional[str]) -> None
else:
bigconfig = BigConfig(type="BIGCONFIG_FILE")

-default_metrics = [
-SimplePredefinedMetricName.FRESHNESS,
-SimplePredefinedMetricName.VOLUME,
-]
-
-for collection in bigconfig.table_deployments:
-for deployment in collection.deployments:
-for metric in deployment.table_metrics:
-if metric.metric_type.predefined_metric in default_metrics:
-default_metrics.remove(
-metric.metric_type.predefined_metric
-)
-
-if metadata.monitoring.collection and collection.collection is None:
-collection.collection = SimpleCollection(
-name=metadata.monitoring.collection
-)
-
-if len(default_metrics) > 0:
-deployments = [
-TableDeployment(
-fq_table_name=f"{project}.{project}.{dataset}.{table}",
-table_metrics=[
-SimpleMetricDefinition(
-metric_type=SimplePredefinedMetric(
-type="PREDEFINED", predefined_metric=metric
-),
-metric_schedule=SimpleMetricSchedule(
-named_schedule=SimpleNamedSchedule(
-name="Default Schedule - 17:00 UTC"
-)
-),
-)
-for metric in default_metrics
-],
-)
-]
-
-collection = None
-if metadata.monitoring.collection:
-collection = SimpleCollection(
-name=metadata.monitoring.collection
-)
-
-bigconfig.table_deployments += [
-TableDeploymentSuite(
-deployments=deployments, collection=collection
-)
-]
if (metadata_file.parent / VIEW_FILE).exists():
_update_view_bigconfig(
bigconfig=bigconfig,
metadata=metadata,
project=project,
dataset=dataset,
table=table,
)
else:
_update_table_bigconfig(
bigconfig=bigconfig,
metadata=metadata,
project=project,
dataset=dataset,
table=table,
)
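
The branch above picks the view or table updater depending on whether a view definition sits next to the metadata file. A self-contained sketch of that check follows; the value of `VIEW_FILE` is an assumption, `monitored_kind` is a hypothetical helper, and the directories are created in a temporary folder purely for the demonstration.

```python
import tempfile
from pathlib import Path

# Assumed file name, for illustration only; the real constant lives in the repo.
VIEW_FILE = "view.sql"


def monitored_kind(metadata_file):
    """Return "view" if a view definition exists beside the metadata file, else "table"."""
    return "view" if (Path(metadata_file).parent / VIEW_FILE).exists() else "table"


with tempfile.TemporaryDirectory() as tmp:
    view_dir = Path(tmp) / "my_view"
    table_dir = Path(tmp) / "my_table_v1"
    view_dir.mkdir()
    table_dir.mkdir()
    # Only the view directory gets a view definition file.
    (view_dir / VIEW_FILE).write_text("SELECT 1")

    kinds = (
        monitored_kind(view_dir / "metadata.yaml"),
        monitored_kind(table_dir / "metadata.yaml"),
    )

print(kinds)
```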

bigconfig.save(
output_path=bigconfig_file.parent,
1 change: 1 addition & 0 deletions bigquery_etl/metadata/parse_metadata.py
@@ -478,6 +478,7 @@ class DatasetMetadata:
user_facing: bool = attr.ib(False)
labels: Dict = attr.ib({})
default_table_workgroup_access: Optional[List[Dict[str, Any]]] = attr.ib(None)
default_table_expiration_ms: str = attr.ib(None)
workgroup_access: list = attr.ib(DEFAULT_WORKGROUP_ACCESS)
syndication: Dict = attr.ib({})
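
For a sense of scale of `default_table_expiration_ms`: the `_ms` suffix suggests a duration in milliseconds, and the commit list mentions a 180-day default retention for the analysis dataset. Whether this field carries exactly that value is an assumption here, and `days_to_ms` is a hypothetical helper. The conversion itself is plain arithmetic:

```python
from datetime import timedelta


def days_to_ms(days):
    """Convert whole days to milliseconds using exact timedelta integer division."""
    return timedelta(days=days) // timedelta(milliseconds=1)


# The field in the diff is annotated as str, so the sketch stringifies the value.
expiration_ms = str(days_to_ms(180))
print(expiration_ms)  # 15552000000
```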
