Dev/incremental media model #2

Merged

6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@

target/
dbt_modules/
dbt_packages/
logs/
.DS_Store
97 changes: 96 additions & 1 deletion README.md
@@ -1,2 +1,97 @@
[![early-release]][tracker-classificiation] [![License][license-image]][license] [![Discourse posts][discourse-image]][discourse]

![snowplow-logo](https://raw.githubusercontent.com/snowplow/dbt-snowplow-utils/main/assets/snowplow_logo.png)

# dbt-snowplow-media-player
A fully incremental model, that transforms media player event data generated by the Snowplow JavaScript tracker into derived tables for easier querying

A fully incremental model that transforms media player event data, generated by the Snowplow [JavaScript tracker][javascript-tracker] in combination with media-tracking plugins such as the [Media Tracking plugin][media-tracking] or the [YouTube Tracking plugin][youtube-tracking], into derived tables for easier querying. The package is built on top of the [dbt-snowplow-web package][dbt-snowplow-web], which it uses as the basis for the incremental update. It is therefore designed to be run together with the web model, much as a custom module would.

Please refer to the [doc site][snowplow-media-player-docs] for a full breakdown of the package.

### Adapter Support

The snowplow-media-player v0.1.0 package currently supports Redshift & Postgres.

| warehouse | dbt versions | snowplow-web version | snowplow-media-player version |
|:------------------------:|:-------------------:|:--------------------:|:-----------------------------:|
| Redshift & Postgres | >=0.20.0 to <1.1.0 | >=0.6.0 to <0.7.0 | 0.1.0 |

### Requirements

- A dataset of media-player web events from the [Snowplow JavaScript tracker][tracker-docs] must be available in the database. For this, at least one of the JavaScript-based media tracking plugins needs to be enabled: the [Media Tracking plugin][media-tracking] or the [YouTube Tracking plugin][youtube-tracking].
- Have the [`webPage` context][webpage-context] enabled.
- Have the [media-player event schema][media-player-event-schema] enabled.
- Have the [media-player context schema][media-player-context-schema] enabled.
- Depending on the plugin / intention, have all the relevant contexts from below enabled:
  - in case of embedded YouTube tracking: Have the [YouTube specific context schema][youtube-specific-context-schema] enabled.
  - in case of HTML5 audio or video tracking: Have the [HTML5 media element context schema][html5-media-element-context-schema] enabled.
  - in case of HTML5 video tracking: Have the [HTML5 video element context schema][html5-video-element-context-schema] enabled.

### Installation

Check dbt Hub for the latest installation instructions, or read the [dbt docs][dbt-package-docs] for more information on installing packages.
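
As a rough sketch, assuming the package is published on dbt Hub under the name `snowplow/snowplow_media_player` (the Hub name is an assumption; check dbt Hub for the exact entry), a `packages.yml` entry might look like this:

```yml
# packages.yml (in your own dbt project)
packages:
  - package: snowplow/snowplow_media_player  # assumed Hub name, verify on dbt Hub
    version: 0.1.0                            # the version this README describes
```

Running `dbt deps` afterwards installs the packages listed in `packages.yml`.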

### Configuration & Operation

Please refer to the [doc site][snowplow-media-player-docs] for extensive details on how to configure and run the package.
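
As an illustration only, package variables can typically be overridden from your own project's `dbt_project.yml`. The variable names below are taken from the package's `dbt_project.yml`; the values are arbitrary examples, not recommendations, and the doc site remains the reference for what each variable controls:

```yml
# dbt_project.yml (in your own dbt project)
vars:
  snowplow_media_player:  # scope the overrides to this package
    snowplow__percent_progress_boundaries: [20, 40, 60, 80]  # illustrative values
    snowplow__valid_play_sec: 15                              # illustrative value
```

Scoping the overrides under the package name keeps them from leaking into other packages that might define variables with the same names.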

### Models

The package contains multiple staging models; the mart models are as follows:

| Model | Description |
|------------------------------------------|--------------------------------------------------------------------------------------------|
| snowplow_media_player_base | A table summarising media player events by media and pageview including impressions. |
| snowplow_media_player_plays_by_pageview | A view summarising media plays by media on a pageview level. |
| snowplow_media_player_media_stats | An aggregated table of media metrics on a media_id level. |

# Join the Snowplow community

We welcome all ideas, questions and contributions!

For support requests, please use our community support [Discourse][discourse] forum.

If you find a bug, please report an issue on GitHub.

# Copyright and license

The snowplow-media-player package is Copyright 2022 Snowplow Analytics Ltd.

Licensed under the [Apache License, Version 2.0][license] (the "License");
you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

[license]: http://www.apache.org/licenses/LICENSE-2.0
[license-image]: http://img.shields.io/badge/license-Apache--2-blue.svg?style=flat
[tracker-classificiation]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/tracker-maintenance-classification/
[early-release]: https://img.shields.io/static/v1?style=flat&label=Snowplow&message=Early%20Release&color=014477&labelColor=9ba0aa&logo=

[tracker-docs]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/

[webpage-context]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracker-setup/initialization-options/#Adding_predefined_contexts

[media-player-event-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/media_player_event/jsonschema/1-0-0
[media-player-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.snowplowanalytics.snowplow/media_player/jsonschema/1-0-0
[youtube-specific-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/com.youtube/youtube/jsonschema/1-0-0
[html5-media-element-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/org.whatwg/media_element/jsonschema/1-0-0
[html5-video-element-context-schema]: https://github.com/snowplow/iglu-central/blob/master/schemas/org.whatwg/video_element/jsonschema/1-0-0

[media-tracking]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/plugins/media-tracking/

[javascript-tracker]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3

[youtube-tracking]: https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/plugins/youtube-tracking/

[dbt-package-docs]: https://docs.getdbt.com/docs/building-a-dbt-project/package-management

[discourse-image]: https://img.shields.io/discourse/posts?server=https%3A%2F%2Fdiscourse.snowplowanalytics.com%2F
[discourse]: http://discourse.snowplowanalytics.com/

[snowplow-media-player-docs]: https://snowplow.github.io/dbt-snowplow-media-player/#!/overview/snowplow_media_player

[dbt-snowplow-web]: https://hub.getdbt.com/snowplow/snowplow_web/latest/
Empty file added analyses/.gitkeep
Empty file.
Empty file added data/.gitkeep
Empty file.
41 changes: 41 additions & 0 deletions dbt_project.yml
@@ -0,0 +1,41 @@
name: 'snowplow_media_player'
version: '0.1.0'
config-version: 2
require-dbt-version: ">=1.0.0"

profile: 'default'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
docs-paths: ["docs"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"

vars:
  snowplow__percent_progress_boundaries: [10, 25, 50, 75]
  snowplow__valid_play_sec: 30
  snowplow__complete_play_rate: 0.99
  snowplow__max_media_pv_window: 10

models:
  snowplow_media_player:
    +bind: false
    +materialized: view
    web:
      +schema: "derived"
      +tags: "snowplow_web_incremental"
      +enabled: true
      scratch:
        +schema: "scratch"
        +tags: "scratch"
    custom:
      +schema: "scratch"
      +tags: "snowplow_web_incremental"
      +enabled: false
Contributor

This enabled should be triggered by a variable, which should be defined in your vars and in the documentation so that users can easily enable/disable these custom tables, right? Or how else would we expect users to build off of them?

Contributor Author

This is what I always do for testing: you just override it in your own dbt project's dbt_project.yml file. It is explained in the docs, so users should be OK, I think (?):

By default these are disabled, but you can enable them in the project's dbt_project.yml, if needed.

```yml
# dbt_project.yml
...
models:
  snowplow_media_player:
    custom:
      enabled: true
```

Contributor

Fair enough, although I do think we should move it behind a variable once this dbt bug is fixed (dbt-labs/dbt-core#3698), to align with the web and mobile packages and their optional modules. But it's good for now!
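
One possible shape for the variable-gated approach discussed in this thread, as a minimal sketch only: `snowplow__enable_custom_example` is a hypothetical variable name that appears nowhere in this PR, and whether the pattern works in practice depends on the dbt issue referenced in the thread.

```yml
# dbt_project.yml of the package (sketch, not part of this PR)
vars:
  snowplow__enable_custom_example: false  # hypothetical variable, disabled by default

models:
  snowplow_media_player:
    custom:
      +schema: "scratch"
      +tags: "snowplow_web_incremental"
      # as_bool turns the rendered string into a real boolean
      +enabled: "{{ var('snowplow__enable_custom_example', false) | as_bool }}"
```

With something like this in place, a user would flip the variable in their own project's `vars:` block rather than overriding the model config directly.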

Empty file added docs/.nojekyll
Empty file.
1 change: 1 addition & 0 deletions docs/catalog.json

Large diffs are not rendered by default.
