In order to refactor atomic events we need to extract all non-generic information from a fat table into dedicated contexts and preserve only common properties. As a first step, we can keep those properties both in the atomic event (as we do now, so as not to break data models) and in their dedicated tables/columns (to start writing new data models).
I tried to summarize what contexts and event-specific properties can be extracted out of Event:
app_id
platform
etl_tstamp
collector_tstamp
dvce_created_tstamp
event
event_id
txn_id
name_tracker
v_tracker
v_collector
v_etl
user_id
user_ipaddress
user_fingerprint
domain_userid
domain_sessionidx
network_userid
geo_country - MaxMind context
geo_region - MaxMind context
geo_city - MaxMind context
geo_zipcode - MaxMind context
geo_latitude - MaxMind context
geo_longitude - MaxMind context
geo_region_name - MaxMind context
ip_isp - MaxMind context
ip_organization - MaxMind context
ip_domain - MaxMind context
ip_netspeed - MaxMind context
page_url - Web page context (source of truth)
page_title - Web page context (source of truth)
page_referrer - Referrer context (source of truth)
page_urlscheme - Web page context
page_urlhost - Web page context
page_urlport - Web page context
page_urlpath - Web page context
page_urlquery - Web page context
page_urlfragment - Web page context
refr_urlscheme - Referrer context
refr_urlhost - Referrer context
refr_urlport - Referrer context
refr_urlpath - Referrer context
refr_urlquery - Referrer context
refr_urlfragment - Referrer context
refr_medium - Referrer context
refr_source - Referrer context
refr_term - Referrer context
mkt_medium - Marketing campaign context
mkt_source - Marketing campaign context
mkt_term - Marketing campaign context
mkt_content - Marketing campaign context
mkt_campaign - Marketing campaign context
contexts
se_category - Struct event self-describing event
se_action - Struct event self-describing event
se_label - Struct event self-describing event
se_property - Struct event self-describing event
se_value - Struct event self-describing event
unstruct_event
tr_orderid - Ecommerce transaction self-describing event
tr_affiliation - Ecommerce transaction self-describing event
tr_total - Ecommerce transaction self-describing event
tr_tax - Ecommerce transaction self-describing event
tr_shipping - Ecommerce transaction self-describing event
tr_city - Ecommerce transaction self-describing event
tr_state - Ecommerce transaction self-describing event
tr_country - Ecommerce transaction self-describing event
ti_orderid - Ecommerce transaction item context
ti_sku - Ecommerce transaction item context
ti_name - Ecommerce transaction item context
ti_category - Ecommerce transaction item context
ti_price - Ecommerce transaction item context
ti_quantity - Ecommerce transaction item context
pp_xoffset_min - Page ping self-describing event
pp_xoffset_max - Page ping self-describing event
pp_yoffset_min - Page ping self-describing event
pp_yoffset_max - Page ping self-describing event
useragent - Browser context (but populated from different places)
br_name - Browser context (but populated from different places) (ua-utils)
br_family - Browser context (but populated from different places) (ua-utils)
br_version - Browser context (but populated from different places) (ua-utils)
br_type - Browser context (but populated from different places) (ua-utils)
br_renderengine - Browser context (but populated from different places) (ua-utils)
br_lang - Browser context (but populated from different places)
br_features_pdf - Browser context (but populated from different places)
br_features_flash - Browser context (but populated from different places)
br_features_java - Browser context (but populated from different places)
br_features_director - Browser context (but populated from different places)
br_features_quicktime - Browser context (but populated from different places)
br_features_realplayer - Browser context (but populated from different places)
br_features_windowsmedia - Browser context (but populated from different places)
br_features_gears - Browser context (but populated from different places)
br_features_silverlight - Browser context (but populated from different places)
br_cookies - Browser context (but populated from different places)
br_colordepth - Browser context (but populated from different places)
br_viewwidth - Browser context (but populated from different places)
br_viewheight - Browser context (but populated from different places)
os_name - Browser context (but populated from different places) (ua-utils)
os_family - Browser context (but populated from different places) (ua-utils)
os_manufacturer - Browser context (but populated from different places)
os_timezone - Browser context (but populated from different places)
dvce_type - Browser context (but populated from different places) (ua-utils)
dvce_ismobile - Browser context (but populated from different places) (ua-utils)
dvce_screenwidth - Browser context (but populated from different places)
dvce_screenheight - Browser context (but populated from different places)
doc_charset - Web page (or document) context
doc_width - Web page (or document) context
doc_height - Web page (or document) context
tr_currency - Ecommerce transaction self-describing event
tr_total_base - Ecommerce transaction self-describing event
tr_tax_base - Ecommerce transaction self-describing event
tr_shipping_base - Ecommerce transaction self-describing event
ti_currency - Ecommerce transaction item context
ti_price_base - Ecommerce transaction item context
base_currency - Ecommerce transaction self-describing event
geo_timezone - MaxMind context
mkt_clickid - Marketing campaign context
mkt_network - Marketing campaign context
etl_tags
dvce_sent_tstamp
refr_domain_userid - Referrer context
refr_dvce_tstamp - Referrer context
derived_contexts
domain_sessionid
derived_tstamp
event_vendor
event_name
event_format
event_version
event_fingerprint - This should remain in canonical event
true_tstamp
Their grouping is not very semantic, but should be based mostly on the info source: e.g. although browser and device info is semantically the same kind of information, some of the properties are passed through the tracker protocol and some are derived through user-agent enrichment.
Contexts
MaxMind context
Web page context
Referrer context
Marketing campaign context
Ecommerce transaction item
Browser/device context (potentially multiple of them)
Self-describing events
Struct event
Ecommerce transaction
Page ping
Common properties
This leaves us with 31 core properties that can be set for almost all events/pipelines. Maybe some of them (user/device identification) can/should be moved into dedicated contexts.
event_id - event identification
app_id - event identification
event - eventually will be discarded in favor of vendor/name/version
txn_id - event identification
event_vendor - event identification
event_name - event identification
event_format - event identification
event_version - event identification
event_fingerprint - event identification
platform - probably should be moved as well
dvce_created_tstamp - timestamps
dvce_sent_tstamp - timestamps
collector_tstamp - timestamps
etl_tstamp - timestamps
derived_tstamp - timestamps
true_tstamp - timestamps
user_id - user/device identification
user_ipaddress - user/device identification
user_fingerprint - user/device identification
domain_userid - user/device identification
domain_sessionidx - user/device identification
domain_sessionid - user/device identification
network_userid - user/device identification
name_tracker - pipeline/aux
v_tracker - pipeline/aux
v_collector - pipeline/aux
v_etl - pipeline/aux
etl_tags - pipeline/aux
unstruct_event - payload
contexts - payload
derived_contexts - payload
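To make the proposed split more concrete, here is a minimal Python sketch of routing flat atomic fields into dedicated contexts. It is only an illustration under stated assumptions: the context keys (`maxmind`, `web_page`, and so on), the prefix table, and the `split_event` function are hypothetical names, not part of any existing pipeline code, and the prefix table covers only some of the groups listed above. The `keep_legacy` flag models the first step described at the top, where extracted properties stay in the atomic event as well.

```python
from collections import defaultdict

# Hypothetical prefix -> context routing; covers only part of the field list.
PREFIX_TO_CONTEXT = {
    "geo_": "maxmind",
    "ip_": "maxmind",
    "page_": "web_page",
    "doc_": "web_page",
    "refr_": "referrer",
    "mkt_": "marketing_campaign",
    "ti_": "ecommerce_transaction_item",
}

# Fields whose prefix points at the wrong group.
OVERRIDES = {"page_referrer": "referrer"}


def context_for(field):
    """Return the hypothetical context name for a field, or None if it is core."""
    if field in OVERRIDES:
        return OVERRIDES[field]
    for prefix, context in PREFIX_TO_CONTEXT.items():
        if field.startswith(prefix):
            return context
    return None


def split_event(event, keep_legacy=False):
    """Split a flat event dict into (core properties, contexts).

    With keep_legacy=True the extracted fields also stay in the core dict,
    so existing data models keep working during the transition.
    """
    core = {}
    contexts = defaultdict(dict)
    for field, value in event.items():
        target = context_for(field)
        if target is None or keep_legacy:
            core[field] = value
        if target is not None and value is not None:
            contexts[target][field] = value
    return core, dict(contexts)


example = {
    "event_id": "e1",
    "app_id": "shop",
    "user_ipaddress": "203.0.113.7",
    "geo_country": "DE",
    "page_url": "https://example.com/",
    "page_referrer": "https://ref.example/",
    "mkt_source": None,
}
core, contexts = split_event(example)
legacy_core, legacy_contexts = split_event(example, keep_legacy=True)
```

Empty (`None`) values are simply not copied into a context here; whether an all-empty context should be emitted at all is a design decision this sketch leaves open.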