Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Followup to Paul's comments about debezium-demo over slack. #1932

Merged
merged 2 commits into from
Feb 4, 2022

Conversation

jcferretti
Copy link
Member

The comments were:

===

Vs README_RPM.md

  • pageview_stg - looks good. Only difference is they use double for received_at and we use datetime, which is better.
  • purchases_by_item - looks good
  • pageviews_by_item - looks good
  • item_summary - different column order, but logic looks good
  • profile_views_per_minute_last_10 could be simpler IMO with:
    received_at_minute = lowerBin(received_at, minute_in_nanos)
    profile_views
    looks wrong from their perspective (they say last 5 users per profile, but are selecting 10).
  • We could use tailBy() to make this a lot simpler.
    pf2 = pageviews_stg.sort("target_id","received_at").tailBy(5, "target_id")
  • profile_views_enriched - column order and some column names are different, but otherwise looks good
  • I don't see any data for dd_flagged tables.
  • high_value_users
    Column order is different
    Some column names are different
    Data type of lifetime_value is different (our BigDecimal vs. their int)
    Otherwise, looks good

===

We discussed about data for dd_flagged_tables having to come from a curl invocation scripted in the README_RPM.md original. When Paul did that he found another issue related to the natural join in the definition for dd_flagged_profiles_view throwing an exception due to finding more than one right hand match; the solution is to change the operation from naturalJoin to join.

I addressed column order and simplifications suggested by Paul.

@jcferretti jcferretti requested a review from cpwright February 3, 2022 17:33
@jcferretti jcferretti self-assigned this Feb 3, 2022
@jcferretti jcferretti added this to the Feb 2022 milestone Feb 3, 2022
@jcferretti jcferretti requested a review from pchambre February 3, 2022 17:36
@jcferretti
Copy link
Member Author

jcferretti commented Feb 4, 2022

I had a conversation with Charles over slack about

profile_views = pageviews_stg \
    .view(
        'owner_id = target_id',
        'viewer_id = user_id',
        'received_at'
    ).sort(
        'owner_id',
        'received_at'
    ).tailBy(10, 'owner_id')

He suggested we drop the owner_id from the sort since (1) the result should be the same, and (2) ordering by two columns is slower and will make tailBy's job a lot harder.

So I just did that in the last push.

@cpwright cpwright merged commit 4a34eb8 into deephaven:main Feb 4, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Feb 4, 2022
@jcferretti jcferretti deleted the cfs-debezium-demo-paul-review-0 branch February 18, 2022 15:53
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants