Explore this snippet here.
Finding the first value in a column partitioned by some other column is simple in SQL, using the GROUP BY clause. Returning the rest of the columns corresponding to that value is a little trickier. Here's one method, which relies on the RANK window function:
with and_ranking as (
select
*,
rank() over (partition by <partition> order by <ordering>) ranking
from <table>
)
select * from and_ranking where ranking = 1
where
partition
- the column(s) to partition byordering
- the column(s) which determine the ordering defining 'first'table
- the source table
The CTE and_ranking
adds a column to the original table called ranking
, which is the order of that row within the groups defined by partition
. Filtering for ranking = 1
picks out those 'first' rows.
Using total Spotify streams as an example data source, we can pick the rows of the table corresponding to the most-streamed song per day. Filling in the template above:
partition
- this is justday
ordering
- this isstreams desc
table
- this isspotify.spotify_daily_tracks
with and_ranking as (
select
*,
rank() over (partition by day order by streams desc) ranking
from spotify.spotify_daily_tracks
)
select day, title, artist, streams from and_ranking where ranking = 1