-
Notifications
You must be signed in to change notification settings - Fork 2
/
codebook.txt
executable file
·163 lines (100 loc) · 6.37 KB
/
codebook.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
Title: YouTube data visualisation tool
Author: Mainye Ben
Version: 0.1
This codebook is based on advice by J.Leek which can be found here:
https://github.com/jtleek/datasharing
Description:
The datasets available will be YouTube data collected for the channel Sliceace; It is about
on extreme sports like paintball and skateboarding. The other is channel about gaming and
vlogs among other things. This is how the data was collected from YouTube since i'm
assuming the reader is interested in trying out the tool for themselves/their own channels.
https://photos.app.goo.gl/Ib6haDXJqMC6Sgfk1 - this is a photo album that will show you
how to get your data. What you'll get is a folder, click on the watch_time and rename
the first file like i have yt_date.csv. Prior to renaming the file, it'll look like this
e92OL5XV42NvuV4MSjeggQ.20090911.20170603.UN001.date_csv
e92OL5XV42NvuV4MSjeggQ.20090911.20170603.UN001 this varies depending on your channel.
For the other channel you want to compare you could call it exotic_yt_date.csv. So that
the tool will work on your machine. Replace the files available with your own YouTube data.
After renaming them exactly how i've pointed out.
In addition, the other channel we are using for comparison is monetised meaning that the
content creator gets money out of their channel.
Format:
A Dataframe with 573 rows and 41 columns for the channel Sliceace
A Dataframe with 1526 rows and 54 columns for a channel
Variables:
These are the number of columns in the dataset. Notice they are different for the two
channels. I'll try to explain what they mean to the best of my ability. At first glance,
they look straightforward but right now. I'm already facing challenges since i'm not
sure what some mean i.e the variables.
* date: in year-month-date format. Answers the question when all these other variables
were recorded.
* watch_time_minutes: time in minutes people spend watching your videos. Big numbers good
small numbers bad.
* views : How unique people have discovered your channel and probably looked at the
content like introductory video. P.s my channel doesn't have one. :)
* average_view_duration: The mean views of the previous variable.
* average_percentage_viewed: views as a percentage
* card_clicks: The "i" icon usually at the top right of videos. How many people have clicked
it.
* cards_shown: How many of those cards were seen in your video.
* clicks_per_card_shown: clicks each card gets from different viewers.
* card_teaser_clicks: clicks the card get as soon as the text along with the "i" icon is
displayed to viewer.
* card_teasers_shown: how many viewers have made it till the card text in the "i" is
shown.
* clicks_per_card_teaser_shown: number of times the card has been clicked for different
videos as soon as the "i" icon and text appears.
* end_screen_elements_shown: end screen elements are the subscribe icon seen at the end
of the video, it could be a playlist or playlist. So this is how many that were seen by
the viewer of the video who've watched the video until the end or till when the element
is shown.
* end_screen_element_clicks: number of times the viewers have clicked your screen elements.
* clicks_per_end_screen_element_shown: How much each end screen element per video has been
clicked.
* youtube_red_watch_time_minutes: same as what i pointed out before for more information.
Check this out `https://www.youtube.com/red`
* youtube_red_views: same as what i pointed out before for more information.
Check this out `https://www.youtube.com/red`
* youtube_red_watch_time_hours: same as what i pointed out before except time is in
hours for more information.
Check this out `https://www.youtube.com/red`
* annotation_clicks: I looked at a wikihow post to remember what annotations are here:
https://www.wikihow.com/Add-Annotations-to-a-YouTube-Video. So how many times your
"Add Speech Bubble", "Add Spotlight","Label", "Title" or "Add Note" were clicked.
* clickable_annotations_shown: The number of "Add Speech Bubble", "Add Spotlight","Label", "Title" or "Add Note"
that can be clicked and thus, are shown in the video.
* annotation_click_through_rate: rate at which people click these annotations.
* annotation_closes: How many times the viewers have clicked the "x" icon in annotations.
* closable_annotations_shown: number of annotations that can be closed in your videos.
* annotation_close_rate: rate at which people click close your annotations.
* annotations_shown: How many annotations are seen in your videos.
* videos_published: number of videos that can be seen on a your YouTube channel and
available for view to the public.
* videos_added : number of videos the content creator has added.
* likes: The number of people who clicked on the "thumbs-up" icon.
* likes_added: Seems to cumulative number of likes your videos get.
* likes_removed: The number of people who initially liked the video but decided to unclick
the "thumbs-up" icon. Also, they could have closed their YouTube account.
* dislikes: The number of people who clicked on the "thumbs-down" icon.
* dislikes_added: Seems to cumulative number of dislikes your videos get.
* dislikes_removed: The number of people who initially liked the video but decided to unclick
the "thumbs-down" icon. Also, they could have closed their YouTube account.
* shares: How many times your videos was shared in different social media.
* comments: How many comments you got from your videos.
* videos_in_playlists: Generally the number of videos in your playlists. Don't have to be
your videos.
* videos_added_to_playlists: number of videos added to playlists.
* videos_removed_from_playlists: number of videos removed from a playlist created.
* subscribers: How many subscribers did you have.
* subscribers_gained: Number of subscribers you've gotten.
* subscribers_lost: Number of subscribers you've lost.
* watch_time_hours: time in hours people spend watching your videos. Big numbers good
small numbers bad.
* channel i added this one to try make a hover tool to differentiate channels. This is
simply
If you have any edits that i should know about please submit a pull request.
Data preprocessing:
Didn't do much when it comes to cleaning since it follows the tidy format. All i did is
add a variable called Sliceace for one channel. Furthermore, i converted the date into a
format that works well with Bokeh.
This document was prepared with TextWrangler version 5.5.2 (397016)