The Entire Transcript from Friends in Tidy Format

EmilHvitfeldt/friends

friends

The goal of friends is to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository, which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of JSON files.

Installation

You can install the released version of friends from CRAN with:

install.packages("friends")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("EmilHvitfeldt/friends")

Example

The friends package comes with a couple of datasets. The main one is the friends dataset, a tibble containing all the utterances in the show:

library(friends)

friends
#> # A tibble: 67,373 x 6
#>    text                                speaker    season episode scene utterance
#>    <chr>                               <chr>       <int>   <int> <int>     <int>
#>  1 There's nothing to tell! He's just… Monica Ge…      1       1     1         1
#>  2 C'mon, you're going out with the g… Joey Trib…      1       1     1         2
#>  3 All right Joey, be nice. So does h… Chandler …      1       1     1         3
#>  4 Wait, does he eat chalk?            Phoebe Bu…      1       1     1         4
#>  5 (They all stare, bemused.)          Scene Dir…      1       1     1         5
#>  6 Just, 'cause, I don't want her to … Phoebe Bu…      1       1     1         6
#>  7 Okay, everybody relax. This is not… Monica Ge…      1       1     1         7
#>  8 Sounds like a date to me.           Chandler …      1       1     1         8
#>  9 [Time Lapse]                        Scene Dir…      1       1     1         9
#> 10 Alright, so I'm back in high schoo… Chandler …      1       1     1        10
#> # … with 67,363 more rows

All the utterances are broken down by season, episode, scene and utterance, which allows for very detailed analysis. Note that rows containing scene directions or other non-spoken descriptions have the speaker set to "Scene Directions".
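As a minimal sketch of how the speaker column might be used, the following drops the "Scene Directions" rows and counts the remaining utterances per speaker. It assumes the dplyr package is installed; dplyr and the name `speaker_counts` are not part of this README and are used only for illustration.

```r
library(friends)
library(dplyr)

# Keep only spoken lines. Note that filter() also drops rows where
# speaker is NA, because the comparison evaluates to NA for them.
speaker_counts <- friends %>%
  filter(speaker != "Scene Directions") %>%
  count(speaker, sort = TRUE)  # one row per speaker, most frequent first

head(speaker_counts)
```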

The original data includes emotion and character entity annotation for some of the utterances. These annotations have been included in separate tibbles. These can easily be joined back to the main dataset as needed.

friends_entities
#> # A tibble: 10,557 x 5
#>    season episode scene utterance entities 
#>     <int>   <int> <int>     <int> <list>   
#>  1      1       1     1         2 <chr [1]>
#>  2      1       1     1         3 <chr [2]>
#>  3      1       1     1         4 <chr [1]>
#>  4      1       1     1         8 <chr [1]>
#>  5      1       1     1        31 <chr [2]>
#>  6      1       1     1        33 <chr [1]>
#>  7      1       1     1        34 <chr [1]>
#>  8      1       1     1        38 <chr [1]>
#>  9      1       1     1        42 <chr [4]>
#> 10      1       1     1        44 <chr [1]>
#> # … with 10,547 more rows

friends_emotions
#> # A tibble: 12,606 x 5
#>    season episode scene utterance emotion
#>     <int>   <int> <int>     <int> <chr>  
#>  1      1       1     4         1 Mad    
#>  2      1       1     4         3 Neutral
#>  3      1       1     4         4 Joyful 
#>  4      1       1     4         5 Neutral
#>  5      1       1     4         6 Neutral
#>  6      1       1     4         7 Neutral
#>  7      1       1     4         8 Scared 
#>  8      1       1     4        10 Joyful 
#>  9      1       1     4        11 Joyful 
#> 10      1       1     4        12 Sad    
#> # … with 12,596 more rows
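The annotation tibbles share the season, episode, scene and utterance columns with the main dataset, so one way to join them back is sketched below. This assumes dplyr is installed and that the four columns together identify an utterance; the names `keys`, `friends_with_emotions` and `annotated` are hypothetical.

```r
library(friends)
library(dplyr)

# Columns shared by friends and friends_emotions
keys <- c("season", "episode", "scene", "utterance")

# left_join keeps every utterance; emotion is NA where no annotation exists
friends_with_emotions <- friends %>%
  left_join(friends_emotions, by = keys)

# inner_join keeps only the utterances that have an emotion annotation
annotated <- friends %>%
  inner_join(friends_emotions, by = keys)
```

The same pattern should apply to friends_entities, where the entities column is a list column holding a character vector per utterance.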

There is also a tibble containing episode-specific information such as title, air_date and imdb_rating:

friends_info
#> # A tibble: 236 x 8
#>    season episode title     directed_by  written_by  air_date   us_views_millio…
#>     <int>   <int> <chr>     <chr>        <chr>       <date>                <dbl>
#>  1      1       1 The Pilot James Burro… David Cran… 1994-09-22             21.5
#>  2      1       2 The One … James Burro… David Cran… 1994-09-29             20.2
#>  3      1       3 The One … James Burro… Jeffrey As… 1994-10-06             19.5
#>  4      1       4 The One … James Burro… Alexa Junge 1994-10-13             19.7
#>  5      1       5 The One … Pamela Frym… Jeff Green… 1994-10-20             18.6
#>  6      1       6 The One … Arlene Sanf… Adam Chase… 1994-10-27             18.2
#>  7      1       7 The One … James Burro… Jeffrey As… 1994-11-03             23.5
#>  8      1       8 The One … James Burro… Marta Kauf… 1994-11-10             21.1
#>  9      1       9 The One … James Burro… Jeff Green… 1994-11-17             23.1
#> 10      1      10 The One … Peter Bonerz Adam Chase… 1994-12-15             19.9
#> # … with 226 more rows, and 1 more variable: imdb_rating <dbl>
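Because friends_info is keyed by season and episode, it can be combined with per-episode summaries of the main dataset. The sketch below counts rows per episode and attaches the title and IMDb rating. It assumes dplyr is installed; `utterances_per_episode` and `n_utterances` are illustrative names.

```r
library(friends)
library(dplyr)

utterances_per_episode <- friends %>%
  # Count rows per episode (this includes "Scene Directions" rows)
  count(season, episode, name = "n_utterances") %>%
  # Attach episode metadata using the shared key columns
  left_join(friends_info, by = c("season", "episode")) %>%
  select(season, episode, title, n_utterances, imdb_rating)

head(utterances_per_episode)
```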

Code of Conduct

Please note that the friends project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.