TidyTuesday - Animal Crossing

Intro

Hi! It’s Tidy Tuesday time! This week’s data is about Animal Crossing. It contains user and critic reviews, which are what I will be examining today, as well as information about in-game items and in-game characters. The game is a lot of fun, so that should make the analysis fun too!

Check out the data!

head(critic)
## # A tibble: 6 x 4
##   grade publication    text                                           date      
##   <dbl> <chr>          <chr>                                          <date>    
## 1   100 Pocket Gamer … Animal Crossing; New Horizons, much like its … 2020-03-16
## 2   100 Forbes         Know that if you’re overwhelmed with the worl… 2020-03-16
## 3   100 Telegraph      With a game this broad and lengthy, there’s m… 2020-03-16
## 4   100 VG247          Animal Crossing: New Horizons is everything I… 2020-03-16
## 5   100 Nintendo Insi… Above all else, Animal Crossing: New Horizons… 2020-03-16
## 6   100 Trusted Revie… Animal Crossing: New Horizons is the best gam… 2020-03-16

The critic dataset has a ton of text data about the game. It would make for a nice text mining study.

head(items)
## # A tibble: 6 x 16
##   num_id id    name  category orderable sell_value sell_currency buy_value
##    <dbl> <chr> <chr> <chr>    <lgl>          <dbl> <chr>             <dbl>
## 1     12 3d-g… 3D G… Accesso… NA               122 bells               490
## 2     14 a-tee A Tee Tops     NA               140 bells               560
## 3     17 abst… Abst… Wallpap… TRUE             390 bells              1560
## 4     19 acad… Acad… Dresses  NA               520 bells              2080
## 5     20 acan… Acan… Fossils  FALSE           2000 bells                NA
## 6     21 acce… Acce… Furnitu… TRUE             375 bells              1500
## # … with 8 more variables: buy_currency <chr>, sources <chr>,
## #   customizable <lgl>, recipe <dbl>, recipe_id <chr>, games_id <chr>,
## #   id_full <chr>, image_url <chr>

The items dataset has all the items from the game, how many bells each one is worth (buy/sell), how it can be obtained, and whether it is customizable!

head(user_reviews)
## # A tibble: 6 x 4
##   grade user_name   text                                              date      
##   <dbl> <chr>       <chr>                                             <date>    
## 1     4 mds27272    My gf started playing before me. No option to cr… 2020-03-20
## 2     5 lolo2178    While the game itself is great, really relaxing … 2020-03-20
## 3     0 Roachant    My wife and I were looking forward to playing th… 2020-03-20
## 4     0 Houndf      We need equal values and opportunities for all p… 2020-03-20
## 5     0 ProfessorF… BEWARE!  If you have multiple people in your hou… 2020-03-20
## 6     0 tb726       The limitation of one island per Switch (not per… 2020-03-20

The user_reviews data is similar to the critic data, but it is a lot more informal, since it is just users rating the game. Perhaps I’ll compare the text in the two?

head(villagers)
## # A tibble: 6 x 11
##   row_n id    name  gender species birthday personality song  phrase full_id
##   <dbl> <chr> <chr> <chr>  <chr>   <chr>    <chr>       <chr> <chr>  <chr>  
## 1     2 admi… Admi… male   bird    1-27     cranky      Stee… aye a… villag…
## 2     3 agen… Agen… female squirr… 7-2      peppy       DJ K… sidek… villag…
## 3     4 agnes Agnes female pig     4-21     uchi        K.K.… snuff… villag…
## 4     6 al    Al    male   gorilla 10-18    lazy        Stee… Ayyee… villag…
## 5     7 alfo… Alfo… male   alliga… 6-9      lazy        Fore… it'sa… villag…
## 6     8 alice Alice female koala   8-19     normal      Surf… guvnor villag…
## # … with 1 more variable: url <chr>

Last, but not least, the villagers data contains information about the villagers such as their name, gender, species, birthday, personality, etc.

Exploratory Data Analysis 1

I really want to do a text analysis on the critic data and compare it with the user_reviews data.

So to begin, I’ll get that data in a tidy format.

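library(tidyverse)  # %>%, dplyr verbs, drop_na(), ggplot2
library(tidytext)   # unnest_tokens(), stop_words, get_sentiments()

# One row per word, with common stop words removed
critic_words <- critic %>% 
  unnest_tokens(word, text) %>% 
  anti_join(stop_words)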
## Joining, by = "word"
user_rev_words <- user_reviews %>% 
  unnest_tokens(word, text) %>% 
  anti_join(stop_words)
## Joining, by = "word"
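To make the tokenizing step concrete, here is a minimal sketch on a couple of made-up reviews. The `toy_reviews` tibble and its text are hypothetical stand-ins for the real data; only `unnest_tokens()`, `anti_join()`, and `stop_words` come from the packages used above.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical toy reviews, standing in for the real `text` column
toy_reviews <- tibble(
  grade = c(100, 0),
  text  = c("The game is a relaxing delight",
            "One island per console is a terrible limitation")
)

toy_words <- toy_reviews %>%
  unnest_tokens(word, text) %>%        # one lowercased word per row
  anti_join(stop_words, by = "word")   # drop common words such as "the"

toy_words
```

The other columns (here `grade`) are carried along for every word, which is what makes grouping by review or by score possible later.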

Now that the data is split into words, I can do some analysis. Naturally, it would be interesting to see what the sentiment is in each group. Are the users harsher than the critics?

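# Note: the "nrc" lexicon is downloaded through the textdata package on first use.
# inner_join() keeps only words found in the lexicon, so rows are never
# dropped because of an NA in some unrelated column.
critic_sentiments <- critic_words %>% 
  inner_join(get_sentiments(lexicon = "nrc"), by = "word")

user_sentiments <- user_rev_words %>% 
  inner_join(get_sentiments(lexicon = "nrc"), by = "word")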
#source("https://raw.githubusercontent.com/Levi-Nicklas/ggcute/master/R/animalcrossing.R")
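The lexicon join can be illustrated on toy data. The mini-lexicon below is hypothetical and only mimics the shape of `get_sentiments("nrc")`, where a single word can carry more than one sentiment. It is a sketch of how the join behaves, not the real lexicon.

```r
library(dplyr)
library(tibble)

# Hypothetical mini-lexicon: one row per (word, sentiment) pair,
# so a word can appear more than once
toy_lexicon <- tribble(
  ~word,      ~sentiment,
  "delight",  "joy",
  "delight",  "positive",
  "terrible", "negative"
)

toy_tokens <- tibble(word = c("game", "delight", "terrible", "island"))

# left_join() keeps unmatched words, with NA as their sentiment
left_joined <- left_join(toy_tokens, toy_lexicon, by = "word")

# inner_join() keeps only the words that appear in the lexicon
inner_joined <- inner_join(toy_tokens, toy_lexicon, by = "word")

# Dropping the NA sentiments from the left join gives the same rows
left_filtered <- left_joined %>% filter(!is.na(sentiment))
```

Two things are worth noticing: words missing from the lexicon contribute nothing to the counts, and a word with several sentiments is counted once per sentiment.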

critic_sentiments %>% 
  count(sentiment) %>% 
  ggplot(aes(sentiment, n)) +
  #geom_col(color = "black", fill = animalcrossing_colours["sky_blue"]) +
  geom_col(color = "black", fill = "light blue") +
  #theme_animalcrossing() +
  theme_light()+
  coord_flip() +
  labs(title = "Critic Sentiments")

user_sentiments %>% 
  count(sentiment) %>% 
  ggplot(aes(sentiment, n)) +
  #geom_col(color = "black", fill = animalcrossing_colours["sky_blue"]) +
  geom_col(color = "black", fill = "light blue") +
  #theme_animalcrossing() +
  theme_light()+
  coord_flip() +
  labs(title = "User Sentiments")
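Raw counts are hard to compare across the two plots when one group contributes far more words than the other, so one option is to compare within-group shares instead. The sketch below uses hypothetical toy tables shaped like `critic_sentiments` and `user_sentiments`; the numbers are invented for illustration and say nothing about the real reviews.

```r
library(dplyr)
library(tibble)

# Hypothetical per-word sentiment tables (made-up values)
toy_critic <- tibble(sentiment = c("joy", "positive", "positive", "negative"))
toy_user   <- tibble(sentiment = c("negative", "negative", "anger", "positive",
                                   "negative", "positive", "joy", "negative"))

sentiment_share <- bind_rows(critic = toy_critic, user = toy_user,
                             .id = "source") %>%
  count(source, sentiment) %>%
  group_by(source) %>%
  mutate(share = n / sum(n)) %>%   # proportion within each source
  ungroup()

sentiment_share
```

With the real data, `share` could replace `n` on the plot axis, with `source` mapped to fill or used as a facet, so that critics and users sit on the same scale.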

Levi C. Nicklas
Data Scientist

Graduate Student, Researcher, and Data Scientist.
