This page shows interactive plots made with Plotly.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.4     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(p8105.datasets)

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout

Here, I use the instacart data from the p8105.datasets package.

instacart = 
  instacart %>% 
  mutate(
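    # shift order_dow by 1 so the values run from 1 to 7, as wday() expects
    day = order_dow + 1,
    # "English_United States" is a Windows locale name; other platforms may need a different one
    day_of_week = lubridate::wday(day, label = TRUE, locale = "English_United States")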
    ) %>%
  select(order_id, product_id, order_number, day_of_week, order_hour_of_day, product_name, aisle, department) %>% 
  drop_na() 
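
The `locale` string above is a Windows-style name, so the call may not be recognised on macOS or Linux. As a minimal, locale-independent sketch, the weekday labels can be assigned by hand. This assumes `order_dow` codes days as 0 = Sunday through 6 = Saturday, which matches the `+ 1` shift above but is not confirmed by anything on this page. `toy` and `day_labels` are hypothetical names used only for illustration.

```r
library(tidyverse)

# Hypothetical toy data standing in for instacart
toy = tibble(order_dow = c(0, 3, 6))

# Assumed coding: 0 = Sunday, ..., 6 = Saturday
day_labels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

toy =
  toy %>%
  mutate(
    # R vectors are 1-indexed, so shift the 0-6 codes by one
    day_of_week = factor(day_labels[order_dow + 1], levels = day_labels, ordered = TRUE)
  )
```

Because the labels are written out explicitly, the result does not depend on the machine's locale settings.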

Make a line plot.

First, we can check how the number of orders in each department changes over the hours of the day.

instacart %>%
  count(order_hour_of_day, department) %>%
  plot_ly(
    x = ~order_hour_of_day, y = ~n, type = "scatter", mode = "lines", color = ~department, colors = "viridis"
  ) %>%
  layout(title = "Number of orders during the day")

Plotly boxplot

Next, we can use boxplots to compare the distribution of order_number across the days of the week. Since the instacart dataset is too large to plot in full, we take a random sample of 4000 rows.

instacart %>% 
  sample_n(4000) %>% 
  plot_ly(
    y = ~order_number, color = ~day_of_week, type = "box", colors = "viridis"
  ) %>%
  layout(title = "Order number distribution during the week")
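
Because `sample_n()` draws rows at random, the boxplot will differ slightly each time the page is rebuilt. One way to make the sample reproducible is to fix the random seed first. The sketch below uses a hypothetical toy tibble in place of instacart, and the seed value 1 is an arbitrary choice.

```r
library(tidyverse)

# Hypothetical toy data standing in for instacart
toy = tibble(order_number = 1:100)

set.seed(1)  # any fixed integer works; 1 is arbitrary
first_draw = toy %>% sample_n(10)

set.seed(1)  # resetting the seed reproduces the same draw
second_draw = toy %>% sample_n(10)

same_rows = identical(first_draw, second_draw)
```

With the seed reset before each call, both draws contain the same rows, so a plot built from the sample would stay the same between runs.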

Bar chart

Finally, which aisles are popular? Let's compare the number of orders from different aisles by making a bar chart. Here, we only look at aisles with more than 20,000 orders.

instacart %>% 
  group_by(aisle) %>% 
  summarise(num_order = n()) %>% 
  filter(num_order > 20000) %>% 
  mutate(aisle = reorder(aisle, num_order)) %>% 
  plot_ly(
    x = ~aisle, y = ~num_order, color = ~aisle, type = "bar", colors = "viridis"
  ) %>% 
  layout(title = "Number of orders of different aisles")
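
As a side note, the `group_by()` plus `summarise(n())` step can also be written with `count()`, which the line plot above already uses. A small sketch on hypothetical toy data (the aisle values here are made up for illustration):

```r
library(tidyverse)

# Hypothetical toy data standing in for instacart
toy = tibble(aisle = c("fresh fruits", "fresh fruits", "yogurt"))

# Two-step version, as in the bar chart code
by_group =
  toy %>%
  group_by(aisle) %>%
  summarise(num_order = n())

# One-step version; `name` sets the name of the count column
by_count = toy %>% count(aisle, name = "num_order")
```

Both versions give one row per aisle with its row count in `num_order`, so either can feed the same `filter()` and `plot_ly()` steps.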