This page shows interactive plots built with Plotly.
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.4 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(p8105.datasets)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
Here, I use the instacart dataset from the p8105.datasets package.
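instacart =
  instacart %>%
  mutate(
    # order_dow is 0-based; shift it to 1-7 so wday() can label it
    day = order_dow + 1,
    # Windows-style locale name; other platforms may need e.g. "en_US.UTF-8"
    day_of_week = lubridate::wday(day, label = TRUE, locale = "English_United States")
  ) %>%
  select(order_id, product_id, order_number, day_of_week, order_hour_of_day, product_name, aisle, department) %>%
  drop_na()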
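To make the day-of-week step above concrete, here is a minimal sketch on a toy vector rather than the real data. It assumes order_dow runs from 0 to 6 with 0 meaning Sunday, which the code implies but the page does not state; toy_dow and toy_labels are made-up names. The locale argument is left out, so the labels follow the session locale.

```r
library(lubridate)

# Hypothetical stand-in for instacart$order_dow (assumed range 0-6)
toy_dow <- 0:6

# With the default week start, wday() reads 1 as Sunday, so adding 1
# turns the 0-based index into a labelled, ordered factor
toy_labels <- wday(toy_dow + 1, label = TRUE)

levels(toy_labels)
```

Because the result is an ordered factor, plots that colour or group by it keep the days in calendar order instead of alphabetical order.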
First, we look at the number of orders from each department at each hour of the day.
instacart %>%
  count(order_hour_of_day, department) %>%
  plot_ly(
    x = ~order_hour_of_day, y = ~n, type = "scatter", mode = "lines",
    color = ~department, colors = "viridis"
  ) %>%
  layout(title = "Number of Orders during the day")
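The count() step that feeds the line plot above can be sketched on a small made-up table. The toy values and the name hourly are illustrative only. Note that count() tallies rows, so if each row of instacart is one product within an order (an assumption about the dataset), n is a count of items rather than of distinct orders.

```r
library(dplyr)

# Hypothetical rows: one per item, with the hour and department recorded
toy <- data.frame(
  order_hour_of_day = c(9, 9, 9, 10, 10),
  department = c("produce", "produce", "dairy", "dairy", "dairy")
)

# One output row per (hour, department) pair; n is the number of input rows
hourly <- toy %>%
  count(order_hour_of_day, department)

hourly
```

Each department then becomes one line in the plot, with order_hour_of_day on the x-axis and n on the y-axis.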
Next, we use boxplots to compare the distribution of order_number across the days of the week. Because the instacart dataset is large, we plot a random sample of 4000 rows.
instacart %>%
  sample_n(4000) %>%
  plot_ly(
    y = ~order_number, color = ~day_of_week, type = "box", colors = "viridis"
  ) %>%
  layout(title = "Order number distribution during the week")
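Since sample_n() draws rows at random, the boxplot above changes slightly on every run. A minimal sketch of one way to make the sample repeatable is to fix the random seed first. The seed value 1 and the toy data frame are arbitrary choices for illustration, not part of the original analysis.

```r
library(dplyr)

# Hypothetical stand-in for the large dataset
toy <- data.frame(order_number = 1:100)

# Fixing the seed before sampling makes the draw repeatable
set.seed(1)
first_draw <- toy %>% sample_n(10)

set.seed(1)
second_draw <- toy %>% sample_n(10)

# Compare the two draws row by row
identical(first_draw, second_draw)
```

In the real pipeline, the same idea would mean calling set.seed() once before the sample_n(4000) step.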
Finally, which aisles are popular? Let's compare the number of orders from different aisles with a bar chart. Here, we only look at aisles with more than 20000 orders.
instacart %>%
  group_by(aisle) %>%
  summarise(num_order = n()) %>%
  filter(num_order > 20000) %>%
  mutate(aisle = reorder(aisle, num_order)) %>%
  plot_ly(
    x = ~aisle, y = ~num_order, color = ~aisle, type = "bar", colors = "viridis"
  ) %>%
  layout(title = "Number of orders of different aisles")
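The reorder() call in the chunk above is what sorts the bars from the least to the most popular aisle. Here is a small sketch of that effect on made-up counts; the aisle names and numbers are illustrative, not taken from the data.

```r
library(dplyr)

# Hypothetical summary table, deliberately not sorted by count
toy <- data.frame(
  aisle = c("yogurt", "fresh fruits", "water"),
  num_order = c(30, 50, 10)
)

# reorder() turns aisle into a factor whose levels follow increasing num_order
sorted <- toy %>%
  mutate(aisle = reorder(aisle, num_order))

levels(sorted$aisle)
```

Plotly places categories on the x-axis in factor-level order, so without this step the bars would not be ordered by count.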