Suppose I want to calculate the proportion of different values within each group. For example, using the mtcars
data, how do I calculate the relative frequency of the number of gears by am (automatic/manual) in one go with dplyr?
library(dplyr)
data(mtcars)
mtcars <- as_tibble(mtcars)  # tbl_df() is deprecated in current dplyr
# count frequency
mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n())
# am gear n
# 0 3 15
# 0 4 4
# 1 4 8
# 1 5 5
What I would like to achieve:
am gear n rel.freq
0 3 15 0.7894737
0 4 4 0.2105263
1 4 8 0.6153846
1 5 5 0.3846154
Best Answer
Try this:
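A minimal sketch, consistent with the explanation that follows: count each (am, gear) combination with summarise(), then divide each count by the total for its am group in a mutate() step. The column name rel.freq is taken from the question; the result name rel_freq is only illustrative.

```r
library(dplyr)

rel_freq <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%            # one row per (am, gear); result stays grouped by am
  mutate(rel.freq = n / sum(n))     # sum(n) is computed within each am group

rel_freq
```

Recent dplyr versions print a message about the resulting grouping after summarise(); it is informational and does not affect the result.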
From the dplyr vignette: when you group by multiple variables, each summary peels off one level of the grouping.
Thus, after the summarise(), the last grouping variable specified in group_by(), 'gear', is peeled off. In the mutate() step, the data is grouped by the remaining grouping variable(s), here 'am'. You may check the grouping at each step with groups().
The outcome of the peeling of course depends on the order of the grouping variables in the group_by() call. You may wish to add a subsequent group_by(am) to make your code more explicit.
For rounding and prettification, please refer to the nice answer by @Tyler Rinker.
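The grouping behaviour described above can be checked directly. The sketch below inspects the grouping after summarise(), regroups explicitly by am, and then reverses the order of the grouping variables to show that the proportions change. It uses group_vars(), which returns the grouping columns as a character vector, as a convenient alternative to groups(); the object names counts, explicit and by_gear are illustrative.

```r
library(dplyr)

counts <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n())

# After summarise(), 'gear' has been peeled off and only 'am' remains
group_vars(counts)

# Regrouping explicitly gives the same result but states the intent
explicit <- counts %>%
  group_by(am) %>%
  mutate(rel.freq = n / sum(n))

# With the order reversed, 'am' is peeled off instead,
# so the proportions are computed within each value of gear
by_gear <- mtcars %>%
  group_by(gear, am) %>%
  summarise(n = n()) %>%
  mutate(rel.freq = n / sum(n))
```

In explicit the rel.freq values sum to 1 within each am, whereas in by_gear they sum to 1 within each gear, so the order passed to group_by() decides which question is being answered.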