I want to use dplyr::mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated.
Example data from iris:
library(dplyr)
iris <- as_tibble(iris)
I've created a function to mutate my new columns from the Petal.Width variable:
multipetal <- function(df, n) {
    varname <- paste("petal", n, sep = ".")
    df <- mutate(df, varname = Petal.Width * n)  ## problem arises here
    df
}
Now I create a loop to build my columns:
for (i in 2:5) {
    iris <- multipetal(df = iris, n = i)
}
However, since mutate thinks varname is a literal variable name, the loop only creates one new variable (called varname) instead of four (called petal.2 – petal.5).
How can I get mutate() to use my dynamic name as the variable name?
Best Answer
Since you are dynamically building a variable name as a character value, it makes more sense to do assignment using standard data.frame indexing which allows for character values for column names. For example:
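A minimal sketch of that indexing approach, assuming the built-in iris data and reusing the multipetal() name from the question (the iris_out variable is only illustrative):

```r
# Sketch: assign the new column with [[<-, which accepts a character name.
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")  # e.g. "petal.2"
  df[[varname]] <- df$Petal.Width * n      # the string is used as the column name
  df
}

iris_out <- iris  # built-in data set
for (i in 2:5) {
  iris_out <- multipetal(iris_out, n = i)
}
names(iris_out)
```

No dplyr is needed for this variant, which is part of its appeal when the name is already a string.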
The mutate function makes it very easy to name new columns via named parameters. But that assumes you know the name when you type the command. If you want to dynamically specify the column name, then you need to also build the named argument.
dplyr version >= 1.0
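As a rough sketch of what this can look like with dplyr 1.0 or later: the function bodies, the mean_of_col() helper, and the "mean_{{col}}" name pattern below are illustrative assumptions, not taken from the original post.

```r
library(dplyr)

# Sketch: glue-style name on the left-hand side of := (assumes dplyr >= 1.0).
multipetal <- function(df, n) {
  mutate(df, "petal.{n}" := Petal.Width * n)  # {n} is replaced by the value of n
}

iris_out <- as_tibble(iris)
for (i in 2:5) {
  iris_out <- multipetal(iris_out, n = i)
}
names(iris_out)

# Hypothetical helper: the caller passes a bare column name, which is
# embraced with {{ }} both in the name string and in the expression.
mean_of_col <- function(df, col) {
  mutate(df, "mean_{{col}}" := mean({{ col }}))
}

iris_mean <- mean_of_col(as_tibble(iris), Petal.Width)
names(iris_mean)
```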
With the latest dplyr version you can use the syntax from the glue package when naming parameters with :=. So here the {} in the name grabs the value by evaluating the expression inside.
If you are passing a column name to your function, you can use {{}} in the string as well as for the column name.
dplyr version >= 0.7
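A minimal sketch for dplyr 0.7 or later, assuming the question's multipetal() is rewritten so that the name stored in varname is unquoted with !! on the left-hand side of := (iris_out is an illustrative variable):

```r
library(dplyr)

# Sketch: !! unquotes the string in varname so it becomes the column name.
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  mutate(df, !!varname := Petal.Width * n)
}

iris_out <- as_tibble(iris)
for (i in 2:5) {
  iris_out <- multipetal(iris_out, n = i)
}
names(iris_out)
```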
dplyr starting with version 0.7 allows you to use := to dynamically assign parameter names, so the function can be written with the dynamic name on the left-hand side of :=.
For more information, see the documentation available from vignette("programming", "dplyr").
dplyr (>=0.3 & <0.7)
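One way this might have looked on those older versions, assuming the lazyeval package is installed. mutate_() is deprecated in current dplyr and may emit warnings or be unavailable in recent releases, so treat this as a legacy sketch only:

```r
library(dplyr)

# Legacy sketch for dplyr >= 0.3 and < 0.7 (standard-evaluation verbs).
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  # Build the expression as a formula, substituting the current value of n.
  varval <- lazyeval::interp(~ Petal.Width * n, n = n)
  # .dots takes a named list; the names become the new column names.
  mutate_(df, .dots = setNames(list(varval), varname))
}

iris_out <- iris
for (i in 2:5) {
  iris_out <- suppressWarnings(multipetal(iris_out, n = i))
}
names(iris_out)
```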
Slightly earlier versions of dplyr (>=0.3 <0.7) encouraged the use of "standard evaluation" alternatives to many of the functions. See the Non-standard evaluation vignette for more information (vignette("nse")).
So here, the answer is to use mutate_() rather than mutate().
dplyr < 0.3
Note this is also possible in older versions of dplyr that existed when the question was originally posed. It requires careful use of quote and setNames:
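A sketch of that approach, built only from quote(), setNames(), and do.call(). The variable names pp and iris_out are illustrative:

```r
library(dplyr)

# Sketch: assemble the argument list by hand, then call mutate() via do.call().
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  # First element: the symbol df. Second element: the unevaluated
  # expression, named with the dynamic column name.
  pp <- c(quote(df), setNames(list(quote(Petal.Width * n)), varname))
  # Equivalent to calling mutate(df, <varname> = Petal.Width * n) from here.
  do.call("mutate", pp)
}

iris_out <- iris
for (i in 2:5) {
  iris_out <- multipetal(iris_out, n = i)
}
names(iris_out)
```

Quoting df and the expression keeps them unevaluated until mutate() runs, which is why the column name can be supplied as an ordinary list name.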