Thomas Lumley answered this in a superb post on r-help the other day. <<- is about the enclosing environment, so you can do things like this (and again, I quote his post from April 22 in this thread):
make.accumulator <- function() {
  a <- 0
  function(x) {
    a <<- a + x
    a
  }
}
> f<-make.accumulator()
> f(1)
[1] 1
> f(1)
[1] 2
> f(11)
[1] 13
> f(11)
[1] 24
This is a legitimate use of <<- as "super-assignment" with lexical scope, and not simply a way to assign in the global environment. For that, Thomas has these choice words:
The Evil and Wrong use is to modify
variables in the global environment.
Very good advice.
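To make the point about the enclosing environment concrete, here is a minimal sketch that reuses make.accumulator() from the quoted post. The names g, f_total and g_total are introduced only for illustration. Each call to make.accumulator() creates its own environment, and <<- updates the a found there instead of a global variable.

```r
# Same accumulator as in the quoted post
make.accumulator <- function() {
  a <- 0
  function(x) {
    a <<- a + x   # updates `a` in the enclosing function's environment
    a
  }
}

f <- make.accumulator()
g <- make.accumulator()   # a second, independent accumulator

f(1)
f(10)
g(5)

# Each closure carries its own copy of `a`
f_total <- get("a", envir = environment(f))
g_total <- get("a", envir = environment(g))
```

Here f_total is 11 and g_total is 5: the two running totals do not interfere with each other, because each lives in the environment of its own closure.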
It depends on context as to what = means. == is always for testing equality.
= can be
1. in most cases used as a drop-in replacement for <-, the assignment operator.
> x = 10
> x
[1] 10
2. used as the separator for key-value pairs used to assign values to arguments in function calls.
rnorm(n = 10, mean = 5, sd = 2)
Because of the second use above, = can't be used as a drop-in replacement for <- in all situations. Consider
> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10
Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.
== is always used for equality testing:
> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
Do be careful with == too, however, as it really means exactly equal to, and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':
> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE
where all.equal() tests for equality allowing for a little bit of fuzziness due to the loss of precision in floating point operations.
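One detail worth spelling out: when the values differ, all.equal() returns a character description of the difference, not FALSE, which is why the help page wraps it in identical(..., TRUE). A minimal sketch of the same idea using isTRUE() follows; the variable names are only for illustration.

```r
x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1

# Exact comparison: FALSE on most machines because of rounding error
exact <- x1 == x2

# Comparison within a small tolerance
approx <- isTRUE(all.equal(x1, x2))

# For clearly different values all.equal() returns a message, not FALSE,
# so isTRUE() turns the result into a plain logical
far <- isTRUE(all.equal(1, 1.1))

# The tolerance can be widened explicitly
far_loose <- isTRUE(all.equal(1, 1.1, tolerance = 0.2))
```

With the default tolerance, approx is TRUE and far is FALSE. Widening the tolerance to 0.2 makes far_loose TRUE.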
Best Answer
The difference in assignment operators is clearer when you use them to set an argument value in a function call. For example:
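A minimal sketch of the two cases, assuming median() as the function being called (any function with an argument named x would do). The scratch environment is a hypothetical stand-in for the user workspace, so that the sketch does not touch your own variables.

```r
scratch <- new.env()   # stand-in for the user workspace

# Case 1: `=` names the argument; nothing is created in the workspace
m1 <- evalq(median(x = 1:10), scratch)
after_equals <- exists("x", envir = scratch, inherits = FALSE)

# Case 2: `<-` assigns in the calling environment, then passes the value on
m2 <- evalq(median(x <- 1:10), scratch)
after_arrow <- exists("x", envir = scratch, inherits = FALSE)
```

Both calls return the same median, but only the second leaves an x behind: after_equals is FALSE and after_arrow is TRUE.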
When the value is passed with =, x is declared within the scope of the function, so it does not exist in the user workspace.
When the value is passed with <- instead, x is declared in the user workspace, so you can use it after the function call has been completed.
There is a general preference among the R community for using <- for assignment (other than in function signatures) for compatibility with (very) old versions of S-Plus. Note that the spaces help to clarify situations like x<-3, which assigns 3 to x but could be misread as the comparison x < -3.
Most R IDEs have keyboard shortcuts to make <- easier to type. Ctrl + = in Architect, Alt + - in RStudio (Option + - under macOS), Shift + - (underscore) in emacs+ESS.
If you prefer writing = to <- but want to use the more common assignment symbol for publicly released code (on CRAN, for example), then you can use one of the tidy_* functions in the formatR package to automatically replace = with <-.
The answer to the question "Why does x <- y = 5 throw an error but not x <- y <- 5?" is "It's down to the magic contained in the parser". R's syntax contains many ambiguous cases that have to be resolved one way or another. The parser chooses to resolve the bits of the expression in different orders depending on whether = or <- was used.
To understand what is happening, you need to know that assignment silently returns the value that was assigned. You can see that more clearly by explicitly printing, for example print(x <- 2 + 3).
Secondly, it's clearer if we use prefix notation for assignment. So x <- 5 is the same as `<-`(x, 5), and y = 5 is the same as `=`(y, 5).
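Both points (assignment returns its value invisibly, and infix assignment is just a call to a function named `<-` or `=`) can be checked directly. This is a small sketch; the names res and same_call are only for illustration.

```r
# Assignment returns the assigned value, but marks it as invisible
res <- withVisible(x <- 2 + 3)
res$value     # the value that was assigned
res$visible   # FALSE, which is why nothing is printed at the console

# Prefix notation: the infix operators are ordinary function calls
`<-`(x, 5)    # same as x <- 5
`=`(y, 5)     # same as y = 5

# The parser produces the same call for the infix and prefix forms
same_call <- identical(quote(x <- 5), quote(`<-`(x, 5)))
```

After running this, x and y are both 5 and same_call is TRUE.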
The parser interprets x <- y <- 5 as `<-`(x, `<-`(y, 5)).
We might expect that x <- y = 5 would then be `<-`(x, `=`(y, 5)), but actually it gets interpreted as `=`(`<-`(x, y), 5).
This is because = is lower precedence than <-, as shown on the ?Syntax help page.
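You can inspect how the parser groups the two expressions without running them. This is a minimal sketch; chained, mixed and result are hypothetical names. parse(text = ...) is used for the mixed expression because = inside a function call such as quote() would be read as argument naming.

```r
chained <- quote(x <- y <- 5)
mixed   <- parse(text = "x <- y = 5")[[1]]

# x <- y <- 5: the outer call is `<-`, and its right-hand side is y <- 5
chained_op  <- as.character(chained[[1]])
chained_rhs <- deparse(chained[[3]])

# x <- y = 5: the outer call is `=`, and its left-hand side is x <- y
mixed_op  <- as.character(mixed[[1]])
mixed_lhs <- deparse(mixed[[2]])

# Evaluating the mixed form fails, because x <- y is not a valid
# target for assignment
result <- tryCatch(eval(mixed, new.env()), error = function(e) e)
is_error <- inherits(result, "error")
```

The outermost operator is `<-` for the chained form and `=` for the mixed form, which matches the precedence order listed in ?Syntax.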