On my Windows 8 laptop with HotSpot JDK 1.7.0_45 (all compiler/VM options left at their defaults), the loop below
final int n = Integer.MAX_VALUE;
int i = 0;
while (++i < n) {
}
is at least 2 orders of magnitude faster (~10 ms vs. ~5000 ms) than:
final int n = Integer.MAX_VALUE;
int i = 0;
while (i++ < n) {
}
I happened to notice this while writing a loop to evaluate another, unrelated performance issue. The difference between ++i < n and i++ < n was large enough to significantly influence the result.
If we look at the bytecode, the loop body of the faster version is:
iinc
iload
ldc
if_icmplt
And for the slower version:
iload
iinc
ldc
if_icmplt
So ++i < n first increments the local variable i by 1 and then pushes it onto the operand stack, while i++ < n does those two steps in reverse order. But that doesn't seem to explain why the former is so much faster. Is there a temporary copy involved in the latter case? Or is something beyond the bytecode (VM implementation, hardware, etc.) responsible for the performance difference?
I've read some other discussions regarding ++i and i++ (not exhaustively, though), but didn't find any answer that is Java-specific and directly related to the case where ++i or i++ is involved in a value comparison.
Best Answer
As others have pointed out, the test is flawed in many ways.
You did not tell us exactly how you did this test. However, I tried to implement a "naive" test (no offense) like this:
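A naive test of this kind could look roughly like the following sketch. The class name, the method names, the number of runs, and the optional command-line limit are illustrative assumptions, not the code that was actually measured.

```java
public class NaiveIncrementTest {

    // Hypothetical helper: times an empty pre-increment loop and
    // returns the elapsed time in nanoseconds.
    static long timePreIncrement(final int n) {
        long before = System.nanoTime();
        int i = 0;
        while (++i < n) {
        }
        return System.nanoTime() - before;
    }

    // Hypothetical helper: same loop, but with post-increment.
    static long timePostIncrement(final int n) {
        long before = System.nanoTime();
        int i = 0;
        while (i++ < n) {
        }
        return System.nanoTime() - before;
    }

    public static void main(String[] args) {
        // The question used Integer.MAX_VALUE as the limit; a smaller
        // limit can be passed as the first argument for a quicker run.
        final int n = args.length > 0 ? Integer.parseInt(args[0]) : Integer.MAX_VALUE;
        for (int run = 0; run < 3; run++) {
            System.out.printf("pre : %.3f ms%n", timePreIncrement(n) / 1e6);
            System.out.printf("post: %.3f ms%n", timePostIncrement(n) / 1e6);
        }
    }
}
```

Note that neither loop produces a value that is used afterwards, which is exactly the kind of flaw discussed in this answer.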
When running this with default settings, there seems to be a small difference. But the real flaw of the benchmark becomes obvious when you run it with the -server flag. The results in my case are then along the lines of:
Obviously, the pre-increment version has been completely optimized away. The reason is rather simple: the result is not used. It does not matter at all whether the loop is executed or not, so the JIT simply removes it.
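One common way to make it harder for the JIT to discard such a loop is to actually use its result. The sketch below is an assumption-laden illustration, not part of the original test: the class name, the volatile sink field, and the reduced limit are all made up for the example. Even then the JIT may still work out the exit value without iterating, so this only removes the "result is unused" flaw; a harness such as JMH is the usual choice for serious measurements.

```java
public class IncrementWithSink {

    // Hypothetical sink: writing the loop result to a volatile field
    // makes the result observable, so it is no longer "unused".
    static volatile int sink;

    // Returns the value of i when the pre-increment loop exits.
    static int preIncrement(final int n) {
        int i = 0;
        while (++i < n) {
        }
        return i;
    }

    // Returns the value of i when the post-increment loop exits.
    static int postIncrement(final int n) {
        int i = 0;
        while (i++ < n) {
        }
        return i;
    }

    public static void main(String[] args) {
        // Assumed limit, smaller than Integer.MAX_VALUE to keep the run short.
        final int n = 100_000_000;

        long t0 = System.nanoTime();
        sink = preIncrement(n);
        long t1 = System.nanoTime();
        sink = postIncrement(n);
        long t2 = System.nanoTime();

        System.out.printf("pre : %.3f ms%n", (t1 - t0) / 1e6);
        System.out.printf("post: %.3f ms%n", (t2 - t1) / 1e6);
    }
}
```

For a positive limit n, the pre-increment loop exits with i equal to n, while the post-increment loop exits with i equal to n + 1, because the increment still happens after the final, failing comparison.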
This is confirmed by a look at the HotSpot disassembly. The pre-increment version results in this code:
The post-increment version results in this code:
It's not entirely clear to me why it seemingly does not remove the post-increment version. (In fact, I am considering asking this as a separate question.) But at least this explains why you might see differences of "orders of magnitude"...
EDIT: Interestingly, when changing the upper limit of the loop from Integer.MAX_VALUE to Integer.MAX_VALUE-1, both versions are optimized away and require "zero" time. Somehow this limit (which still appears as 0x7fffffff in the assembly) prevents the optimization. Presumably, this has something to do with the comparison being mapped to a (signed!) cmp instruction, but I cannot give a profound reason beyond that. The JIT works in mysterious ways...
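One language-level difference between the two limits can at least be observed directly: with Integer.MAX_VALUE as the limit, the post-increment loop increments i once more after the last comparison, so i wraps around to Integer.MIN_VALUE. Whether this wrap-around is what blocks the optimization is an assumption, not something the disassembly above proves. The following sketch (class and method names are hypothetical) starts close to the limit so that it finishes immediately:

```java
public class LoopExitValues {

    // Value of i after the pre-increment loop exits.
    static int exitValuePre(final int start, final int n) {
        int i = start;
        while (++i < n) {
        }
        return i;
    }

    // Value of i after the post-increment loop exits.
    static int exitValuePost(final int start, final int n) {
        int i = start;
        while (i++ < n) {
        }
        return i;
    }

    public static void main(String[] args) {
        // Start three steps below the limit instead of at 0 to avoid
        // iterating roughly two billion times.
        final int start = Integer.MAX_VALUE - 3;

        System.out.println(exitValuePre(start, Integer.MAX_VALUE));      // 2147483647
        System.out.println(exitValuePost(start, Integer.MAX_VALUE));     // -2147483648 (wrapped around)
        System.out.println(exitValuePost(start, Integer.MAX_VALUE - 1)); // 2147483647 (no wrap-around)
    }
}
```

With the limit lowered to Integer.MAX_VALUE-1, neither version overflows, which matches the observation that both loops are then treated the same way.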