Evaluating (x + x * ++x)
One night in Java class, the instructor wanted to demonstrate the trouble that can be caused by referencing a variable more than once in an expression that pre/post-increments/decrements it:
x = 3; y = x + x * ++x; System.out.println("x = "+x+"; x + x * ++x = "+y+"\n");Having always had the good sense to not try something like that, I simply applied the evaluation theory I had learned a couple decades prior and came up with an answer, 15. The instructor, however, came up with an answer of 20. But then he ran that Java program and was surprised when it came up with my answer (which I had not shared with him at the time, but shared with him afterwards via email).
It turns out that while there is a standard way specified to evaluate that expression when humans read it, the code that the compiler generates to evaluate it is implementation dependent and those implementation dependencies can change the results unexpectedly. And as my tests revealed, different languages and different types of computers will indeed generate different results, albeit in two different ways in this case. The bottom line and basic lesson from this test is: don't use the same variable more than once in an expression where you increment or decrement it. There can be a tendency to pack as much as you can in one line of code, especially in C/C++, but doing so could expose you to situations like this.
During the week after that class, I got to wondering about it so I wrote test programs in every language I had access to. I compiled my results and came to realize that the value of y depended very much on how the intermediate values of x were stored and retrieved. An assembly listing of the C code and reference to the Java virtual-machine specification confirmed my suspicions.
So I wrote up my results as posted below.
Sir: I've been testing that problem we had trouble with in last week's Java class (Friday night): x = 3; y = x + x * ++x; System.out.println("x = "+x+"; x + x * ++x = "+y+"\n"); You were expecting y to be 20, whereas the Java program gave it as 15. I have tried the same code in a variety of different C-like languages with the following outputs: C (DOS program generated with Visual C++ v1.52): for x = 3; x + x * ++x = 20 C (gcc under Linux): for x = 3; x + x * ++x = 20 Perl: for x = 3; x + x * ++x = 20 AWK: for x = 3; x + x * ++x = 20 However ... Java: for x = 3; x + x * ++x = 15 JavaScript: for x = 3; x + x * ++x = 15 PocketC (on Palm Pilot): for x = 3; x + x * ++x = 15 It turns out that the result all depends on how the language retrieves the variable values. Under Visual C++, I got the following exerpt from the assembly listing: ; y = fffc ; x = fffa ; volatile int x; ; Line 6 ; int y; ; Line 7 ; ; x = 3; ; Line 9 *** 00000d c7 46 fa 03 00 mov WORD PTR -6[bp],OFFSET 3 ; y = x + x * ++x; ; Line 10 *** 000012 83 46 fa 01 add WORD PTR -6[bp],OFFSET 1 *** 000016 8b 46 fa mov ax,WORD PTR -6[bp] *** 000019 f7 6e fa imul WORD PTR -6[bp] *** 00001c 03 46 fa add ax,WORD PTR -6[bp] *** 00001f 89 46 fc mov WORD PTR -4[bp],ax Every reference to the x variable goes to that memory location (this turns out to be the crucial point). The first operation performed on Line 10 is the pre-increment, which changes the value stored in x. After that point, the modified value for x is retrieved from that memory location for all the other references to x: x + x * ++x = 4 + 4 * 4 = 4 + 16 = 20 Both C compilers generated native Intel code, as I assume that both Perl and AWK had done as well. Intel code does not normally perform arithmetic operations on a stack, but rather with registers and memory locations, as shown above. Furthermore, it fetches the values from memory in the order in which they are used, in accordance with the precedence of the operators. On the other hand, Java and JavaScript are not bound to a specific processor. From what I have read about the Java Virtual Machine, it is very stack-oriented and does perform arithmetic operations on a stack. That is in agreement with the theory I had learned years ago, which was not Intel-based (the 8080A had just become popular). Here is what I understand that sequence of operations to have been in Java: y = x + x * ++x; --<convert to post-fix>--> y (x) (x) (x) (++) * + = 1. Fetch the first instance of x. Read x from memory and push the value onto the stack. Stack == [addr(y),3] (top of stack is to the right) 2. Fetch the second instance of x. Read x from memory and push the value onto the stack. Stack == [addr(y),3,3] (top of stack is to the right) 3. Fetch the third instance of x. Read x from memory and push the value onto the stack. Stack == [addr(y),3,3,3] (top of stack is to the right) 4. Perform the pre-increment. Pop the top value, increment it, and push the result. Also update x's memory location. Stack == [addr(y),3,3,4] (top of stack is to the right) 5. Perform the multiplication. Pop the top two values, multiply them together, and push the result. Stack == [addr(y),3,12] (top of stack is to the right) 5. Perform the addition. Pop the top two values, add them together, and push the result. Stack == [addr(y),15] (top of stack is to the right) 6. Perform the assignment operation. Pop the stack and write the value to y's memory location. Stack == [] (ie, empty) In standard post-fix evaluation, the values are read from the variables in strict left-to-right order and pushed onto the stack or combined with the top of stack if an operation is performed. Therefore, the first references to x were read before the increment was ever performed. On the other hand, the Intel-based code handled each reference to x by going back to the memory location even after the increment. As we see from the PocketC test, a C program written on a stack-oriented platform yielded the same results as the Java program had. That would mean that C code can compile and perform differently on different platforms (variations in the definition of integer types is well-known). But since Java is designed to be platform-independent -- or rather dependent on a standard virtual machine --, I would expect Java programs to run more consistently across different platforms. Once you know what to expect from a Java program, you should be able to expect the same results when you run it on any platform. Though the real lesson is as you said: always test your code before you make pronouncements about it. ================================================= LISTINGS: test.c: #include <stdlib.h> #include <stdio.h> int main(void) { int x; int y; x = 3; y = x + x * ++x; printf("x = %d; x + x * ++x = %d\n",x,y); return 0; } ----------------- test.pl #! perl -w # file: test.pl my $x = 3; my $y = $x + $x * ++$x; print "x = $x; x + x * ++x = $y\n"; ## END ## ----------------- test.awk BEGIN{ x = 3; y = x + x * ++x;} END{print("x = " x "; x + x * ++x = "y);} ----------------- Test.java public class Test { public static void main(String args[]) { int x; int y; x = 3; y = x + x * ++x; System.out.println("x = "+x+"; x + x * ++x = "+y+"\n"); System.exit(0); } } ----------------- test.html <HTML> <HEAD> <SCRIPT> function doIt() { // int x; // int y; x = 3; y = x + x * ++x; document.write("x = "+x+"; x + x * ++x = "+y); } </SCRIPT> </HEAD> <BODY> <SCRIPT> doIt(); </SCRIPT> </BODY> </HTML> =================================================
Return to Top of Page
Return to My Programming Home Page
Share and enjoy!
First uploaded on 2008 February 08.
Updated 2020 January 19