A smarter brain than mine at work pointed out this interesting article to me.
Java String Performance testing
It talks about string performance testing. I could make sense of most of the observations made in this test and thought I would share them.
In short, the test compares string operations: “+” on literals, “new String” followed by “+”, “+” on fields, and StringBuffer append. One of the conclusions of the test is
the ‘+’ operator is not evil when used with Strings
I agree with this statement, and a look at the generated bytecode is pretty convincing. But a word of caution: it always makes sense to check this on our target compiler before relying on it.
Running through the examples, let's examine the first test, “allPlusses”. The method uses constant strings, so the compiler is free to perform the concatenation beforehand and work with the end result. For example, if we look at the bytecode generated for
String x = "0" + "1" + "2" + "3" + "4" + "5" + "6" + "7" + "8" + "9" ;
it is
LDC "0123456789"
So all the “+” operations are replaced by one big string that is the end result of all those operations, which puts this test in a different league from the others for comparison. Had the code been something like
String x = counter + "1" + "2" + "3" + "4" + counter + "6" + "7" + "8" + "9" ;
where counter is a variable, the constant pool string would not have been this large and the “+” operations would have been compiled into append operations. The bytecode would have been something like this
ILOAD 1
INVOKESTATIC java/lang/String.valueOf(I)Ljava/lang/String;
INVOKESPECIAL java/lang/StringBuilder.<init>(Ljava/lang/String;)V
LDC "1"
INVOKEVIRTUAL java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
In this case the compiler did not fold the remaining literals into one constant: “+” is evaluated left to right and the expression starts with the variable “counter”, so every “+” has a non-constant left operand. All the plus operations are still performed with StringBuilder, though.
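The difference between the two cases can also be observed from plain Java, without reading bytecode. The sketch below is only an illustration and not code from the article; the class and method names are made up for this example. It relies on two rules of the language specification: a constant expression of type String is interned like a literal, and a concatenation that is not a constant expression yields a newly created String.

```java
// Hypothetical demo class; the names are illustrative and not taken from the article.
class ConstantFoldingDemo {

    // Only literals: the compiler folds this into the single constant "0123456789".
    static String allConstants() {
        return "0" + "1" + "2" + "3" + "4" + "5" + "6" + "7" + "8" + "9";
    }

    // Starts with a variable: the concatenation has to happen at run time.
    static String withVariable(int counter) {
        return counter + "1" + "2" + "3" + "4" + counter + "6" + "7" + "8" + "9";
    }

    public static void main(String[] args) {
        String literal = "0123456789";
        // Reference comparison on purpose: checks whether it is the very same object.
        System.out.println(allConstants() == literal);
        // Content comparison: the run-time result has the expected text.
        System.out.println(withVariable(0).equals("0123406789"));
        // Reference comparison again: the run-time result is a separate object.
        System.out.println(withVariable(0) == "0123406789");
    }
}
```

The reference comparisons with == are there only to make the compile-time versus run-time distinction visible; ordinary code should compare strings with equals.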
Coming to “new String” usage, this gives the compiler the least amount of choice. The bytecode looks something like
DUP
LDC "1"
INVOKESPECIAL java/lang/String.<init>(Ljava/lang/String;)V
INVOKEVIRTUAL java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
NEW java/lang/String
In this case we pay the overhead of creating each String object and then doing the StringBuilder append. For brevity, I am not going into what DUP does.
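To make that extra cost concrete, here is a small sketch under my own assumptions (the class and method names are hypothetical and not part of the tested code). It contrasts appending pooled literals with appending copies made through new String, and shows that each copy really is a separate object.

```java
// Hypothetical demo class; not part of the article's test code.
class NewStringDemo {

    // Plain literals: the operands come straight from the constant pool.
    static String plainLiterals(String prefix) {
        return prefix + "1" + "2";
    }

    // Every literal is first copied into a fresh String object, then appended.
    static String wrappedLiterals(String prefix) {
        return prefix + new String("1") + new String("2");
    }

    public static void main(String[] args) {
        // Both variants produce the same text.
        System.out.println(plainLiterals("0").equals(wrappedLiterals("0")));
        // The copy is a different object from the pooled literal,
        // so an allocation happened before the append.
        System.out.println(new String("1") == "1");
    }
}
```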
Now to the concatFields and stringBuffer methods. Java 1.5 introduced StringBuilder, and the compiler is smart enough to convert
return f0 + f1 + f2 + f3+ f4 + f5 + f6 + f7 + f8 + f9;
into StringBuilder.append operations on 1.5. The bytecode looks like
ALOAD 0
GETFIELD org/apurba/misc/TestStringPerformance.f1 : Ljava/lang/String;
INVOKEVIRTUAL java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
Since the stringBuffer method still used StringBuffer on 1.5, it could not benefit from the new class, and it lagged well behind in the 1.5 results. StringBuffer's methods are synchronized, while StringBuilder's are not. On 1.4 (checked only on jdk1.4.2_15) the “+” operations were compiled to StringBuffer calls, so concatFields was comparable with the stringBuffer() method on 1.4.
Another thing to notice is that this snippet uses GETFIELD, in contrast to LDC in the other snippets, which could tilt the results at times. If the stringBuffer method were written with StringBuilder and field access, it would have been pretty much the same as the concatFields method.
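As a rough sketch of that last point, the hand-written method below is approximately what the 1.5 compilers discussed here produce for “+” on fields. The class is hypothetical, the field values are made up, and only three fields are used to keep it short. The exact translation depends on the compiler, and newer compilers may choose a different strategy altogether.

```java
// Hypothetical class; field names follow the f0, f1, ... naming of the tested code.
class ConcatFieldsDemo {

    // Deliberately not final: final fields initialised with literals would be
    // compile-time constants, and the "+" below would be folded away.
    private String f0 = "a";
    private String f1 = "b";
    private String f2 = "c";

    // What we write in source code.
    String concatFields() {
        return f0 + f1 + f2;
    }

    // Roughly what a 1.5-era compiler turns the method above into.
    String concatWithBuilder() {
        return new StringBuilder().append(f0).append(f1).append(f2).toString();
    }

    public static void main(String[] args) {
        ConcatFieldsDemo demo = new ConcatFieldsDemo();
        // Both methods build the same text.
        System.out.println(demo.concatFields().equals(demo.concatWithBuilder()));
    }
}
```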
- In short, “+” does a decent job, assuming the compiler optimizations that convert it to StringBuilder append calls are present.
- If the “+” operations are between constant strings, it is even better than StringBuilder append calls, since the result is computed at compile time.
We also have to understand that all of this depends on compiler optimizations, and it is always best to test them before assuming they hold for our target VM. I checked on 1.4.2_15, 1.5.0_06 and the IBM JDK 1.5 (IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3)), and these optimizations were present.
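For a quick check on a particular VM, something along the following lines can be used. This is a crude sketch and not the harness used in the article: the class name, field values and iteration count are my own assumptions, there is no warm-up phase, and the numbers it prints should be read as a rough indication only.

```java
// Hypothetical timing sketch; not a rigorous benchmark.
class ConcatTimingSketch {

    // Non-final fields, so that "+" cannot be folded at compile time.
    private String f0 = "0", f1 = "1", f2 = "2", f3 = "3";

    String plus() {
        return f0 + f1 + f2 + f3;
    }

    String buffer() {
        return new StringBuffer().append(f0).append(f1).append(f2).append(f3).toString();
    }

    String builder() {
        return new StringBuilder().append(f0).append(f1).append(f2).append(f3).toString();
    }

    public static void main(String[] args) {
        ConcatTimingSketch s = new ConcatTimingSketch();
        int iterations = 1000000; // arbitrary choice
        int sink = 0;             // uses the results so the loops are not trivially dead

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += s.plus().length();
        long plusNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += s.buffer().length();
        long bufferNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) sink += s.builder().length();
        long builderNanos = System.nanoTime() - start;

        System.out.println("+             : " + plusNanos / 1000000 + " ms");
        System.out.println("StringBuffer  : " + bufferNanos / 1000000 + " ms");
        System.out.println("StringBuilder : " + builderNanos / 1000000 + " ms");
        System.out.println("(sink = " + sink + ")");
    }
}
```

Running it a few times, and comparing against the generated bytecode, gives a better picture than a single run.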

