Sunday, March 3, 2019

Java vs C Benchmarks

(Written a few years ago, but I'm moving it to my main blog - original date March 2014)
Benchmarks are much like statistics - it is easy to mislead people about what they mean. They are useful for comparing systems, programs, etc., and when a new system becomes available it is not uncommon for everyone to want to try their app on it to see how fast it will be. This was common at universities, where the latest big system had to be tested with every individual's idea of the 'most important' programs. Of course, a benchmark result is really a function of a number of variables - hardware, compiler, configuration, libraries, etc.

A common benchmark these days is between computer languages. It used to be that C and FORTRAN were always considered the fastest languages, with C++ close behind. The number of languages has grown, and there has been real investment in making ones like Java much faster than their first iterations (Java 1.1 was about as fast as bash for running a program, but now Java is close to C in many benchmarks), so comparisons between languages are extra fuel on the fire of programming language wars/discussions.

There is a great site for such comparisons: the Computer Language Benchmarks Game (previously the Great Computer Language Shootout).

This shows that C, C++, and now Rust (rather than FORTRAN) still generally hold the favored positions, although not for every micro-benchmark. Sometimes Java or another language wins out. The site also includes source code and compile commands if you want to repeat the runs yourself.

Another fun tool is SciMark2, a Java benchmarking tool that also has a C version.

Running these two versions on one of my older computers (AMD Athlon II 3200) showed that Java (open_jdk_1.7 in this case) won by a small margin, even when the C version was built using profiling and various GCC compiler options for performance. This result surprised me, since the JVM is written in C++ and seems to have been compiled with GCC/G++ in this case. OK, yes, the JVM can do JIT (just-in-time compilation - optimizations based on runtime behavior), but to actually beat C was surprising. (Not all JVMs are written in the same language - some are in C, and an early IBM version was in Smalltalk. GCC itself has been written in C++ since 2012, so I'm using C/C++ a little interchangeably.)

The C version's weakest benchmark relative to Java was the SOR (successive over-relaxation) method, with the C version scoring ~560 and the Java version about 780 - the other micro-benchmarks were either in C's favor or equal. Looking at the SOR.c file, it had a simple loop over three row vectors in a 2D matrix/array. There was little to improve on ... at first sight, and I suspect this is why Java was winning: it was doing optimizations that GCC wouldn't produce despite options such as -funroll-loops. Manually unrolling the loop by a factor of two, so that every loop iteration performed two SOR steps, raised the C version's score to 3200, and unrolling to 4 steps per iteration raised it to 4800. The lesson is clear: with a little experience in optimizing C, turning a well-written piece of code into a fast piece of code can be quite easy. Did loop unrolling make a difference with Java? Yes, but only a little. The Java version improved to 800 (from ~780), but that's a small difference and hardly worth the tuning effort, as Java really did do most of the work for you.
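To make the unrolling idea concrete, here is a minimal sketch in C of an SOR-style sweep and a version of it unrolled by two. It is modeled on the kind of loop described above, not copied from SciMark2's SOR.c: the function names, the flat row-major grid, and the small comparison helper are all my own assumptions for illustration. It only shows that the unrolled loop computes the same values; whether it runs faster depends on your compiler, flags, and hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is an illustration, not the SciMark2 source. */

/* One relaxation sweep over the interior of a rows x cols grid (row-major). */
void sor_sweep(double *g, size_t rows, size_t cols, double omega)
{
    assert(rows >= 3 && cols >= 3);
    const double w4 = omega * 0.25;   /* weight on the four neighbours */
    const double w1 = 1.0 - omega;    /* weight on the old value */
    for (size_t i = 1; i + 1 < rows; i++) {
        double *up   = g + (i - 1) * cols;   /* the three row vectors */
        double *cur  = g + i * cols;
        double *down = g + (i + 1) * cols;
        for (size_t j = 1; j + 1 < cols; j++)
            cur[j] = w4 * (up[j] + down[j] + cur[j - 1] + cur[j + 1]) + w1 * cur[j];
    }
}

/* Same sweep, inner loop manually unrolled so each iteration does two SOR steps. */
void sor_sweep_unrolled2(double *g, size_t rows, size_t cols, double omega)
{
    assert(rows >= 3 && cols >= 3);
    const double w4 = omega * 0.25;
    const double w1 = 1.0 - omega;
    for (size_t i = 1; i + 1 < rows; i++) {
        double *up   = g + (i - 1) * cols;
        double *cur  = g + i * cols;
        double *down = g + (i + 1) * cols;
        size_t j = 1;
        for (; j + 2 < cols; j += 2) {
            cur[j]     = w4 * (up[j] + down[j] + cur[j - 1] + cur[j + 1]) + w1 * cur[j];
            /* The second step reads the cur[j] just written, as the rolled loop would. */
            cur[j + 1] = w4 * (up[j + 1] + down[j + 1] + cur[j] + cur[j + 2]) + w1 * cur[j + 1];
        }
        /* Remainder step when the interior width is odd. */
        for (; j + 1 < cols; j++)
            cur[j] = w4 * (up[j] + down[j] + cur[j - 1] + cur[j + 1]) + w1 * cur[j];
    }
}

/* Runs both variants on identical small grids; returns the largest absolute difference. */
double sor_compare(void)
{
    enum { ROWS = 6, COLS = 7 };      /* odd interior width exercises the remainder loop */
    const size_t n = (size_t)ROWS * COLS;
    double a[ROWS * COLS], b[ROWS * COLS];
    for (size_t k = 0; k < n; k++)
        a[k] = b[k] = (double)((k * 7) % 11);   /* arbitrary test data */
    for (int p = 0; p < 10; p++) {
        sor_sweep(a, ROWS, COLS, 1.25);
        sor_sweep_unrolled2(b, ROWS, COLS, 1.25);
    }
    double worst = 0.0;
    for (size_t k = 0; k < n; k++) {
        double d = a[k] - b[k];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

Calling sor_compare() should return a value at or very near zero, since both functions perform the same arithmetic in the same order. To see any speed difference you would need a much larger grid and a timing loop around each variant.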

Now, what does this tell you? C is probably still the champion. After all, a Java program is converted to Java bytecode, which is then run on a JVM, which itself is a C++ program - to make Java truly faster than C/C++ is to make C/C++ faster than C/C++. The difference is that the C++ program called the JVM puts more effort into optimizing itself than does 'run of the mill' C (and GCC compilations). If you know how to optimize C and are willing to do it, you'll most likely get some benefit; if not, Java may be a good choice!

What does all of this tell you? That benchmarks are easily twisted to suit what you want to say, if you're just willing to find the right one. Anyway, have some fun with the benchmarks and results above to tell your own story.