I've developed a small scientific program that performs some mathematical computations. I've tried to optimize it so that it's as fast as possible.
Thanks to help in this forum, I'm almost done deploying it for Windows, Mac and Linux users. But I have not been able to test it on many different computers yet.
Here's what troubles me: To deploy for Windows, I've used a laptop which has both Windows 7 and Ubuntu 12.04 installed on it (dual boot). I compared the speed of the app running on these two systems, and I was shocked to observe that it's at least twice as slow on Windows! I wouldn't have been surprised if there were a small difference, but how can one account for such a difference?
Here are a few details:
- The computations the program performs are just brute-force mathematical calculations: basically, it computes products and cosines in a loop that runs a billion times. On the other hand, the computation is multi-threaded: I launch something like 6 QThreads.
- The laptop has two cores @1.73GHz. At first I thought that Windows was probably not using one of the cores, but then I looked at the processor activity: according to the small graph, both cores are running at 100%.
- Then I thought the C++ compiler for Windows didn't use the optimization options (things like -O1 -O2) that the C++ compiler for Linux automatically used (in release build), but apparently it does.
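To rule out the threads entirely, the same kind of loop can be timed on a single thread on both systems. Below is a minimal sketch of such a test, not the actual program: the function names `cosine_product_sum` and `time_kernel_seconds` and the arithmetic inside the loop are hypothetical stand-ins for the real computation.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the real kernel: accumulates products of cosines.
double cosine_product_sum(std::uint64_t iterations) {
    double sum = 0.0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        const double x = static_cast<double>(i) * 1e-3;
        sum += std::cos(x) * std::cos(2.0 * x);
    }
    return sum;
}

// Returns the wall-clock seconds taken by one run of the kernel.
// The result is written to `out` so the compiler cannot discard the loop.
double time_kernel_seconds(std::uint64_t iterations, double& out) {
    const auto start = std::chrono::steady_clock::now();
    out = cosine_product_sum(iterations);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_kernel_seconds` with the same iteration count on both systems would show whether the single-threaded math alone (for example the `cos` implementation shipped with each compiler) already accounts for part of the gap.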
I'm bothered that the app is so much slower (2 to 4 times) on Windows, and it's really weird. On the other hand, I haven't tried it on other Windows computers yet. Still, do you have any idea what could explain the difference?
Additional info: some data...
Even though Windows seems to be using the two cores, I'm thinking this might have something to do with thread management. Here's why:
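One way to test the thread-management hypothesis is to time the same total amount of work split across different numbers of threads. The following is a minimal sketch under stated assumptions: it uses `std::thread` instead of QThread so that it has no Qt dependency, and `worker_sum` and `time_with_threads` are hypothetical names, not functions from the actual program.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-thread workload: products of cosines over [begin, end).
double worker_sum(std::uint64_t begin, std::uint64_t end) {
    double sum = 0.0;
    for (std::uint64_t i = begin; i < end; ++i) {
        const double x = static_cast<double>(i) * 1e-3;
        sum += std::cos(x) * std::cos(2.0 * x);
    }
    return sum;
}

// Splits [0, total) into `thread_count` chunks, runs them concurrently,
// stores the combined sum in `result`, and returns the elapsed seconds.
double time_with_threads(unsigned thread_count, std::uint64_t total, double& result) {
    if (thread_count == 0) thread_count = 1;  // avoid dividing by zero
    std::vector<double> partial(thread_count, 0.0);
    std::vector<std::thread> threads;
    threads.reserve(thread_count);

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < thread_count; ++t) {
        const std::uint64_t begin = total * t / thread_count;
        const std::uint64_t end = total * (t + 1) / thread_count;
        // Each thread writes only to its own slot, so no locking is needed.
        threads.emplace_back([&partial, t, begin, end] {
            partial[t] = worker_sum(begin, end);
        });
    }
    for (auto& th : threads) th.join();
    const auto stop = std::chrono::steady_clock::now();

    result = 0.0;
    for (double p : partial) result += p;
    return std::chrono::duration<double>(stop - start).count();
}
```

Running `time_with_threads` with 1, 2, 3 and 6 threads on each system, and comparing against `std::thread::hardware_concurrency()`, would show whether the slowdown grows with the number of threads or is already present with a single one.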
Sample Computation n°1 (this one launches 2 QThreads):
Sample Computation n°2 (this one launches 3 QThreads):
Sample Computation n°3 (this one launches 6 QThreads):
PC1-windows = my 2-core laptop (@1.73GHz) with Windows 7
PC1-linux = my 2-core laptop (@1.73GHz) with Ubuntu 12.04
PC2-linux = my 8-core laptop (@2.20GHz) with Ubuntu 12.04
Note: Of course, it's not shocking that PC2 is faster. What's incredible to me is the difference between PC1-windows and PC1-linux. I've also tried running the program on a recent PC (4 or 8 cores @~3GHz, I don't remember exactly) under Mac OS; the speed was comparable to PC2-linux (or slightly faster).