Proper image rotation by 90 degrees

Hey there,
I have huge images, about 1 GB each, that I want to rotate by 90 degrees. Is there a way to do that without creating a temporary copy of the image, which doubles the memory use for a short time?
I'm only aware of two possibilities. The first:
@
QMatrix rm;
rm.rotate(90);
newImage = oldImage.transformed(rm);
@
This copies the image and doubles the amount of data.
The other way is to rotate the QPainter while it is drawing the image, but then the QPainter has to rotate the image again on every update, which isn't nice either.
Do you have any suggestions? Thanks for any help.

I don't think you can implement a matrix multiplication "inplace", i.e. without creating a temporary copy. That's because the same element of the input matrix is read multiple times, so overwriting it too early produces an invalid result.
Just imagine you want to multiply matrix A by matrix B, giving matrix C. Multiplying the first row of A with the first column of B gives you the value of C at (0,0). If you now overwrite the input matrix A at (0,0) because you don't want to store C separately, things go horribly wrong! Next you have to multiply the first row of A with the second column of B to get the value of C at (0,1). But the first row of A has already been modified, so you get a wrong result.
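To make that concrete, here is a minimal sketch with 2x2 integer matrices. The names `Mat2`, `multiply` and `multiplyNaiveInPlace` are made up for this illustration and are not part of Qt or any other library.

```cpp
#include <array>
#include <cstddef>

using Mat2 = std::array<std::array<int, 2>, 2>;

// Reference version: the result C lives in its own storage.
Mat2 multiply(const Mat2& a, const Mat2& b)
{
    Mat2 c{};
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j)
            for (std::size_t k = 0; k < 2; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Naive "inplace" version: every result is written straight back into A.
// The value stored at a[i][0] is then read again when a[i][1] is computed,
// so the second column of the result is wrong.
void multiplyNaiveInPlace(Mat2& a, const Mat2& b)
{
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 0; j < 2; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < 2; ++k)
                sum += a[i][k] * b[k][j];
            a[i][j] = sum;  // overwrites an input that is still needed
        }
    }
}
```

With A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] the correct product is [[19, 22], [43, 50]], while the naive version leaves [[19, 130], [43, 290]] in A.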

Yes, that's the inconvenient truth. But there are still algorithms that make it possible. Your "inplace" was the search term I was missing to find something like this:
http://en.wikipedia.org/wiki/Inplace_matrix_transposition
It's how I first thought of it: you copy one pixel from its source to where it belongs, but first save the pixel that gets overwritten in a temporary variable. Then you move that saved pixel to its own destination, and so on until the whole matrix is transformed.
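A minimal sketch of that cycle-following idea for a 90 degree clockwise rotation, assuming the pixels sit in one contiguous row-major buffer. The function name `rotate90InPlace` and the use of `std::vector` are assumptions for illustration; this is not a Qt API. To remember which pixels have already been moved, the sketch keeps one bit per pixel, so it is not strictly zero extra memory, but for 32-bit pixels that is roughly 1/32 of the image instead of a full copy.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Rotates a row-major width x height pixel buffer by 90 degrees clockwise.
// Afterwards the same buffer has to be read as a height x width image.
template <typename Pixel>
void rotate90InPlace(std::vector<Pixel>& px, std::size_t width, std::size_t height)
{
    const std::size_t n = width * height;
    assert(px.size() == n);

    std::vector<bool> moved(n, false);  // one bit per pixel (assumed acceptable)

    for (std::size_t start = 0; start < n; ++start) {
        if (moved[start])
            continue;  // this pixel was already placed by an earlier cycle

        Pixel carried = px[start];  // the single temporary pixel
        std::size_t src = start;
        do {
            // Old position (r, c) maps to new position (c, height - 1 - r),
            // and the new image has `height` pixels per row.
            const std::size_t r = src / width;
            const std::size_t c = src % width;
            const std::size_t dst = c * height + (height - 1 - r);

            std::swap(carried, px[dst]);  // drop the pixel, pick up the old one
            moved[dst] = true;
            src = dst;
        } while (src != start);
    }
}
```

For a 3 x 2 image stored as {1, 2, 3, 4, 5, 6}, calling `rotate90InPlace(img, 3, 2)` leaves {4, 1, 5, 2, 6, 3}, which is the 2 x 3 rotated image. How to hand such a buffer back to a QImage is left open here.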
This can be accelerated by moving multiple values at a time. That makes it possible to trade off how much memory is allocated for the transformation against the speed of the transformation.
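One way to picture that trade-off is a deliberately simplified sketch that only handles square images whose side length is a multiple of the tile size. It moves whole t x t tiles through two temporary tile buffers, so the extra memory is 2 * t * t pixels: t = 1 is the pixel-by-pixel variant, and a larger t copies bigger chunks per step at the cost of bigger buffers. All names here (`rotateSquare90Tiled`, `readTileRotated`, `writeTile`) are hypothetical, and non-square images would need the more general cycle-following approach.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Copies tile (tr, tc) of an n x n image into `out`, already rotated by
// 90 degrees clockwise: local pixel (i, j) lands at (j, t - 1 - i).
template <typename Pixel>
void readTileRotated(const std::vector<Pixel>& px, std::size_t n, std::size_t t,
                     std::size_t tr, std::size_t tc, std::vector<Pixel>& out)
{
    for (std::size_t i = 0; i < t; ++i)
        for (std::size_t j = 0; j < t; ++j)
            out[j * t + (t - 1 - i)] = px[(tr * t + i) * n + (tc * t + j)];
}

// Writes a t x t tile buffer back to tile position (tr, tc).
template <typename Pixel>
void writeTile(std::vector<Pixel>& px, std::size_t n, std::size_t t,
               std::size_t tr, std::size_t tc, const std::vector<Pixel>& in)
{
    for (std::size_t i = 0; i < t; ++i)
        for (std::size_t j = 0; j < t; ++j)
            px[(tr * t + i) * n + (tc * t + j)] = in[i * t + j];
}

// Rotates a square n x n row-major image clockwise, tile by tile.
// Assumes t > 0 and n % t == 0.
template <typename Pixel>
void rotateSquare90Tiled(std::vector<Pixel>& px, std::size_t n, std::size_t t)
{
    assert(t > 0 && n % t == 0 && px.size() == n * n);
    const std::size_t m = n / t;  // tiles per side
    std::vector<Pixel> carried(t * t), next(t * t);

    // Each tile (r, c) belongs to a cycle of four tiles:
    // (r, c) -> (c, m-1-r) -> (m-1-r, m-1-c) -> (m-1-c, r) -> (r, c).
    for (std::size_t r = 0; r < m / 2; ++r) {
        for (std::size_t c = 0; c < (m + 1) / 2; ++c) {
            std::size_t tr = r, tc = c;
            readTileRotated(px, n, t, tr, tc, carried);
            for (int step = 0; step < 4; ++step) {
                const std::size_t dr = tc;
                const std::size_t dc = m - 1 - tr;
                if (step < 3)  // save the destination before overwriting it
                    readTileRotated(px, n, t, dr, dc, next);
                writeTile(px, n, t, dr, dc, carried);
                std::swap(carried, next);
                tr = dr;
                tc = dc;
            }
        }
    }

    // With an odd number of tiles per side the centre tile maps onto itself.
    if (m % 2 == 1) {
        const std::size_t mid = m / 2;
        readTileRotated(px, n, t, mid, mid, carried);
        writeTile(px, n, t, mid, mid, carried);
    }
}
```

Choosing t equal to n degenerates into rotating through a full-size copy, so the tile size is exactly the knob between memory use and the number of small copy steps.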
But setting up a proper algorithm will take me quite some time, so I'm looking for an existing C++ implementation of something like this.