Ah, the way I remember it, you have to read combined transform operations from the last operation to the first. That is, the item is first translated (the last operation) so that its center sits exactly on the parent's origin (in this case, the top-left of the scene). Then it is scaled. Because it's centered on the origin, it is scaled in all directions at the same rate. And finally (the first operation), it is moved back to where it belongs.
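To make the ordering concrete, here is a minimal sketch in plain Python using 3x3 affine matrices. It is not the API of any particular graphics library: the helper names (`translate`, `scale`, `matmul`, `apply`, `scale_about`) are made up for illustration, and it assumes the common column-vector convention, where the matrix written last in the product is the one that touches the point first.

```python
def translate(dx, dy):
    """3x3 affine matrix that moves a point by (dx, dy)."""
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]


def scale(sx, sy):
    """3x3 affine matrix that scales about the origin (0, 0)."""
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def matmul(a, b):
    """Plain 3x3 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Apply affine matrix m to an (x, y) point treated as a column vector."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def scale_about(center, factor):
    """Hypothetical helper: scale by `factor` while keeping `center` fixed.

    Written first to last: translate back, scale, translate to origin.
    Applied to a point last to first: to origin, scale, back.
    """
    cx, cy = center
    return matmul(translate(cx, cy),
                  matmul(scale(factor, factor), translate(-cx, -cy)))


m = scale_about((50, 30), 2)
center_after = apply(m, (50, 30))   # the center itself should not move
right_after = apply(m, (60, 30))    # 10 units right of center before scaling
corner_after = apply(m, (40, 20))   # 10 units up and left of center
```

With a center of (50, 30) and a factor of 2, the center stays put and every other point ends up twice as far from it, which is the "scaled in all directions at the same rate" behaviour described above. Swapping the two translations would instead scale about the wrong point and shift the item.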