speed of different loop implementations
-
Hi,
You should rather use a QVector if you want to go the Qt way. It should perform better than QList.
-
@SGaist: Thanks for the quick answer.
I tested QVector as well as std::vector for the container and get more or less the same result as for QList in all cases.
QVector seems to be slightly faster; however, the difference is less than 1%:
Eigen3 4x1000: 61874 milliseconds
Eigen2 4x1 QList: 49248 milliseconds
RayClass QList: 49127 milliseconds
RayClass QVector: 49536 milliseconds
RayClassMethode QList: 47555 milliseconds
RayClassMethode QVector: 47347 milliseconds
RayClassMethode std::vector: 47126 milliseconds
I think I will implement the real algorithm and test it again with the different
-
Instead of doing matrix-vector multiplications in a loop, do a single matrix-matrix multiplication and drop the OpenMP stuff. Eigen (if that's the library you're using) already features threading internally and makes use of the vector extensions your processor supports. Put your vectors as columns in a rectangular matrix (4x1000) and multiply it from the left by the 4x4 matrix. The multiplied vectors will be the columns of the resulting 4x1000 matrix. Basically:
void calcMatrix(const Matrix<qreal, 4, 4> & M, Matrix<qreal, 4, 1000> & rays) { rays = M * rays; }
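To make the column claim concrete, here is a minimal sketch in plain C++ (deliberately without Eigen, so it shows only what the product computes, not how Eigen computes it). The names `Vec4`, `Mat4`, `multiply` and `multiplyBatch` are made up for illustration, and `double` stands in for `qreal`. Column j of the batched result is the 4x4 matrix times column j of the input:

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;  // row-major: m[row][col]

// One matrix-vector product: the operation a per-ray loop repeats.
Vec4 multiply(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Batched form: `rays` is a 4xN matrix stored column by column,
// so rays[4 * j + i] is component i of ray j.
std::vector<double> multiplyBatch(const Mat4& m, const std::vector<double>& rays) {
    const std::size_t n = rays.size() / 4;
    std::vector<double> out(rays.size(), 0.0);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t r = 0; r < 4; ++r)
            for (std::size_t c = 0; c < 4; ++c)
                out[4 * j + r] += m[r][c] * rays[4 * j + c];
    return out;
}
```

The arithmetic is identical in both forms. The gain from the Eigen version comes from handing the whole batch to the library in one call instead of issuing many tiny products.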
-
@kshegunov Thanks. That is really a lot faster.
However, I'm running into trouble with large matrices (4x10000). I get the following error:
/usr/include/eigen3/Eigen/src/Core/DenseStorage.h:33: error: 'OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG' is not a member of 'Eigen::internal::static_assertion<false>'
EIGEN_STATIC_ASSERT(Size * sizeof(T) <= EIGEN_STACK_ALLOCATION_LIMIT, OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG);
The matrices I created should not be on the stack, so I think Eigen allocates some memory on the stack internally? Can this be changed?
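The arithmetic behind that assertion can be sketched without Eigen. A fixed-size Eigen matrix keeps its coefficients inside the object itself, and the check in the error message compares `Size * sizeof(T)` with the limit at compile time, wherever the object ends up living. The sketch below assumes `qreal` is an 8-byte `double` and uses Eigen's documented default limit of 128 KB; `kStackLimitBytes`, `fixedMatrixBytes` and `fitsFixedSize` are hypothetical names, not Eigen API:

```cpp
#include <cstddef>

// Eigen's documented default for EIGEN_STACK_ALLOCATION_LIMIT is 128 KB.
// Redeclared under a hypothetical name so the sketch compiles without Eigen.
constexpr std::size_t kStackLimitBytes = 128 * 1024;

// Bytes held inline by a fixed-size rows x cols matrix of doubles.
constexpr std::size_t fixedMatrixBytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(double);
}

// Mirrors the comparison made by the static assertion in the error message.
constexpr bool fitsFixedSize(std::size_t rows, std::size_t cols) {
    return fixedMatrixBytes(rows, cols) <= kStackLimitBytes;
}

static_assert(fitsFixedSize(4, 1000), "4x1000 doubles stay under the limit");
static_assert(!fitsFixedSize(4, 10000), "4x10000 doubles exceed the limit");
```

With 8-byte doubles, 4x1000 needs 32000 bytes and 4x10000 needs 320000 bytes, which is why only the larger type fails to compile.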
-
@gde23 said in speed of different loop implementations:
OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG
Google tells me you can do
#define EIGEN_STACK_ALLOCATION_LIMIT 1000000
before including Eigen/Core
To alter the limit.
If that is enough, I can't tell :)
-
Don't mess with the stack! Instead make your (big) matrix, the one holding the vectors, dynamically sized (i.e. allocated on the heap). Use:
Matrix<qreal, 4, Dynamic>
instead of a fixed number of columns. And don't forget to initialize it before using it. See the documentation for more details.
Kind regards.
-
@kshegunov Can I upvote you 10 times?
-
Yes. I allow it. :]