How to make code dealing with calculations based on data from large data sets faster in C++?
-
I'm writing code where the input data is a 50000x20 matrix (50000 blocks, 20 properties each). Based on certain calculations, for which I need to loop over all the blocks, I decide which block to move. Then I make some changes to the attribute Value (a separate variable) for that block and repeat the same cycle over and over, doing the calculations for each block inside the loop. Would calling a function be a better and faster way to do this? The code runs for a very long time without converging (based on some convergence criteria). How do I deal with this? Is this a good approach for handling large data sets? Can I improve it? Please help.
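Roughly, the structure is like this (a simplified sketch, not my actual code; the names and the scoring formula are placeholders):

```cpp
#include <vector>

constexpr int kBlocks = 50000;   // 50000 blocks
constexpr int kProps  = 20;      // 20 properties each

// Pick the block to move: loop over all blocks and score each one.
// 'data' holds the 50000x20 matrix flattened row by row,
// 'value' is the extra per-block attribute that gets updated each cycle.
int pickBlock(const std::vector<double> &data, const std::vector<double> &value)
{
    int best = 0;
    double bestScore = -1e300;
    for (int b = 0; b < kBlocks; ++b) {
        double score = 0.0;
        for (int p = 0; p < kProps; ++p)     // calculations for each block
            score += data[b * kProps + p];
        score *= value[b];                   // placeholder use of Value
        if (score > bestScore) { bestScore = score; best = b; }
    }
    return best;
}
// The outer loop picks a block, updates value[best], and repeats
// until some convergence criterion is met.
```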
-
It's a bit hard to understand what you are describing. Also, it's hard to help without knowing more details.
So, I'll just list some general hints, maybe they will apply to your case:
- if you are looping over the data, use standard loops (not ranges, not iterators) - they are the fastest
- be very careful about constness, especially of Qt containers (they can detach, and with that amount of data this wastes a lot of RAM and CPU cycles) - see the short sketch after this list
- use const refs, views as much as possible
- use tight containers (QVector) to improve CPU cache usage
- if possible - parallelize using threads, SIMD instructions or GPGPU libs
- when hashing, use a streaming (additive) approach
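For the constness / const-ref points, a minimal sketch, assuming your 50000x20 data sits in a QVector of QVectors (the container layout and function names are just assumptions, not your actual code):

```cpp
#include <QVector>

// Score one block; taking a const ref avoids a copy and avoids detaching.
double scoreBlock(const QVector<double> &props)
{
    double s = 0.0;
    for (int i = 0; i < props.size(); ++i)   // plain index loop
        s += props[i];                       // operator[] on a const object never detaches
    return s;
}

// Loop over all blocks and return the index of the best one (assumes non-empty input).
int bestBlock(const QVector<QVector<double>> &blocks)
{
    int best = 0;
    double bestScore = scoreBlock(blocks.at(0));
    for (int b = 1; b < blocks.size(); ++b) {
        const double s = scoreBlock(blocks.at(b));   // at() never detaches either
        if (s > bestScore) { bestScore = s; best = b; }
    }
    return best;
}
```

A single flat QVector<double> of 50000 * 20 elements would be even friendlier to the CPU cache than a vector of vectors, since all the data is contiguous.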
-
I suppose I'll risk being labeled a heretic and mention pointer access to elements instead of [] indexing...It's a loaded argument but I almost always have better performance with pointers in my world.
Also, be aware of data cache misses on a 1000000 element structure, especially if you are not sequentially accessing elements.
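What I mean, as a rough sketch (a plain vector assumed here, not your actual data layout):

```cpp
#include <cstddef>
#include <vector>

// Indexed access:
double sumIndexed(const std::vector<double> &v)
{
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v[i];
    return s;
}

// Pointer access: walk a raw pointer over the same (sequential) data.
double sumPointer(const std::vector<double> &v)
{
    double s = 0.0;
    for (const double *p = v.data(), *end = p + v.size(); p != end; ++p)
        s += *p;
    return s;
}
```

Either way, touching the elements sequentially is what keeps the data cache happy; jumping around a structure of that size is where the misses come from.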
-
@Kent-Dorfman said in How to make code dealing with calculations based on data from large data sets faster in C++?:
I suppose I'll risk being labeled a heretic and mention pointer access to elements instead of [] indexing...It's a loaded argument but I almost always have better performance with pointers in my world.
Then you underestimate recent compilers... even in a simple test case the generated code is the same for the last three variants and nearly the same as the first.
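The original test case isn't reproduced here, but it was presumably along these lines - several ways of walking the same vector, which recent compilers (with optimizations on) compile to essentially the same code:

```cpp
#include <cstddef>
#include <vector>

double sumIndex(const std::vector<double> &v)      // operator[] indexing
{
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v[i];
    return s;
}

double sumIterator(const std::vector<double> &v)   // iterators
{
    double s = 0.0;
    for (auto it = v.cbegin(); it != v.cend(); ++it)
        s += *it;
    return s;
}

double sumRangeFor(const std::vector<double> &v)   // range-based for
{
    double s = 0.0;
    for (double x : v)
        s += x;
    return s;
}

double sumPointer(const std::vector<double> &v)    // raw pointers
{
    double s = 0.0;
    for (const double *p = v.data(), *e = p + v.size(); p != e; ++p)
        s += *p;
    return s;
}
```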
-
@Christian-Ehrlicher said in How to make code dealing with calculations based on data from large data sets faster in C++?:
Then you underestimate recent compilers
there is a difference between underestimating and not being able to use something, especially when dealing with multiple CPU architectures... but as I wrote, a loaded argument.