Console App how to populate a 10.000*10.000 Grid efficiently?

StudentScripter

Hello im trying to build a simulation/calculation for console where i have a huge "grass area" every square is 10x10, the whole area should be 10.000*10.000. I have to store the state of each cell and want to be able to access and modify it very quickly. I thought about using a qset, but populating this with a for loop still takes quite a long time. Is there any way to speed it up? Or any other design idea on how to make it more efficient?

The values of the squares will/should be able to change quite quickly and at onces (not all of them have always the same value).
Thanks a lot, eager to hear your input.

JonB

@StudentScripter
For Qt classes, just start with a QVector/QList of QVector/QList of your data type. Similar if you want to use std::vector.

QSet is not useful here, and does a lot more work than a plain vector (and has a consequential overhead). Initialising 100 million elements will take whatever time it takes, but shouldn't be too long (untested). You can make a perhaps marginal improvement by an initial reserve()/resize() for your fixed number of elements.

You can always worry about anything more efficient later on. If your data is "sparse", like a lot of "empty" items, you can also look at ways of reducing memory usage. But I actually think it won't be bad with QVector/QList.

If you want the Qt model interface you could implement a QAbstractItemModel for your data structure and use the row/column/data interfaces. But that's another layer, simple vectors will be faster.

StudentScripter

@JonB QVector is fine for the coordinates, for the grass area im going with qbytearray and bool values, seems to be faster. But still looping trough this huge as array is the main bottleneck. I thought of parallelizing it in different thread, each thread a chunk of the array.

Christian Ehrlicher

Why would you loop through the array? Do you want to search something?

JonB

@StudentScripter
Ummm, if you have a "huge" array (whatever) of data and you "loop through it" then it's going to be "slow". What else would you expect? You have something like 100 million squares of 10x10 somethings. I'm with @Christian-Ehrlicher: what are you trying to do, why, how often, is there a more efficient way of doing it (e.g. indexing, lookup tables)? Either this is inherently a very slow exercise or there is something "sub-optimal" about the approach.

You can indeed split the search across threads, but just how much difference will that make? How many free cores/CPUs will your target have? If, say, as many as 4 cores are used it might reduce the search by a factor of up to 4. How much real difference will that make to performance? OTOH reducing the map or the search to, say, 1,000x1,000 would reduce by a factor of 100. If the search could be "indexed" then even more.

StudentScripter

@JonB Well let me explain.

Some Function "GrassChanger" determines which patches will be changed, it stores all squares that shall be changed in a vector.

The squares changed can be anywhere in this big map, at the beginning only a few hunderd ones will be changed, but this amout will grow over time, eventually reaching many millions of squares needed to be changed at onces.

Of course i thought of an mechanism to invert this search at the point where one side becomes > 50% to save resources. How would i do indexed search with Qbitarray with the values from the vector? I asume something like:

(Pseudocode)

vector[100000]

QBitArray ba(5000000);

for(int i; i < vector.size; i++){

ba.togglebit(i);
}

Reducing the map smaller is of course an option, but nothing i would like to do as the precision of the calculations depend highly on having a huge samplesize eg a lot of numbers.

Let me know what you think, im grateful for any better solution/optimisation idea. :) <3

JonB

@StudentScripter
So here you toggle 100,000 bits, right? Which takes whatever time it takes (and what is that?) If that is what you need to do I don't see how that can be optimized in itself. There may be better low-level ways of toggling the bit-states in a larger number of adjacent words, I don't know. You may find quite a difference compiling for Release if you are currently Debug.

I did do a simulation a while ago with a large domain and simple operations,. I did parallelize it with QThread, but I had few cores/CPUs available. Obviously that approach is proportionate to what the end machine will have available for your use.

StudentScripter

@JonB Thanks yeah i guess as much threads with a strong enough machine is what im going to try. :) Don't see any other option aswell. Thanks alot regardless.

JonB

@StudentScripter
Before you start with threading make sure all your time really is being spent in the toggling or whatever action. Not for example any GUI or waiting! Profile if you can. And make sure you really are only doing what you show, whatever is right of the heart of the loop should be lean and mean.

jeremy_k

Depending on the pattern of changes, a quadtree may be a more efficient structure in terms of storage space and execution time.

StudentScripter

@JonB @jeremy_k

It actually no gui at all till now. Only 4 spinboxes or so to enter some values. Also: I switched from debug to release build and now even with 1000.000.000 numbers and without threading it does perform in nearly realtime.

I will have some "entities" that will be able to move in one direction at a time. Could a quadtree be there a good solution?
Originaly thought of using a 2d vector storing x and y pos of the "entities" (they arent visual, just position numbers) and updating them by looping trough the vector but as said probably not the best idea. Is a quadtree more efficient in updating all members positions and "collision"?

JonB

@StudentScripter
Yes I would expect this kind of code to show a good difference in release vs debug.

I have no idea about your "quadtree" since I have never heard of one :) Sounds like the kind of tree a quadbike would run into?

jeremy_k

@StudentScripter said in Console App how to populate a 10.000*10.000 Grid efficiently?:

@JonB @jeremy_k
I will have some "entities" that will be able to move in one direction at a time. Could a quadtree be there a good solution?
Originaly thought of using a 2d vector storing x and y pos of the "entities" (they arent visual, just position numbers) and updating them by looping trough the vector but as said probably not the best idea. Is a quadtree more efficient in updating all members positions and "collision"?

Quoting from the Wikipedia article:

A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are the two-dimensional analog of octrees and are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions.

A easy optimization for setting a quadrant or an entire 2d space as a single color is to remove all subtrees of that tree, and instead store the color as a single value. Instead of setting the value for 1.000.000.000 cells, store a single cell that says that all cells have the same value. Searching for a collision can stop when this optimized node is encountered.

A penalty for this method is the potential for more pointer indirection, and additional runtime memory allocation if nodes aren't preallocated.

JonB

@jeremy_k said in Console App how to populate a 10.000*10.000 Grid efficiently?:

A easy optimization for setting a quadrant or an entire 2d space as a single color is to remove all subtrees of that tree,

Absolutely. If that's the sort of thing you want to do. Doesn't seem to relate to toggling the state of 100,000 adjacent bits in 5 million.

it stores all squares that shall be changed in a vector.
The squares changed can be anywhere in this big map,
eventually reaching many millions of squares needed to be changed at onces.

If that is what you need to do, again I'm not sure I see any use of these quadtrees. But only you need the exact situation.