Thread safety of single instance of QVector, [] operator

halfgaar · wrote on 6 Dec 2014, 20:10

I'm writing a program in which a QVector<quint64> several hundred million elements large gets filled with numbers. The vector is pre-allocated (append is never called). The thing is, I want to write to it from several different threads. That is, each thread fills its own assigned section of it, using the [] operator.

QVector is normally not thread safe, but I was wondering how unsafe it is to have multiple threads writing to it when you know they won't be writing to the same elements. Is there metadata that might get messed up?

In several test runs, it seems to run fine.

t3685 · wrote on 6 Dec 2014, 20:53

The thing with thread safety is that "several" tests do not really tell you anything. It's better not to take chances on this.
Why are you using threads? Is it because the computation of the data is really expensive? Or are you worried that adding data to the vector might be a problem?
If it's the first case, you can do the computation in separate threads and have one thread join the data. If it's the second case, you might be better off using some other data structure.

halfgaar · wrote on 6 Dec 2014, 21:02

I know that 'several' tests are not a good test, hence the question.

It's an expensive operation that I'm optimizing using threads. It takes several gigabytes of RAM, which makes the option of using separate data structures and joining them less attractive, because it would require even more memory.

The QVector is a very efficient way to store my result, because it's preallocated as a contiguous block and doesn't need to store pointers to fragmented segments of data, which would increase the overhead. That's why I went ahead and tested if I could fill the same vector using different threads, knowing they won't be accessing the same elements.

Using mutexes is not really an option, because of the overhead.

JKSH · wrote on 7 Dec 2014, 07:21

Hi,

The technically correct answer to your question is, "it's undefined". QVector wasn't designed with this use case in mind, so even if it works now there's no guarantee that it will continue to work in the future.

From an academic point of view, looking at QVector's "source code":http://code.woboq.org/qt5/qtbase/src/corelib/tools/qvector.h.html#_ZN7QVectorixEi, I think it should behave how you want it to (but remember that you're still relying on undefined behaviour). However, if you ever do anything that causes the QVector to reallocate its internal memory or make a copy of the vector, then all bets are off. Disclaimer: I don't recommend taking this route.

Why not just use a raw C++ array? This way you don't need to worry about what a QVector might do behind the scenes.
@
quint64* myVector = new quint64[N];
@
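To illustrate the raw-array idea, here is a minimal sketch of several threads each filling their own section of one pre-allocated buffer. It is not taken from the original program: `fillInParallel` is a hypothetical helper name, `std::uint64_t` and `std::thread` stand in for `quint64` and whatever threading class is actually used, and the `i * i` assignment is only a placeholder for the real computation. It relies on the C++11 rule that distinct array elements are distinct memory locations, so threads that never touch the same index do not race with each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: fills buffer[0..n) using threadCount threads.
// Each thread writes only its own half-open range [begin, end), so no
// element is ever written by two threads.
void fillInParallel(std::uint64_t* buffer, std::size_t n, unsigned threadCount)
{
    assert(buffer != nullptr && threadCount > 0);

    // Section size, rounded up so the last section absorbs the remainder.
    const std::size_t chunk = (n + threadCount - 1) / threadCount;

    std::vector<std::thread> workers;
    workers.reserve(threadCount);

    for (unsigned t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back([buffer, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                buffer[i] = static_cast<std::uint64_t>(i) * i; // placeholder computation
        });
    }

    // Joining ensures all workers have finished writing before the
    // caller reads the buffer.
    for (std::thread& w : workers)
        w.join();
}
```

The buffer has to be allocated before the threads start and must not be reallocated or freed until every thread has been joined. With `new quint64[N]` the owning class would be responsible for the matching `delete[]`.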

halfgaar · wrote on 7 Dec 2014, 08:16

Actually, using a raw array may very well be an option. The class it's in can do its bookkeeping.

Should have thought of that :)

t3685 · wrote on 7 Dec 2014, 12:04

Again though, if the operation itself (as in computing the data) is expensive, then you can use threads to do the calculation and use one single thread to store the data.
If you are worried that storing the data is expensive, using threads won't do you much good because the bottleneck is memory and not computing power.
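As a rough sketch of the "compute in threads, store from one thread" approach: worker tasks compute small batches, and only the calling thread copies them into the result. All names here (`expensiveValue`, `computeBatch`, `fillViaSingleWriter`) are hypothetical, `std::async` and `std::vector` are stand-ins for whatever the real program uses, and the arithmetic is a placeholder. The extra memory in this sketch is bounded by roughly `workers * batchSize` elements rather than a second full copy of the data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for the expensive per-element computation.
std::uint64_t expensiveValue(std::size_t i)
{
    return static_cast<std::uint64_t>(i) * 3 + 1;
}

// Runs in a worker thread; touches only its own temporary batch.
std::vector<std::uint64_t> computeBatch(std::size_t begin, std::size_t end)
{
    std::vector<std::uint64_t> batch;
    batch.reserve(end - begin);
    for (std::size_t i = begin; i < end; ++i)
        batch.push_back(expensiveValue(i));
    return batch;
}

// Fills the pre-sized result vector. Only the calling thread writes to it.
void fillViaSingleWriter(std::vector<std::uint64_t>& result,
                         std::size_t batchSize, unsigned workers)
{
    assert(batchSize > 0 && workers > 0);
    const std::size_t n = result.size();
    std::size_t next = 0;

    while (next < n) {
        std::vector<std::future<std::vector<std::uint64_t>>> inFlight;
        std::vector<std::size_t> offsets;

        // Launch up to `workers` batches in parallel.
        for (unsigned w = 0; w < workers && next < n; ++w) {
            const std::size_t end = std::min(n, next + batchSize);
            inFlight.push_back(std::async(std::launch::async, computeBatch, next, end));
            offsets.push_back(next);
            next = end;
        }

        // Collect each batch and store it from this single thread.
        for (std::size_t k = 0; k < inFlight.size(); ++k) {
            const std::vector<std::uint64_t> batch = inFlight[k].get();
            std::copy(batch.begin(), batch.end(), result.data() + offsets[k]);
        }
    }
}
```

The batch size trades memory for scheduling overhead: smaller batches keep the temporary storage low but launch more tasks.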