To Mutex or not to Mutex
-
@J.Hilk said in To Mutex or not to Mutex:
Right, even thoug memcpy is meant to be the fastest library routine for memory-to-memory copy, it's not save from being put to "sleep" by the OS halfway through the task.
The speed would depend on the actual C library implementation. For windows (XP) the standard library just did a word by word copy on the aligned memory block. So in any case it wasn't that fast, but then again how else can it be done? You have only one memory read per CPU cycle at most.
I'll have to indeed read that up. Thanks for the reference.
Dig up the Qt example on the producer consumer (I was too lazy to do it for you). If memory serves me it should cover your case.
-
@kshegunov
I did, I think you mean this one Wait Conditions ExampleVerry interesting approach and I can see uses for this, but I don't think it is useful for my situation.
- I'm populating the data array of an QImage with the calculated values. I don't want faulty interim images so, I'll have to wait until all calulations are done before I start copying.
- If I look and unlock between each Byte that is copied to tell the other thread that part of the data is ready to be copied, I'll end up - by an 800x800 image - with 640000 mutex locks, unlocks, bytecount increments & notification signals. On the first clance, this seems slower than locking before the copy function and unlocking after.
- this is with a Buffer size of 4 Byte(argb) of course
-
@J.Hilk said in To Mutex or not to Mutex:
I did, I think you mean this one Wait Conditions Example
Yes that's the one.
I'm populating the data array of an QImage with the calculated values. I don't want faulty interim images so, I'll have to wait until all calulations are done before I start copying.
The point is you should not copy at all. Just create a biggish buffer and while one thread's working on part of it, the other can read the already produced data (another part of the buffer). You don't need to have interim data, just modify the counters accordingly and free so many resources as the thread provided.
If I look and unlock between each Byte that is copied to tell the other thread that part of the data is ready to be copied, I'll end up - by an 800x800 image - with 640000 mutex locks, unlocks, bytecount increments & notification signals. On the first clance, this seems slower than locking before the copy function and unlocking after.
You misunderstand me sir. The mutex is a mutual exclusive lock that's going to serialize the access to the whole buffer. Here that's not the idea. The idea is to write one part of the buffer without restricting the consumer thread to read other parts of the same buffer. This is allowed as long as they don't read/write the same element (byte).
this is with a Buffer size of 4 Byte(argb) of course
I don't follow, you have a buffer of bytes. What you put in it is an implementation detail. You could as well write whole images, or single bytes, or arrays of integers or whatever.
-
Ok, this is new for me, so let's see if I got this right ;-),
In the example there are the following objects that are shared betweens the 2 threads:
const int DataSize = 100000; const int BufferSize = 8192; char buffer[BufferSize]; QWaitCondition bufferNotEmpty; QWaitCondition bufferNotFull; QMutex mutex; int numUsedBytes = 0;
DataSize is the actual size of the data that should, in my case, be copied.
BufferSize is the amount of bytes you want to be at least ahead of of the other thread. This can be anything between 1 and DataSize.I guess reasonable would be, in my case, a Pixel(4Bytes) or a line (bytesPerLine)
The Producer thread has this in it's run function:
for (int i = 0; i < DataSize; ++i) { mutex.lock(); if (numUsedBytes == BufferSize) bufferNotFull.wait(&mutex); mutex.unlock(); buffer[i % BufferSize] = "ACGT"[QRandomGenerator::global()->bounded(4)]; mutex.lock(); ++numUsedBytes; bufferNotEmpty.wakeAll(); mutex.unlock(); } }
For the edge case BufferSize = DataSize this could be simplyfied to
mutex.lock for (int i = 0; i < DataSize; ++i) { buffer[i % BufferSize] = "ACGT"[QRandomGenerator::global()->bounded(4)]; } } mutex.unlock();
The example also requieres that reading and writing happens all the time simultaneously, otherwise Buffer[i] will be overwritten before the other thread could read it.
In my case reading happens every 50ms (20Hz Frequenzy) where as writing happens irregulary, when ever the recursive function finished one iteration.But never the less I could adjust the example for my needs, I think:
char buffer[rows*cols]; QMutex mutex; int numLinesWritten = 0; int numLinesRead = 0; QWaitCondition wairForReader; QWaitCondition waitForWriter; //Producer mutex.lock(); if(numLinesRead == 0){ //Other thread reads currently line 0 wairForReader.wait(&mutex); mutex.unlock(); } mutex.lock(); numLinesWritten = 0; mutex.unlock(); for(numLinesWritten; numLinesWritten < rows; ){ mutex.lock(); if(numLinesWritten > numLinesRead) wairForReader.wait(&mutex); mutex.unlock(); for(uint x(0); x < cols; x++){ buffer[cols * numLinesWritten + x] =m_data[cols * numLinesWritten + x]; } mutex.lock(); ++numlinesWritten; waitForWriter.wakeAll(); mutex.unlock(); }
The Consumer look similar, I don't have the time to do that right now.
This is from the top of my head and may be wrong :-)But seems unnecessary complex with lots of extra additions ( on numlinesWritten & numLinesRead) multiple locks and unlocks and wake calls.
correct me when I'm wrongere here, of course.
Like I said, for my case it could be faster to simply wrap the read and write in a lock, like I did, because reader tries to read the data irregulary every 20 ms.
//Producer ... if(m_Mutex.tryLock(0)){ //If other thread currently reads, do not wait to lock, skip this write cycle memcpy(m_GrainsCopy, m_Grains, arraySize); m_Mutex.unlock(); } //restart while //Consumer m_Mutex.lock(); //if not free wait until it is for(uint i(0); i < arraySize; i++ ){ memcpy(image + (i*4), colors.at(m_GrainsCopy[i]), 4); } m_Mutex.unlock();
-
You have now completely confused me ... :)
As I am an old fashioned guy, I'll work with semaphores here. Consider the following very short snippet that should be more than sufficient for your case:Common data:
const int bufferSize = 20000; // or w/e QByteArray data(bufferSize); // The shared buffer QSemaphore bytesToRead(0), bytesToWrite(bufferSize); // Sync
Producer
DataType dataToWrite; int bytes = sizeof(DataType); bytesToWrite.acquire(bytes); // Write the data to the actual buffer bytesToRead.release(bytes);
Consumer
DataType dataToRead; int bytes = sizeof(DataType); bytesToRead.acquire(bytes); // Read the data from the buffer bytesToWrite.release(bytes);
And that's all - easy peasy. :)
PS.
Are you working on a genetics problem, as this"ACGT"
sounds very much like the four nitrogen bases? -
thanks, I'll take a lok at it :)
@kshegunov said in To Mutex or not to Mutex:
PS.
Are you working on a genetics problem, as this "ACGT" sounds very much like the four nitrogen bases?No, that is directly from the example from Qt's webside
You can find my project here
-
@J.Hilk said in To Mutex or not to Mutex:
You can find my project here
Yep, I glanced at it, looks okay more or less. That's a very odd way of doing threading however. Why not use signals and slots to transfer the data between the worker and the GUI? Or at least use a thread-safe queue with your data to push and pull out of, seems a bit simpler than writing to a buffer and then reading it back?
-
@kshegunov said in To Mutex or not to Mutex:
That's a very odd way of doing threading however. Why not use signals and slots to transfer the data between the worker and the GUI? Or at least use a thread-safe queue with your data to push and pull out of, seems a bit simpler than writing to a buffer and then reading it back?
😳
very simple reason, from my point of view at least.As long as the While loop is running the event loop of the thread is "on halt" therefore no Signals are emitted/received during that time.
Would I want to pass the data to the gui thread via SignalSlot, I would either have to call processEvents or exit the loop and reenter it via a QTimer or QMetaObject::invokeMethod (with Qt::QueuedConnection, as a parameter).
This slowes down the calculation of the next value significantly.QMetaObject::invokeMethod was the method I used at first, and changed it later.
QQueue however may be an option, haven't thought about that one to be honest :)
-
@J.Hilk said in To Mutex or not to Mutex:
As long as the While loop is running the event loop of the thread is "on halt" therefore no Signals are emitted/received during that time.
Emission of signals is not affected by a blocked event loop. Only the execution of the slot connected to the signal if the receiving object's thread's event loop is blocked. :)
-
@kshegunov said in To Mutex or not to Mutex:
@J.Hilk said in To Mutex or not to Mutex:
As long as the While loop is running the event loop of the thread is "on halt" therefore no Signals are emitted/received during that time.
Emission of signals is not affected by a blocked event loop. Only the execution of the slot connected to the signal if the receiving object's thread's event loop is blocked. :)
😱