Simple threading issue, need help!
-
I have spent the last 48 hours trying to get ANY multithreaded program working. I have scoured every resource and tried every method I can find (subclassing QThread, Worker/QThread, QtConcurrent). I've copy-pasted code directly from tutorials into new projects, and nothing works. Either it doesn't work at all, or QThread::isFinished() is always true and the program crashes on exit (even though the thread is connected to the Worker's "finish" signal, which is emitted upon completion), or it runs completely serially.
Using QElapsedTimer I was watching the time to process threads. Processing a task on an array X wide took 180 ms without threads, 210 ms with the array divided equally between 2 threads, and 90 ms with just half the width on one thread. These are the results I am getting consistently, even though QThread::idealThreadCount() returns 12 and I am on a beast of a computer.
I followed VoidRealms' example to the letter here:
https://www.youtube.com/watch?v=fd_SpePyWyI&index=125&list=PL2D1942A4688E9D63
and while his output was nearly instant, my result was struggling to chug along and took a while to finish.
I don't know what else to do, so I turn to you guys. If you can't help me get it working I'm going to stop trying to thread. I'm convinced either Qt Creator is broken, my PC is somehow limiting it, or god just hates me.
Could someone give me some test code that definitely works and shows a serial process vs a parallel process performance so I can verify my system is broken?
All I want is to split a QByteArray into chunks and convert it into an integer array to reduce a 500 ms process into sub 100ms, and I've sunk way too much time into this. Please help.
-
I'll add some code so you can see what I'm doing.
Here is the code, which is the same as VoidRealms'. It is so slow in comparison, while running on an i7-3930K @ 3.20GHz.
Here is a .gif showing its slow operation: http://i.imgur.com/lgt3cPZ.gifv
Dialog.h
#ifndef DIALOG_H
#define DIALOG_H

#include <QDialog>
#include <QDebug>
#include <QtConcurrent>

namespace Ui {
class Dialog;
}

class Dialog : public QDialog
{
    Q_OBJECT

public:
    explicit Dialog(QWidget *parent = 0);
    ~Dialog();

    static int getNumber(int &baseNumber);

private slots:
    void on_pushButton_clicked();

private:
    Ui::Dialog *ui;
};

#endif // DIALOG_H
Dialog.cpp
#include "dialog.h"
#include "ui_dialog.h"

Dialog::Dialog(QWidget *parent) :
    QDialog(parent),
    ui(new Ui::Dialog)
{
    ui->setupUi(this);
    qDebug() << QThread::idealThreadCount();
}

Dialog::~Dialog()
{
    delete ui;
}

int Dialog::getNumber(int &baseNumber)
{
    int high = 10000;
    int low = 1;
    int random = qrand() % ((high + 1) - low) + low;
    qDebug() << "Randomizing " << baseNumber << " = " << random;
    baseNumber = random;
    return 0;
}

void Dialog::on_pushButton_clicked()
{
    QList<int> list;

    // Add 0 to 99 to list
    for (int i = 0; i < 100; i++)
    {
        list.append(i);
    }

    // Block until all have completed
    QtConcurrent::blockingMap(list, &Dialog::getNumber);

    ui->listWidget->clear();

    // Update the UI
    for (int i = 0; i < list.count(); i++)
    {
        ui->listWidget->addItem(QString::number(i) + " = " + QString::number(list.at(i)));
    }
}
Main
#include "dialog.h"
#include <QApplication>

int main(int argc, char *argv[])
{
    QApplication a(argc, argv);
    Dialog w;
    w.show();
    return a.exec();
}
-
@bdmontz Here I can see that you print out the values: http://i.imgur.com/lgt3cPZ.gifv
You should measure the execution time without printing the values, because printing is an expensive operation.
And you should use a much bigger list; 100 elements is not really worth multi-threading.
-
Hi,
To add to @jsulm: if you want to get the most performance out of your code, you should also pay attention to which containers you use, how you access them, and how you construct your data. For example, use QVector or std::vector rather than QList, use const iterators to go through your container, optimize your string-building code, etc.
-
@bdmontz
I dropped your dialog.cpp into the VoidRealms code that I had downloaded from his website. It just zipped along. I have no timing results, but there was no discernible delay. Strange indeed.
Your timing results reminded me of a YouTube video by Scott Meyers on CPU caches and false sharing: https://www.youtube.com/watch?v=WDIkqP4JbkE It's over an hour long, but I think it may answer your question about scalability.
Mike
-
@mjsurette said in Simple threading issue, need help!:
Your timing results reminded me of a YouTube video by Scott Meyers on CPU caches and false sharing.
This is not even remotely related. Consider the function that's executed by each thread:
int Dialog::getNumber(int &baseNumber)
{
    // These 3 lines expand to about 5 asm instructions after the compiler optimizes out the constants (they're smart that way).
    int high = 10000;
    int low = 1;
    int random = qrand() % ((high + 1) - low) + low;

    // This is an IO operation
    qDebug() << "Randomizing " << baseNumber << " = " << random;

    baseNumber = random; // This is a single mov instruction at best
    return 0;
}
IO operations are inherently serial; you can't thread them, which means that however many threads you start, their writes to the file will be serialized (I'm talking low-level here, not even under your control). To pile on, not only will the scheduler have to switch contexts, giving each thread a time slot to write to the file, but IO operations are also slow. By slow I mean terribly slow compared to the function call and the addition instructions the compiler will generate for the first part of the function.
Think of it this way: He's measuring how much the ocean level drops when you fill a cup of water from it ...