Thread or process?

dracheschreck

Hello! I know this is a more general question and not-so-much Qt related but the users of this forum are usually very nice so:

I have a lot of data to process, these files are video-like (OpenNI), and I can only read them in real-time. This means that processing one hour of video will take me one hour. (This is because OpenNI is designed for real-time tasks, whereas I have a LOT of already-recorded videos).

So I thought: I should open many videos at the same time and start working on them in parallel.

I have tried implementing this in Qt, using threads. I have tried a simple example: I created a thread that spews out some text 100 times. At the same time, the main thread also attempts to spew out 100 lines of text. I expected the text from both threads (main and child) to interleave, but I always find that one thread dominates the other.

Aren't the threads supposed to execute in an interleaved fashion? Am I using the wrong tool? Should I use processes instead?

Thanks!

dracheschreck

here is my code, to avoid polluting my original post:

hellothread/hellothread.cpp
@
#include "hellothread.h"

void HelloThread::run()
{
    for (int i = 0; i < 100; i++)
    {
        // currentThreadId() is static, so call it on the class
        qDebug() << "hello from worker thread " << QThread::currentThreadId();
    }
}
@

main.cpp
@
#include <QtGui/QApplication>
#include "mainwindow.h"
#include "hellothread.h"

int main(int argc, char *argv[])
{
    QApplication a(argc, argv);
    //MainWindow w;
    //w.show();

    HelloThread thread;
    thread.start();
    for (int i = 0; i < 100; i++)
        qDebug() << "hello from MAIN thread " << QThread::currentThreadId();
    thread.wait();  // do not exit before the thread is completed!
    return 0;

    //return a.exec();
}
@

hellothread.h:
@
#ifndef HELLOTHREAD_H
#define HELLOTHREAD_H

#include <QThread>
#include <QDebug>

class HelloThread : public QThread
{
    Q_OBJECT

private:
    void run();
};

#endif // HELLOTHREAD_H

@

The output is:
100 times "hello from MAIN thread 140737353906048 "
then 100 times : "hello from worker thread 140737102640896"

I would expect the output to be some lines of the first kind and some lines of the second interleaved.

rafaelhamdan

Try adding a small delay between each message (in both HelloThread and the main thread), like 100 ms or so, and see what happens. Then you will be able to understand what actually happens behind the scenes :)

lgeyer

[quote author="dracheschreck" date="1330633680"]
Aren't the threads supposed to execute in an interleaved fashion? Am I using the wrong tool? Should I use processes instead?[/quote]

You have no influence on when and how often the operating system schedules your threads. If one thread finishes the work before the other is even scheduled it appears that they are executed in sequence. If you raise the workload for the threads you will see that they are executed interleaved.

Using threads for your problem seems legit, although subclassing is no longer the recommended way of using QThread. See "You Are Doing It Wrong...":http://labs.qt.nokia.com/2010/06/17/youre-doing-it-wrong/ and "Threads, Events and QObjects":https://qt-project.org/wiki/Threads_Events_QObjects.

miroslav

Using text output to see how threads perform has one significant drawback - it is IO, which influences the scheduler and how the threads execute. So judging how the threads behave from the output is likely misleading. You could try outputting data into a data structure, for example.
Also, when measuring with threads, make sure each thread gets a significant chunk of work (big, heavy :-), for example more than 10,000,000 operations or 50 ms); otherwise the thread switching will dominate the timing behavior, again leading to skewed measurements.
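
To make the "output into a data structure" idea concrete, here is a rough sketch. It is written in standard C++ (std::thread instead of QThread) so that it runs without Qt; the names EventLog, burn, runWorkers and countSwitches are made up for this illustration. Each worker does a chunk of CPU work and then records its id in a shared log, so no I/O happens while the threads run.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Shared log: every slot records which worker claimed that position.
// Slots are handed out by an atomic counter, so no lock and no I/O is needed.
struct EventLog {
    explicit EventLog(std::size_t capacity) : slots(capacity, -1) {}

    void record(int workerId) {
        std::size_t i = next.fetch_add(1);
        if (i < slots.size()) slots[i] = workerId;  // each index is written once
    }

    std::vector<int> slots;
    std::atomic<std::size_t> next{0};
};

// Hypothetical CPU-bound workload standing in for real processing.
unsigned long long burn(unsigned long long iterations) {
    unsigned long long acc = 0;
    for (unsigned long long i = 0; i < iterations; ++i) acc += (i * i) % 7;
    return acc;
}

// Runs `workers` threads; each performs `stepsPerWorker` chunks of work and
// logs its id after every chunk. Returns the order in which chunks finished.
std::vector<int> runWorkers(int workers, int stepsPerWorker,
                            unsigned long long iterationsPerStep) {
    assert(workers > 0 && stepsPerWorker > 0);
    EventLog log(static_cast<std::size_t>(workers) * stepsPerWorker);
    std::atomic<unsigned long long> sink{0};  // keeps the work from being optimized away
    std::vector<std::thread> threads;
    for (int id = 0; id < workers; ++id) {
        threads.emplace_back([&, id] {
            for (int s = 0; s < stepsPerWorker; ++s) {
                sink += burn(iterationsPerStep);
                log.record(id);
            }
        });
    }
    for (auto& t : threads) t.join();  // the log is only read after all threads finish
    return log.slots;
}

// Counts how often two consecutive log entries come from different workers.
int countSwitches(const std::vector<int>& order) {
    int switches = 0;
    for (std::size_t i = 1; i < order.size(); ++i)
        if (order[i] != order[i - 1]) ++switches;
    return switches;
}
```

Calling something like runWorkers(2, 100, 10000000) and passing the result to countSwitches gives a rough idea of how much the two threads interleaved. The exact number depends on the OS scheduler and the machine, so expect it to vary from run to run.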

pierrevr

As miroslav said, try doing some heavy lifting, like Hanoi tower calculation or something... Also, printing something 100 times is not a lot, the displaying probably takes up more time than anything. Dump things into files if you're testing with scheduling times...
Also, if you want to play around with thread scheduling, I can only recommend using a real-time OS, like QNX, or Linux with the Xenomai patch, and messing about with the sched_* functions. There's really cool stuff there, and some behaviour will be very different compared to a desktop OS. This will also allow you to get precise timing information, using the realtime clock.
I suggest you try switching between round robin, fifo & sporadic policies, whilst playing with clock cycles & scheduling intervals. (probably doesn't work so well on non real-time systems though...)

miroslav

I agree that results can be better with more integration with the OS, but this needs to come with a warning: The results may be completely not what is expected, and platform-dependent. If the main goal is video processing throughput, I would recommend sticking to the basics, and parallelizing on a high level, maybe even one video per thread. For starters, make sure all your CPUs are at 100% :-)
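
A minimal sketch of that high-level parallelization, again in standard C++ so it runs without Qt. The function processAll and its parameters are hypothetical, and processOne stands in for whatever handles a single recording (it has to be safe to call from several threads at once). Capping the number of threads at the core count is an assumption of this sketch, not something the thread prescribes.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Calls `processOne` once for every entry in `videos`, using at most
// `maxThreads` worker threads. Each worker repeatedly claims the next
// unprocessed index, so every video is handled by exactly one thread.
// Note: an exception escaping `processOne` would terminate the program.
void processAll(const std::vector<std::string>& videos,
                const std::function<void(const std::string&)>& processOne,
                unsigned maxThreads = std::thread::hardware_concurrency()) {
    assert(static_cast<bool>(processOne));
    if (maxThreads == 0) maxThreads = 1;  // hardware_concurrency() may return 0
    const std::size_t workers =
        std::min<std::size_t>(maxThreads, videos.size());

    std::atomic<std::size_t> next{0};
    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < workers; ++w) {
        threads.emplace_back([&] {
            for (std::size_t i = next.fetch_add(1); i < videos.size();
                 i = next.fetch_add(1)) {
                processOne(videos[i]);
            }
        });
    }
    for (auto& t : threads) t.join();  // wait until every video is done
}
```

With a list of file names and a function that processes one recording, a call such as processAll(files, processRecording) keeps every core busy until the list is empty. Passing videos.size() as maxThreads gives the literal "one video per thread" variant.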

The reason why the results of tweaking priorities and scheduler policies can be surprising is that it is quite hard to fully predict asynchronous behavior and its side effects. For example, giving one thread a higher priority can make the overall execution slower, since it may starve the others of resources.

nish

Hi,
Try waiting for the thread to start. Catch the started() signal of the thread and then do the work in the main thread.

miroslav

@nish What exactly are you trying to achieve with that?

nish

Since his main thread always prints first, it looks like the other thread has not started yet. Maybe it always starts after the main thread is free. So I was suggesting to catch the started() signal in the main thread and start doing the main thread's work there. That way both threads are running.

miroslav

Oh, yes. That does make sense. It is a common mistake to assume that the second thread starts right when start() is called. From a software point of view, it starts a decade later :-)
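
For illustration, here is the same handshake sketched in standard C++ rather than with the started() signal, so it runs without Qt. The helper startAndWaitUntilRunning is a made-up name. The launcher blocks until the new thread reports that it is actually executing, and only then goes on with its own work.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Starts `work` on a new thread and returns only after that thread has
// begun executing. The caller is responsible for joining the returned thread.
template <typename Work>
std::thread startAndWaitUntilRunning(Work work) {
    std::promise<void> started;
    std::future<void> ready = started.get_future();

    // The promise is moved into the worker so it outlives this function.
    std::thread t([started = std::move(started),
                   work = std::move(work)]() mutable {
        started.set_value();  // tell the launcher that this thread is running
        work();
    });
    assert(t.joinable());

    ready.wait();  // block until the worker has really started
    return t;
}
```

The main thread would call startAndWaitUntilRunning(...), do its own loop, and then join() the returned thread. This only guarantees that both threads are alive at the same time; how their output interleaves afterwards is still up to the scheduler.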

dracheschreck

WOW! Tons of good help here! Thanks!

I will try all this stuff today.

dracheschreck

I have a secondary question: Is it the OS that decides which thread gets a slice of processor time?

I thought the OS decided which process gets a slice of time, and the process then decided which thread has priority for this time (quantum).

? This user is from outside of this forum

Yes, it is the OS thread scheduler. You have little say beyond setting the priority of individual threads.

BTW, check the doc notes on the QThread page to see how to use threads properly. Your code follows the doc guidelines, which are ancient and no longer considered best practice.

Also, I recommend sticking to threads. A separate process is rarely needed. Inter-process communication is harder and potentially slower, even more so for resource sharing, because the OS generally isolates the memory of each process from everything else to keep things safe. Really, 99% of the time you will be just fine with threads only.

miroslav

The thread scheduling policies are OS specific. Today, most OSes automatically switch between threads. The answers in this thread are Linux-focused, where scheduler policies and thread priorities are available through system APIs (not Qt). On some setups, manipulating these requires security privileges (google for AEGIS, for example on the N9). There were OSes in the old days (that are sometimes still around, which is why I mention that) where threads needed to "yield" (cooperative multitasking).

There are a few tools in Qt that can make your life a lot easier, like QThreadPool and QRunnable. Implementing your own thread class is rarely needed and rather old school.