QThreadPool - wait for (at least one) free thread?



  • Hi.

    I'm using a QThreadPool to run a huge number of tasks. There are too many tasks to create them all right away, so I use QThreadPool::tryStart() to start as many tasks as there are free threads. But what then? If tryStart() returns false, the current task could not be started because there are no free threads in the pool. Okay. Now the big question: how do I wait for a free thread? Sure, I could call QThreadPool::waitForDone(), but that waits until all running threads have finished; I only need one free thread to kick off the next task. In the end I came up with something like:

    @QList<Foo> data; //<- There are a whole lot of items in here!
    QThreadPool *pool = new QThreadPool;

    while(!data.isEmpty())
    {
        MyTask *task = new MyTask(data.takeFirst());

        while(!pool->tryStart(task))
        {
            QThread::msleep(250); //No more free threads, wait a bit and retry!
        }
    }@

    Is there a "better" way, compared to this "active waiting" approach?

    Actually I would need a QThreadPool::waitForOne() or QThreadPool::waitForThreads(1) here!

    Can this be emulated?



  • I usually do this by having my runnable report when it's done, and thus a thread is now free. So the runnable has a "finished" signal that I connect to a "startNextRunnable" slot which adds a new runnable to the threadpool.
    Note that this solution only works well when not using the global QThreadPool instance, because for it to work seamlessly, every object in the thread pool must report when it's done. That is not guaranteed for the global pool, which might leave one or more threads idling until the next reporting runnable finishes and the pool is filled with your runnables again. If you still want to use the global pool, try adding twice as many runnables as the maximum number of threads at the beginning, so there is some "buffering" or "pool-internal queuing" in case a non-reporting runnable finishes, and no threads will idle.



    I came up with a similar solution to my problem, though I don't use a signal, as the "outer" thread doesn't run an event loop. Instead I use a QWaitCondition object. If tryStart() fails, the "outer" thread calls QWaitCondition::wait(). At the same time, each of my QRunnables calls QWaitCondition::wakeAll() on the same QWaitCondition object in its destructor. I just need to give each QRunnable a pointer to my QWaitCondition.

    @QList<Foo> data; //<- There are a whole lot of items in here!
    QThreadPool *pool = new QThreadPool;
    QWaitCondition *waitCond = new QWaitCondition;
    QMutex *mutex = new QMutex;

    while(!data.isEmpty())
    {
        MyTask *task = new MyTask(data.takeFirst(), waitCond);

        while(!pool->tryStart(task))
        {
            mutex->lock();
            waitCond->wait(mutex, 5000); //No more free threads, wait until a thread has finished
            mutex->unlock();
        }
    }@

    @MyTask::~MyTask(void)
    {
        m_waitCond->wakeAll();
    }@

    (There still is one race condition, at least in theory: if tryStart() fails because there is no "free" thread, but then all the QRunnables finish all of a sudden before wait() is entered, there will never be a wakeAll() to wake the "outer" thread. I think the chance of this happening is pretty much zero. Nonetheless I use a timeout value in wait() for correctness: if the timeout triggers and there still is no "free" thread, tryStart() simply fails and we wait again. The timeout value can now be much bigger than in the "active waiting" approach, though. So far this solution seems to work okay, but a QThreadPool::waitForOne() would be nice!)



    I think this approach is not a good one. It adds a lot of overhead to the queueing, ends up with threads idling while waiting for new work, and will not achieve the best performance.

    What you really want is to make sure that the thread pool queue holds only up to, say, 200 (n) jobs. You can use a semaphore for that. Start with n resources. In a loop, call acquire() on the semaphore, and add a job to the pool. From the runnables, call release() on the semaphore when they are about to exit and be deleted. acquire() blocks if no more resources are available in the semaphore. Just make sure to exit the queueing loop once all jobs have been queued.

    200 is an arbitrary number; you will have to pick one according to your app's situation. It should be several times bigger than the number of threads to make sure they never idle.
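    The scheme above can be sketched in standard C++ with a small hand-rolled counting semaphore (QSemaphore, or std::counting_semaphore from C++20 on, would serve the same role). queue_jobs, maxInFlight, and the `done` counter are illustrative names, not part of any Qt API:

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hand-rolled counting semaphore, kept minimal for illustration.
class Semaphore {
public:
    explicit Semaphore(int count) : count_(count) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cond_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    int count_;
};

// The queueing loop from the post: never more than maxInFlight jobs at
// once. `done` just counts finished jobs so the effect is observable.
void queue_jobs(int totalJobs, int maxInFlight, std::atomic<int> &done) {
    Semaphore sem(maxInFlight);
    std::vector<std::thread> jobs;
    for (int i = 0; i < totalJobs; ++i) {
        sem.acquire();                  // blocks while maxInFlight jobs run
        jobs.emplace_back([&sem, &done] {
            ++done;                     // ...the actual work would go here...
            sem.release();              // about to exit: free a slot
        });
    }
    for (auto &j : jobs) j.join();      // "exit the loop when all are queued"
}
```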



  • [quote author="miroslav" date="1337785032"]I think this approach is not a good one. It adds a lot of overhead to the queueing, will end up in threads idling to get new work, and not achieve the best performance.[/quote]

    I don't quite get this. I think my code will start as many tasks as there are threads in the pool. The "main" thread won't block, unless all threads in the pool have been used up. Then, as soon as one of the tasks finishes (and thus we have a free thread again) we will immediately wake up the main thread to launch the next task.

    Okay, depending on how the OS schedules the threads, there may be a short delay between finishing one task and kicking off the next one. Is that what you mean? Well, during this short moment one of the threads may be "idle" for a few milliseconds. But it's only one of the threads from the pool (the other threads are still busy with the tasks not finished yet). And I think the delay is negligible in my case, as my tasks run for several seconds.

    I see that keeping a larger number of tasks in the pool's queue might make sense for extremely "short-lived" tasks. For my app, creating ~200 task objects would probably consume an unjustified amount of memory...



  • Of course I don't know much about the actual application, so my advice might be a bit off. The number 200 was of course just a suggestion.

    More generally, what sounds strange to me is that adding tasks only when a thread becomes available kind of duplicates what the thread pool does anyway - the threads take on any queued job as soon as one of them goes idle. That is why I think making sure there are always a few jobs in the queue, but not too many, should already do the trick.

    Maybe you can summarize your experience once everything is working, this is quite interesting.



    Yup, if I could create all task objects at once and then add them all to the queue, that would be the easiest solution. But there are two reasons why I don't want to do it like that: first, the number of tasks can be quite large, so I want to create the task objects "on demand". Second, the results of some tasks may spawn additional tasks, so "main" has to collect the result when a task finishes and, in some cases, append new tasks. That's why I was looking for a QThreadPool::waitForOne() or QThreadPool::waitForN(x) method. I think "my" solution does emulate a waitForOne() and it works for my application. I haven't implemented the "semaphore" solution yet. I guess something like "N = 2 * maxThreadCount()" would be sufficient, but I doubt I would see a noteworthy speed-up in my application, because tasks run for 1-2 seconds and the overhead should be negligible. It doesn't need to be the "optimal" solution for everybody. Still, I think QThreadPool::waitForOne() could be useful...

