Qt 4.8.x QThread Showstopper

FeralUrchin

I developed a multithreaded app, using QThread, in Qt 4.6.2. The app works fine also in Qt 4.7.4. However, in both Qt 4.8.3 and Qt 4.8.4, my app fails due to changes in qthread_unix.cpp [the same problem occurs within qthread_win.cpp]. Specifically, in Qt 4.8.x the emit finished() signal is issued before the running and finished attributes are set. In Qt 4.6.2 and Qt 4.7.4, the running and finished attributes are correctly set before the emit finished() signal is sent.

My app uses several QThreads in an array, with a common slot for handling the finished() signal. Thus, my slot needs to inspect the finished attributes of threads to determine which one raised the signal [the signal doesn't indicate by itself which thread finished].

Here's a snippet from the Qt 4.7.4 qthread_unix.cpp from line 355:

@ d->priority = QThread::InheritPriority;
d->running = false;
d->finished = true;
if (d->terminated)
emit thr->terminated();
d->terminated = false;
emit thr->finished();@

Note that the running and finished attributes are set before emitting finished(). This sequence enables my slot handling the finish() signal reliably to determine which thread signalled finish(). This same (correct) sequence is in qthread_win.cpp also.

On the other hand, in Qt 4.8.4, the snippet from qthread_unix.cpp reads (from line 359):

@ emit thr->finished();
QCoreApplication::sendPostedEvents(0, QEvent::DeferredDelete);
QThreadStorageData::finish((void **)data);
locker.relock();
d->terminated = false;
QAbstractEventDispatcher *eventDispatcher = d->data->eventDispatcher;
if (eventDispatcher) {
d->data->eventDispatcher = 0;
locker.unlock();
eventDispatcher->closingDown();
delete eventDispatcher;
locker.relock();
}

d->thread_id = 0;
d->running = false;
d->finished = true;@

Note that the running and finished attributes are set AFTER the finished() signal is sent. This prevents my slot handling the finished() signal from reliably determining, by examining the finished attribute of all threads, which thread just finished. Again, qthread_win.cpp exhibits the same issue.

This issue is also present in Qt 4.8.x MinGW.

andre

Why don't you just use sender() in your slot? Or better yet, a QSignalMapper?

koahnig

welcome to devnet

Did you check on "JIRA":https://bugreports.qt-project.org/secure/Dashboard.jspa for a bug report on this?
There is a "bug report QTBUG-19783":https://bugreports.qt-project.org/browse/QTBUG-19783 there. However, this has been issued for version 4.7.3. That is before the version you verified to work. Than it has been closed and linked to previous very old bug report.

Probably it is best you check these bug reports (I have used a Query "qthread AND finish"). It looks to me that you might be able to add significant information there.

koahnig

@Andre You have actually issued the bug report 19783. Is this something completely different?

andre

Different issue indeed. This issue was about a possible race condition I believe.

FeralUrchin

Here is corrective code for QThread 4.8.4 qthread_unix.cpp. It reorders setting running and finished attributes with emitting finished(), and keeps the thread lock during the entire function. These changes mirror the corresponding actions in Qt 4.7.4 qthread_unix.cpp, and allow my app to run normally.

@void QThreadPrivate::finish(void *arg)
{
QThread *thr = reinterpret_cast<QThread *>(arg);
QThreadPrivate *d = thr->d_func();

QMutexLocker locker(&d->mutex);

d->isInFinish = true;
d->priority = QThread::InheritPriority;
bool terminated = d->terminated;
void *data = &d->data->tls;
if (terminated)
    emit thr->terminated();
d->running = false;
d->finished = true;
emit thr->finished();
QCoreApplication::sendPostedEvents(0, QEvent::DeferredDelete);
QThreadStorageData::finish((void **)data);

QAbstractEventDispatcher *eventDispatcher = d->data->eventDispatcher;
if (eventDispatcher) {
    d->data->eventDispatcher = 0;
    eventDispatcher->closingDown();
    delete eventDispatcher;
}

d->thread_id = 0;

d->isInFinish = false;
d->thread_done.wakeAll();

locker.unlock();

}@

tobias.hunger

I am pretty sure this was changed for a reason, so I suggest checking the commit message and bug reports associated with this change before reverting it.

The simplest way to find out is to clone the repository and do a "git blame <filename>". It will then list the last person and change-sha for each and every line before the actual line. Then look up the relevant sha using "git show <sha>".

FeralUrchin

Toblas, I don't doubt it was changed for a reason. One reason, evidently, was to introduce QMutexLocker to replace direct locking and unlocking of d->mutex (in 4.7.4). If you examine the 4.8.4 code carefully, I believe you will agree that it prematurely unlocks the thread and permits the finished() slot to run--which has a perfect right immediately to delete the thread, which will still be running in the bottom half of the code. In my testing of preliminary fixes, I actually experienced this bug. "locker" needs to remain locked until the finished() slot can safely delete the thread.

Below is a slight revision of proposed corrective code:

@void QThreadPrivate::finish(void *arg)
{
QThread *thr = reinterpret_cast<QThread *>(arg);
QThreadPrivate *d = thr->d_func();

QMutexLocker locker(&d->mutex);

d->isInFinish = true;
d->priority = QThread::InheritPriority;
bool terminated = d->terminated;
void *data = &d->data->tls;
if (terminated)
    emit thr->terminated();
d->running = false;
d->finished = true;
emit thr->finished();
QCoreApplication::sendPostedEvents(0, QEvent::DeferredDelete);
QThreadStorageData::finish((void **)data);

QAbstractEventDispatcher *eventDispatcher = d->data->eventDispatcher;
if (eventDispatcher) {
    d->data->eventDispatcher = 0;
    eventDispatcher->closingDown();
    delete eventDispatcher;
}

d->isInFinish = false;

d->thread_done.wakeAll();

d->thread_id = 0;

locker.unlock();
}@

This code has been tested on an 8 core cpu running at 3.6 GHz, in an app running over 1,000 processes asynchronously served by a pool of 8 QThreads. Each process loads a QImage and paints a thumbnail.

In summary, while I can't guarantee that my code has no adverse side effects, I am convinced--by my own testing and examination of 4.7.4 versus 4.8.4 code in qthread_xxx.cpp--that the 4.8.4 bug is real.

I hope I've invested enough effort in this matter in order to leave it to someone in the Qt project to review.

FeralUrchin

To Andre:

I believe that whatever means is used to identify the finished() thread, by the time the user finished() slot is run, the code in the slot should be able reliably to query thread attributes, and safely to delete the thread. Both these criteria were met in the 4.7.4 qthread_xxx.cpp, but neither is met in 4.8.4--as is evident by comparing the code in the two versions.

JKSH

For now, a possible work-around is to connect QThread::finished() using Qt::QueuedConnection. That will ensure that your slot is invoked AFTER control returns to the event loop, i.e. after the relevant attributes have been set

FeralUrchin

Greetings, JKSH, and thanks for the suggestion. I may give your suggestion a try. Since one of the first things my slot does is to delete the signalling thread, will your workaround prevent my deleting a still-running thread? I'm concerned that it may not because locker.unlock() seems to be called prematurely.

An alternative workaround might be to wait in the slot until finished is true and isinfinish is false. But I'd prefer to press for a fix in 4.8.x.

My project, CIM, is posted, with source code on sourceforge.net. For now I'm building 4.8.4 with the code above. I plan to include the modified qthread_xxx.cpp files in the distribution, and am considering providing the whole Qt source as qt-everywhere-opensource-src-4.8.4.a.tar.gz so that if others have no choice other than to build with 4.8.x (Mint comes with 4.8.3), they can rebuild Qt before building my app. My current recommendation to users is that they build with 4.7.4.

JKSH

The QThread object lives in the main thread (NOT in the thread that it controls!), so QThreadPrivate::finish() runs in the main thread. It sounds like your slots run in the main thread too, so locker.unlock() by itself can't cause the slots to start running. (The slots might run due to a signal emission via Qt::DirectConnection, but that's a different matter, as described below)

In your case, the default connection type is Qt::DirectConnection. The connected slot will be invoked immediately when QThread::finished() is emitted, i.e. in the middle of QThreadPrivate::finish().

However, if you specify a Qt::QueuedConnection instead, then the connected slot will be invoked after control returns to the event loop, i.e. after the function-that-emits-QThread::finished() returns, i.e. after QThreadPrivate::finish() returns.

So, as long as QThreadPrivate::finish() is guaranteed to be called after QThread::run() returns (which I believe is the case but I haven't checked), then using a queued connection guarantees this order (without modifying QThread):

QThread::run() returns

QThreadPrivate::finish() starts running

QThread::finished() is emitted

d->finished is set

QThreadPrivate::finish() returns

Slot is invoked

Hope this helps. (I do agree that QThread should be fixed though, so that the default connection doesn't cause problems like this)

P.S. You said "one of the first things my slot does is to delete the signalling thread" -- note that a QThread is NOT a thread -- it is a thread manager.

andre

[quote author="FeralUrchin" date="1363472666"]To Andre:

I believe that whatever means is used to identify the finished() thread, by the time the user finished() slot is run, the code in the slot should be able reliably to query thread attributes, and safely to delete the thread. Both these criteria were met in the 4.7.4 qthread_xxx.cpp, but neither is met in 4.8.4--as is evident by comparing the code in the two versions.[/quote]

I think that your assertion that you should be able to safely delete the QThread object from its finished signal is simply not true. Finished tells you that the thread managed by QThread is done processing. It does not say, I think, that QThread itself won't do anything else.

Note that I do think that at the moment an object signals a state change, this state should be consistent with what you get querying the same object for that state. So, first change the flag, then signal the change.

For your case, I'd just connect the QThreads finished signal to QThreads deleteLater slot and be done with it.

FeralUrchin

To JKSH,
Thanks for the further analysis. I don't have the background to become involved in a technical discussion of QThread. I simply observe that my app worked in Qt 4.6.2 and Qt 4.7.4, and that it fails in Qt 4.8.3 and Qt 4.8.4 due to changes in qthread_unix.cpp.

If I modify qthread_unix.cpp to set attributes running and finished prior to its emitting finished(), my app is able to detect the finished thread; however, when my finished() slot deletes the thread it causes a crash because the thread is still running.

If, in addition to making the above change, I keep locker locked (without any unlocking or relocking) until the end of QThreadPrivate::finished(), then my app works normally as it did in 4.6.2 and 4.7.4.

I think any technical analysis of threading in Qt needs to be consistent with these observable behaviors.

I do appreciate your support for fixing QThread. Thank you again.

FeralUrchin

To Andre,
Thanks for your continued interest and support. Please see my response to JKSH above. I believe my code (although I acknowlege there were alternative ways to write it) is reasonable and consistent with Qt documentation in versions 4.6.2 and 4.7.4 as well as in Qt 4.8.4.

QThread behavior changes beginning somewhere in 4.8.x, in a way that caused my app to begin failing. Whoever made the changes in qthread_unix.cpp (and in qthread_win.cpp) did so without regard to the possible effects on exiting apps, and seems not to have modified the Qt documentation to reflect these changes.

Here is the documentation for the Qt 4.8.4 QThread::finished() signal: "This signal is emitted when the thread has finished executing." I claim that it should be possible safely to delete a thread when it "has finished executing".

Qt users need straightforward means to manage their threads. In my mind, it's clear that qthread_unix.cpp and qthread_win.cpp (Qt 4.8.x MinGW is also affected per my testing) need to be fixed.

FeralUrchin

I see that Oliver Goffart and Thiago Macieira have discovered that this bug was fixed in Qt 5.x, and the fix will be backported to Qt 4.8.x. Many, many thanks to each.

I'm wondering if this issue in Qt 4.8.x was only noticed now because I'm running an 8 core CPU at 3.6 GHz in a heavily multithreaded app.

If I can be of any service in testing the revised Qt 4.8.x code, I would be delighted to do so. Anyone could just email me the new qthread_unix.cpp QThreadPrivate::finished() code since it is so brief.

Many thanks to all who took the time to look into this issue, including Andre, JKSH, and especially Oliver and Thiago.

Jim

Discover and share your #QtStories

Upcoming Forum Update April 22nd