Weird crashes in multi-threaded application



  • Hello everyone,

    A few months ago, I posted a topic about crashes inside my Qt application (http://forum.qt.io/topic/54980/random-crashes-after-switching-to-multi-thread).
    Since then, things have improved, but one weird crash remains. Recently, I managed to build Qt on Windows with debug symbols for the release libraries.
    I deployed a "test" version of my application to users who experienced those crashes and received some dumps back.

    Since I got debug symbols for Qt too, I was able to get a full stack trace for those crashes and... well, they're weird.
    Apparently, it crashes inside the QSlotObject class, when the impl function returns.
    The weird thing is that the stack trace contains only Qt classes; the crash isn't occurring directly in my code.

    Here are two screenshots showing the stack trace and the source code where the crash occurs.
    https://i.imgur.com/d3lAzY9.png, https://i.imgur.com/LrGI719.png

    Some more information:

    • The crash doesn't occur in the main thread; it occurs in a thread I created.
    • The crash occurs after a slot is called in response to a signal.
    • The class emitting the signal and the class containing the slot are both in the same thread.
    • According to the stack trace, the slot itself doesn't seem to be the cause of the crash. It apparently crashes when the impl function returns, after the slot function has been called.
    • Those crashes seem to be related to multi-threading: I added a "single-thread mode" to the application, and they don't occur when running in this mode.
    • I wasn't able to reproduce the crash. Only some users experience it, not all of them.
    • This crash seems to happen only on Windows (no bug reports on OS X or Linux).
    • According to the users, the crash happens roughly within the first 30 minutes after the work begins (that is, after the thread is created). If no crash occurs during that time, they're safe and the application can run day and night without issues.

    So really, I don't know how to interpret that stack trace. I don't know why it crashes there, or why the only classes in the stack trace are Qt classes.
    Does anybody have any clue?

    If you need more info, feel free to ask.
    Thanks in advance.


  • Moderators

    @Moonlight-Angel said:

    I wasn't able to reproduce the crash, only some users experience it, not all of them.

    Sounds like you have a race condition.

    Try using helgrind to scan for threading errors in your code: http://www.kdab.com/helgrind-howto/
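
    For reference, here is a minimal sketch of the kind of shared access helgrind is designed to flag. It is plain C++ rather than Qt, and the `Counter` struct and `run_counter` function are hypothetical names made up for illustration; they are not taken from your application. With the lock in place the increments are ordered; without it, the same loop is a data race.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state, used only for illustration.
struct Counter {
    long value = 0;
    std::mutex lock;
};

// Starts `threads` workers that each increment the shared counter
// `iterations` times, then returns the final value.
// If the lock_guard line were removed, the unsynchronized
// read-modify-write on `value` would be a data race.
long run_counter(int threads, int iterations) {
    assert(threads >= 0 && iterations >= 0);
    Counter counter;
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(threads));
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> guard(counter.lock);
                ++counter.value;
            }
        });
    }
    // Join every worker before reading the result or destroying `counter`.
    for (auto& w : workers) {
        w.join();
    }
    return counter.value;
}
```

    Calling `run_counter(4, 10000)` from a small `main` and running the binary with `valgrind --tool=helgrind` should stay quiet for this version; the point is only to show what a correctly synchronized access looks like next to the pattern helgrind complains about.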



  • Thanks for your help.
    I tried using helgrind (I only followed the Suppressions section and the alias for helgrind in the link you provided as I'm using Qt 5.5) and started my application with helgrind.
    Since I knew from the dump which slot was called, I repeated the actions that cause that slot to be called (with helgrind attached).

    There were a whole lot of errors (I assume this is normal and some may be false positives), but unfortunately there was absolutely no error related to that slot call.
    There were some errors at the creation and deletion of the QThread, and no errors were shown for all the time in between.

    Maybe I'm interpreting helgrind's output wrong?


  • Moderators

    @Moonlight-Angel said:

    There were some errors at the creation and deletion of the QThread

    Are you able to fix these errors?
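
    In case it helps to compare against your own teardown code, here is a rough sketch of an orderly stop-then-join sequence. It uses plain `std::thread` as a stand-in and the `Worker` class is hypothetical; in Qt the rough counterparts would be `QThread::quit()` followed by `QThread::wait()` before the thread object is destroyed. I am assuming, without having seen your code, that the errors come from the thread still running while objects it uses are being torn down.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker, a plain C++ stand-in for work done in a QThread.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() { stop(); }

    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Ask the loop to finish, then wait until the thread has really exited.
    // Safe to call more than once.
    void stop() {
        stop_requested_.store(true);
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    long ticks() const { return ticks_.load(); }

private:
    void run() {
        while (!stop_requested_.load()) {
            ticks_.fetch_add(1);
            std::this_thread::yield();
        }
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<long> ticks_{0};
    // Declared last so the members above are initialized before the thread starts.
    std::thread thread_;
};
```

    The design choice here is that nothing the thread touches is destroyed until `join()` has returned, so the worker can never run against a half-destroyed object.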



  • Here's an example of the errors I got.
    http://hastebin.com/qalutirici.coffee

    This looks like a false positive, or something that isn't in my code (maybe I'm wrong).
    In every error I got, the "at" line points to a function that isn't part of my code.


