Solved Segfault when calling QWidget::show (on Debian 9)
-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
Some code would be good...
Sorry for not posting code here, @Christian-Ehrlicher, but my question is related to code I work on professionally, and as far as I know I am not allowed to share any code.
-
@Bart_Vandewoestyne said in Segfault when calling QWidget::show (on Debian 9):
allowed to share any code.
Then good luck. We can't guess your code...
Apart from this you already shared code.
-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
I would guess this is either a nullptr or not initialized. Build your app with debug information, go to stack frame 7 and print out the value of mpMainWindow.
I've added the -g option to our release build and when I run the application in gdb it now segfaults with the following call stack:

    user@debianvbox:~/SVN/PolarisRel/Apps$ gdb ./PolarisSlave
    GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from ./PolarisSlave...done.
    (gdb) r
    Starting program: /home/user/SVN/PolarisRel/Apps/PolarisSlave
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Program received signal SIGSEGV, Segmentation fault.
    strlen () at ../sysdeps/x86_64/strlen.S:106
    106 ../sysdeps/x86_64/strlen.S: No such file or directory.
    (gdb) bt
    #0  strlen () at ../sysdeps/x86_64/strlen.S:106
    #1  0x00007ffff3e101ed in XSetCommand () from /usr/lib/x86_64-linux-gnu/libX11.so.6
    #2  0x00007ffff3e147f0 in XSetWMProperties () from /usr/lib/x86_64-linux-gnu/libX11.so.6
    #3  0x00007ffff659007d in QWidgetPrivate::create_sys(unsigned long, bool, bool) () from /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4
    #4  0x00007ffff6548769 in QWidget::create(unsigned long, bool, bool) () from /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4
    #5  0x00007ffff6550697 in QWidget::setVisible(bool) () from /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4
    #6  0x000055555594a1fd in QWidget::show (this=<optimized out>) at ../../ThirdParty/Qt/qt-install/include/QtGui/qwidget.h:497
    #7  BSPPolarisSlave::mfRun (this=0x7fffffffe0e0, argc=<optimized out>, argv=<optimized out>, errormsg=...) at BSPPolarisSlave.cpp:443
    #8  0x0000555555c010da in ICService::mfExec(int, char**, QString&, bool) ()
    #9  0x0000555555bfd90e in ICService::mfParseArguments(int, char**, bool) ()
    #10 0x000055555594dbb3 in BSPPolarisSlave::mfParseArguments (this=0x7fffffffe0e0, argc=2, argv=0x7fffffffe258) at BSPPolarisSlave.cpp:659
    #11 0x000055555592a3ad in main (argc=1, argv=0x7fffffffe258) at BSPPolarisSlaveMain.cpp:71
    (gdb) f 7
    #7  BSPPolarisSlave::mfRun (this=0x7fffffffe0e0, argc=<optimized out>, argv=<optimized out>, errormsg=...) at BSPPolarisSlave.cpp:443
    443 mpMainWindow->show();
    (gdb) p mpMainWindow
    $1 = (BSPPolarisSlaveMainWindow *) 0x555556627c40

Some things I noticed are:
- mpMainWindow is not nullptr.
- In the call to BSPPolarisSlave::mfRun, argc and argv are marked as 'optimized out'... and similarly, in the call to QWidget::show, the this parameter is also 'optimized out'. I don't have much experience with gdb (most of the time, I debug in the Visual Studio debugger)... but could this 'optimizing out' be the problem?
-
@Bart_Vandewoestyne said in Segfault when calling QWidget::show (on Debian 9):
mpMainWindow is not nullptr
are you sure it is initialised then? gdb, in contrast to its MSVC equivalent, does no null initialisations during debug runs. So an uninitialised pointer is very rarely a nullptr.
-
@J-Hilk said in Segfault when calling QWidget::show (on Debian 9):
are you sure it is initialised then? gdb, in contrast to its MSVC equivalent, does no null initialisations during debug runs. So an uninitialised pointer is very rarely a nullptr.
As far as I can see, yes, because right before the call to show(), the pointer is initialized:

    mpMainWindow = new BSPPolarisSlaveMainWindow(this, windowsCaption, 0, true,
        Qt::Window | Qt::WindowTitleHint | Qt::WindowSystemMenuHint);
    connect(mpApplication, SIGNAL(lastWindowClosed()), mpApplication, SLOT(quit()));
    mpMainWindow->show();

-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
Then good luck. We can't guess your code...
Apart from this you already shared code.
In the past on this forum, I've had good answers leading to a solution even without sharing code. I do believe that's possible :-)
And you are right: I have shared some code snippets. That is indeed not consistent with what I wrote, but I am somehow assuming that I am allowed to share small, non meaningful snippets of code that do not reveal any company secrets, if that can help us get to a solution quicker. I hope no one in our company will blame me for that... Finding the right balance between what you can share in order to get to a solution quicker is not always easy, but I try to find that balance.
-
Some more info on this problem:
- It is only a release build on Debian 9 that segfaults. As mentioned earlier, the debug build on Debian 9 runs fine.
- Release builds and debug builds on Debian 8 and Red Hat Enterprise Linux 8.5 run fine!
-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
wrt your strange copy stuff (whyever you need to modify your command line - sounds like a strange hack for me):

    std::vector<char*> newArgs;
    newArgs.push_back(argv[0]);
    newArgs.push_back(const_cast<char*>("-e"));
    for (int i = 1; i < argc; ++i)
        newArgs.push_back(argv[i]);
    argc += 1;
    ...mfParseArguments(argc, newArgs.data());

I totally agree that that modification of the command line is strange. Note that this was not my idea, but I inherited this legacy code from my predecessors :-(
I tried your suggestion using std::vector instead of using an array of char*, but that also didn't solve the segfault.
-
@Bart_Vandewoestyne
Just so you know. Your segfault emanates from this line: https://code.woboq.org/kde/qt4/src/gui/kernel/qwidget_x11.cpp.html#804

    XSetWMProperties(dpy, id, 0, 0, qApp->d_func()->argv, qApp->d_func()->argc, &size_hints, &wm_hints, &class_hint);

(Doubtless some sort of X set window manager properties on start up?) It's on a strlen() from there, so presumably some element in qApp->d_func()->argv is wrong. So you're still on the argv issue. Try to print out everything in the last argv you pass on.
-
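The suggestion above, printing everything in the last argv that is passed on, could look roughly like the sketch below. describeArgs and dumpArgs are hypothetical names, not part of the code discussed in this thread, and the sketch assumes it is called with the same argc/argv pair that is handed to the QApplication constructor. It only reads argv[0] to argv[argc - 1], because argv[argc] exists only if the caller allocated room for it.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats argv[0] .. argv[argc - 1], one line per entry.
// It deliberately does not read argv[argc], because that slot only exists
// if the caller allocated argc + 1 pointers.
std::vector<std::string> describeArgs(int argc, char** argv)
{
    assert(argv != nullptr);
    std::vector<std::string> lines;
    for (int i = 0; i < argc; ++i) {
        std::string line = "argv[" + std::to_string(i) + "] = ";
        line += (argv[i] ? "\"" + std::string(argv[i]) + "\"" : std::string("(null)"));
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: prints the lines to stderr, so they still show up
// when stdout is redirected.
void dumpArgs(int argc, char** argv)
{
    for (const std::string& line : describeArgs(argc, argv))
        std::fprintf(stderr, "%s\n", line.c_str());
}
```

Calling dumpArgs(argc, argv) right before the arguments are handed on would show whether every entry is a valid string. Checking argv[argc] itself is better done in gdb or under valgrind, since reading it is only safe when the array really has that extra slot.
-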
@Bart_Vandewoestyne new debian, huh. New/updated compiler then as well?
I assume you have tried the release build with -O0?
-
@J-Hilk or run it with valgrind (compile with -O2 and -g)
-
@Christian-Ehrlicher never used valgrind before, as I usually don't do linux stuff. But I trust your expertise :D
Oh, it also supports macOS now, maybe I should give it a try sometime soon then!
-
@J-Hilk said in Segfault when calling QWidget::show (on Debian 9):
@Bart_Vandewoestyne new debian, huh. New/updated compiler then as well?
Yes, due to the switch from Debian 8 to Debian 9, a new compiler as well. Debian 8 (where everything works) has

    dev@debian8:~$ g++ --version | head -1
    g++ (Debian 4.9.2-10+deb8u2) 4.9.2

while Debian 9 (where the release build segfaults) has

    user@debianvbox:~$ g++ --version | head -1
    g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

I assume you have tried the release build with -O0?
I hadn't, but now I have ;-) And I have interesting news: when using -O0 the segfault is gone! From -O1 and further, we get the segfault.
-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
@J-Hilk or run it with valgrind (compile with -O2 and -g)
I have no experience with valgrind, but looks like a good suggestion so I will try and report back.
-
@Christian-Ehrlicher said in Segfault when calling QWidget::show (on Debian 9):
@J-Hilk or run it with valgrind (compile with -O2 and -g)
OK, so I compiled with -O2 and -g and ran my program through valgrind. This is what I got:

    user@debianvbox:~/SVN/PolarisRel/Apps$ valgrind ./PolarisSlave
    ==5165== Memcheck, a memory error detector
    ==5165== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
    ==5165== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
    ==5165== Command: ./PolarisSlave
    ==5165==
    ==5165== Invalid read of size 8
    ==5165==    at 0x8B3B205: XSetCommand (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x8B3F7EF: XSetWMProperties (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x5EE707C: QWidgetPrivate::create_sys(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5E9F768: QWidget::create(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5EA7696: QWidget::setVisible(bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x4FE1FC: show (qwidget.h:497)
    ==5165==    by 0x4FE1FC: BSPPolarisSlave::mfRun(int, char**, QString&) (BSPPolarisSlave.cpp:443)
    ==5165==    by 0x7B50D9: ICService::mfExec(int, char**, QString&, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x7B190D: ICService::mfParseArguments(int, char**, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x501BB2: BSPPolarisSlave::mfParseArguments(int, char**) (BSPPolarisSlave.cpp:659)
    ==5165==    by 0x4DE3AC: main (BSPPolarisSlaveMain.cpp:71)
    ==5165==  Address 0xbe71780 is 0 bytes after a block of size 16 alloc'd
    ==5165==    at 0x4C2C93F: operator new[](unsigned long) (vg_replace_malloc.c:423)
    ==5165==    by 0x501A27: BSPPolarisSlave::mfParseArguments(int, char**) (BSPPolarisSlave.cpp:637)
    ==5165==    by 0x4DE3AC: main (BSPPolarisSlaveMain.cpp:71)
    ==5165==
    ==5165== Invalid read of size 1
    ==5165==    at 0x4C2EDA2: strlen (vg_replace_strmem.c:454)
    ==5165==    by 0x8B3B1EC: XSetCommand (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x8B3F7EF: XSetWMProperties (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x5EE707C: QWidgetPrivate::create_sys(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5E9F768: QWidget::create(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5EA7696: QWidget::setVisible(bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x4FE1FC: show (qwidget.h:497)
    ==5165==    by 0x4FE1FC: BSPPolarisSlave::mfRun(int, char**, QString&) (BSPPolarisSlave.cpp:443)
    ==5165==    by 0x7B50D9: ICService::mfExec(int, char**, QString&, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x7B190D: ICService::mfParseArguments(int, char**, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x501BB2: BSPPolarisSlave::mfParseArguments(int, char**) (BSPPolarisSlave.cpp:659)
    ==5165==    by 0x4DE3AC: main (BSPPolarisSlaveMain.cpp:71)
    ==5165==  Address 0x50 is not stack'd, malloc'd or (recently) free'd
    ==5165==
    ==5165==
    ==5165== Process terminating with default action of signal 11 (SIGSEGV)
    ==5165==  Access not within mapped region at address 0x50
    ==5165==    at 0x4C2EDA2: strlen (vg_replace_strmem.c:454)
    ==5165==    by 0x8B3B1EC: XSetCommand (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x8B3F7EF: XSetWMProperties (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
    ==5165==    by 0x5EE707C: QWidgetPrivate::create_sys(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5E9F768: QWidget::create(unsigned long, bool, bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x5EA7696: QWidget::setVisible(bool) (in /home/user/SVN/PolarisRel/ThirdParty/Qt/qt-install/lib/libQtGui.so.4.8.7)
    ==5165==    by 0x4FE1FC: show (qwidget.h:497)
    ==5165==    by 0x4FE1FC: BSPPolarisSlave::mfRun(int, char**, QString&) (BSPPolarisSlave.cpp:443)
    ==5165==    by 0x7B50D9: ICService::mfExec(int, char**, QString&, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x7B190D: ICService::mfParseArguments(int, char**, bool) (in /home/user/SVN/PolarisRel/Apps/PolarisSlave)
    ==5165==    by 0x501BB2: BSPPolarisSlave::mfParseArguments(int, char**) (BSPPolarisSlave.cpp:659)
    ==5165==    by 0x4DE3AC: main (BSPPolarisSlaveMain.cpp:71)
    ==5165==  If you believe this happened as a result of a stack
    ==5165==  overflow in your program's main thread (unlikely but
    ==5165==  possible), you can try to increase the size of the
    ==5165==  main thread stack using the --main-stacksize= flag.
    ==5165==  The main thread stack size used in this run was 8388608.
    ==5165==
    ==5165== HEAP SUMMARY:
    ==5165==     in use at exit: 1,121,308 bytes in 9,104 blocks
    ==5165==   total heap usage: 22,310 allocs, 13,206 frees, 4,039,124 bytes allocated
    ==5165==
    ==5165== LEAK SUMMARY:
    ==5165==    definitely lost: 2,944 bytes in 6 blocks
    ==5165==    indirectly lost: 13,190 bytes in 537 blocks
    ==5165==      possibly lost: 54,718 bytes in 437 blocks
    ==5165==    still reachable: 1,050,456 bytes in 8,124 blocks
    ==5165==         suppressed: 0 bytes in 0 blocks
    ==5165== Rerun with --leak-check=full to see details of leaked memory
    ==5165==
    ==5165== For counts of detected and suppressed errors, rerun with: -v
    ==5165== ERROR SUMMARY: 5 errors from 2 contexts (suppressed: 0 from 0)
    Segmentation fault

I'll try to decipher this myself, but if in the meantime someone more experienced with valgrind can point me in the right direction, that would be nice :-)
-
From my point of view, I would say the issue is with your strange string manipulation stuff.
You have to be very careful when working with string literals: it is super easy to run into undefined behaviour when you try to modify them.
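As a minimal sketch of that point: writing through a char* that points at a string literal is undefined behaviour, so if some code really needs to modify an argument, it is safer to hand it a writable copy. writableCopy is a hypothetical helper, not something from the code in this thread. Merely passing const_cast<char*>("-e") around is fine as long as nobody writes to it.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical helper: returns a writable copy of a C string,
// including its terminating '\0'.
std::vector<char> writableCopy(const char* text)
{
    assert(text != nullptr);
    return std::vector<char>(text, text + std::strlen(text) + 1);
}
```

With std::vector<char> flag = writableCopy("-e"); the pointer flag.data() can be passed as a char* and modified safely, as long as flag stays alive for as long as the pointer is used.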
-
@Bart_Vandewoestyne said in Segfault when calling QWidget::show (on Debian 9):
by 0x501A27: BSPPolarisSlave::mfParseArguments(int, char**) (BSPPolarisSlave.cpp:637)
This is where you have to take a look. You are doing something wrong with an argument there.
-
Please provide the line that @Christian-Ehrlicher mentioned; it's the call to mfExec.
-
OK, I think we're getting there... In ICBlackBoxBase::mfInitialize we call the ICBlackBoxBaseApplication constructor, which calls the QApplication constructor with certain argc and argv arguments:

    ICBlackBoxBaseApplication::ICBlackBoxBaseApplication(int &argc, char** argv, ICBlackBoxBase* apApp)
        : QApplication(argc, argv), mpApp(apApp)
    {
    }

Now let's see what argc and argv we are passing there. I've set a breakpoint right before the location where we call this constructor, and this is the call stack:

    (gdb) bt
    #0  ICBlackBoxBase::mfInitialize (this=0x7fffffffe0e0, argc=2, argv=0x5555565cd3d0, errormsg=...) at ICBlackBoxBase.cpp:101
    #1  0x0000555555c0101d in ICService::mfExec(int, char**, QString&, bool) ()
    #2  0x0000555555bfd86e in ICService::mfParseArguments(int, char**, bool) ()
    #3  0x000055555594db13 in BSPPolarisSlave::mfParseArguments (this=0x7fffffffe0e0, argc=2, argv=0x7fffffffe258) at BSPPolarisSlave.cpp:654
    #4  0x000055555592a3bd in main (argc=1, argv=0x7fffffffe258) at BSPPolarisSlaveMain.cpp:71

As you can see, in main we have that argc is 1, but in ICBlackBoxBase::mfInitialize (the function from which we call the ICBlackBoxBaseApplication constructor, and thus also the QApplication constructor) we have that argc is 2 (since an extra -e argument was added). Now let's look at argv in both main and ICBlackBoxBase::mfInitialize. In main we have:

    (gdb) f 4
    #4  0x000055555592a3bd in main (argc=1, argv=0x7fffffffe258) at BSPPolarisSlaveMain.cpp:71
    71 return (polarisSlave.mfParseArguments(argc, argv));
    (gdb) p argc
    $12 = 1
    (gdb) p argv[0]
    $13 = 0x7fffffffe53a "/home/user/SVN/PolarisRel/Apps/PolarisSlave"
    (gdb) p argv[argc]
    $14 = 0x0

but in ICBlackBoxBase::mfInitialize we have:

    (gdb) f 0
    #0  ICBlackBoxBase::mfInitialize (this=0x7fffffffe0e0, argc=2, argv=0x5555565cd3d0, errormsg=...) at ICBlackBoxBase.cpp:101
    101 {
    (gdb) p argc
    $15 = 2
    (gdb) p argv[0]
    $16 = 0x5555565cf830 "/home/user/SVN/PolarisRel/Apps/PolarisSlave"
    (gdb) p argv[1]
    $17 = 0x5555565cf940 "-e"
    (gdb) p argv[argc]
    $18 = 0x20 <error: Cannot access memory at address 0x20>

so there argv[argc] is not null! And now I have to find out why :-)
-
@Bart_Vandewoestyne said in Segfault when calling QWidget::show (on Debian 9):
so there argv[argc] is not null! And now I have to find out why :-)
Earlier I wrote:
I shall be surprised if it is this, but....
I think your code is not 100% technically correct. You do not NULL terminate your new vector. Technically you should find your original argv had an extra element at the end: argv[argc] == NULL. You do not copy this or NULL terminate your new newArgvs. E.g. https://stackoverflow.com/questions/16418932/is-argvargc-equal-to-null-pointer
"argv[argc] shall be a null pointer."
It is not clear whether this matters or not. If code only uses argc to index up to argv[argc - 1] then it does not. If code does do something about looking at argv[argc] to check for nullptr then it does matter. If you have the source code where it goes wrong you may be able to determine which it is.
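A minimal sketch of what a NULL terminated replacement for argv could look like, assuming the goal is still to insert one extra argument right after argv[0]. withExtraArg is a hypothetical name and not taken from the code in this thread; the pointers in the result refer to the caller's strings, nothing is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns
//   { argv[0], extraArg, argv[1], ..., argv[argc - 1], nullptr }
// so that the element at index (new argc) is a null pointer,
// just like in the argv that main() receives.
std::vector<char*> withExtraArg(int argc, char** argv, char* extraArg)
{
    assert(argc > 0 && argv != nullptr);
    std::vector<char*> newArgs;
    newArgs.reserve(static_cast<std::size_t>(argc) + 2);
    newArgs.push_back(argv[0]);
    newArgs.push_back(extraArg);
    for (int i = 1; i < argc; ++i)
        newArgs.push_back(argv[i]);
    newArgs.push_back(nullptr); // terminator: newArgs[argc + 1] == nullptr
    return newArgs;
}
```

The new argument count is then newArgs.size() - 1 (the terminator is not counted), and newArgs.data() can be passed where a char** is expected. Note also that the Qt documentation says the data referred to by argc and argv must stay valid for the entire lifetime of the QApplication object, so the vector, and the int holding the count, would have to outlive it.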