Totally confusing segmentation violation
-
@mstoth Good luck! If you get more information I'd be happy to try and help solve it.
Also, have you tried it on any other systems? Something besides Ubuntu? Sometimes Ubuntu does funny things compared to other Linux distros. Maybe there is an issue with the system fonts that could be causing it.
-
I am unable to use a different system. Unfortunately we are committed to using Ubuntu for now. I have done several cleans and rebuilds. I do wonder about the comment regarding fonts, since I see some mention of fonts in the stack trace. If it is a font problem, what can be done about that? I'm out of my depth here and my productivity is dropping due to excessive rebuilds and finger crossing. Sometimes it works and sometimes not (still). It does seem to work properly on the embedded device we are using (BeagleBone Black): I never have a problem running on our BeagleBone, even if it fails on the desktop.
If I try to use the debugger I get left in assembly code, since none of my code is in the stack trace. Just awful. Any more suggestions from anyone would be greatly appreciated.
-
@mstoth
I have at least one suggestion. I ran into something similar recently. After a while, I moved my project from my Windows PC over to my Mac and gave it a clean build with clang. I got about 100 compiler warnings, which I fixed, and I haven't run into the problem since then, on either Windows or Mac.
-
@mstoth Another idea since it is crashing in qgetenv... are you perchance multithreaded? Calls to the environment via getenv are not thread safe. So if somehow it is calling getenv in multiple threads that is why you will see the crash only occasionally.
And on the BeagleBone it would be less likely to appear (in your case never) because the timing is so different compared to your desktop processor. The problem would still be there; it just wouldn't show up as much.
When I was younger multi core cpus were not a thing, and running dual cpus was quite expensive. I remember building a multi cpu system just to test threading issues like this since it would change the timing and almost always crash if you had an issue like this.
Also, don't forget that valgrind is finding legitimate memory issues in your application. This means you definitely have a bug in there somewhere. It will be related to memory that is freed and then used. So look for dangling pointers.
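To make the getenv point concrete, here is a minimal sketch in plain C++ of serializing environment access behind a single mutex. It assumes a POSIX system (for setenv); the names safeGetEnv, safeSetEnv, stressEnvironment and DEMO_ENV_VAR are made up for illustration and are not Qt or application code.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helpers (not Qt API): every environment access made through
// them takes the same mutex, so a write can never overlap a read.
static std::mutex g_envMutex;

std::string safeGetEnv(const char *name)
{
    std::lock_guard<std::mutex> lock(g_envMutex);
    const char *value = std::getenv(name);
    // Copy while the lock is held; a later setenv() may invalidate the pointer.
    return value ? std::string(value) : std::string();
}

bool safeSetEnv(const char *name, const char *value)
{
    std::lock_guard<std::mutex> lock(g_envMutex);
    return ::setenv(name, value, /*overwrite=*/1) == 0;  // POSIX call
}

// Reads and writes the environment from several threads at once and
// returns the final value of the demo variable.
std::string stressEnvironment(int threadCount, int iterations)
{
    assert(threadCount > 0 && iterations > 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                safeSetEnv("DEMO_ENV_VAR", "value");
                (void)safeGetEnv("DEMO_ENV_VAR");
            }
        });
    }
    for (std::thread &w : workers)
        w.join();
    return safeGetEnv("DEMO_ENV_VAR");
}
```

Note that a wrapper like this only protects calls that actually go through it. Library code that calls getenv directly would not be covered, so the more practical rule is probably to avoid changing the environment once other threads are running.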
-
I am not manually starting separate threads, but it was my understanding that the application is multi-threaded due to the nature of the slots and signals. I emit a signal every time the system receives a message on a TCP socket. Each signal is matched with a slot in one or more objects. I assume that while one signal is being processed, it's possible to emit another one if a message comes into the socket while the first signal is being handled. When I get the crash, the Threads menu shows 12 threads: GenisysVoicesPanel (my app), QXcbEventReader, dconf worker, gmain, gdbus, QDBusConnection, pool, llvmpipe-0, llvmpipe-1, llvmpipe-2, llvmpipe-3.
I believe you are correct when you say there is a bug in my code but valgrind is not helping me too much. How can you trace the problem when valgrind (and the stack trace) shows nothing related to your code? Here's an example of what I mean. The output of valgrind says:
==9773== Invalid read of size 2
==9773==    at 0x67C180D: getenv (getenv.c:84)
==9773==    by 0x5B6A120: qgetenv(char const*) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Core.so.5.6.1)
==9773==    by 0xEB62E77: QFontEngineFT::QFontEngineFT(QFontDef const&) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5XcbQpa.so.5.6.1)
==9773==    by 0xEB2C27B: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5XcbQpa.so.5.6.1)
==9773==    by 0x6D47EEB: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D48523: QFontDatabase::findFont(QFontDef const&, int) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D4907C: QFontDatabase::load(QFontPrivate const*, int) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D207F2: QFontPrivate::engineForScript(int) const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D3D620: QFontMetricsF::leading() const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6EC6AF2: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6ECD773: QPainter::drawText(QRect const&, int, QString const&, QRect*) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x54269B6: QStyle::drawItemText(QPainter*, QRect const&, int, QPalette const&, bool, QString const&, QPalette::ColorRole) const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Widgets.so.5.6.1)
==9773==  Address 0x2 is not stack'd, malloc'd or (recently) free'd
==9773==
==9773==
==9773== Process terminating with default action of signal 11 (SIGSEGV)
==9773==  Access not within mapped region at address 0x2
==9773==    at 0x67C180D: getenv (getenv.c:84)
==9773==    by 0x5B6A120: qgetenv(char const*) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Core.so.5.6.1)
==9773==    by 0xEB62E77: QFontEngineFT::QFontEngineFT(QFontDef const&) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5XcbQpa.so.5.6.1)
==9773==    by 0xEB2C27B: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5XcbQpa.so.5.6.1)
==9773==    by 0x6D47EEB: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D48523: QFontDatabase::findFont(QFontDef const&, int) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D4907C: QFontDatabase::load(QFontPrivate const*, int) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D207F2: QFontPrivate::engineForScript(int) const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6D3D620: QFontMetricsF::leading() const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6EC6AF2: ??? (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x6ECD773: QPainter::drawText(QRect const&, int, QString const&, QRect*) (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Gui.so.5.6.1)
==9773==    by 0x54269B6: QStyle::drawItemText(QPainter*, QRect const&, int, QPalette const&, bool, QString const&, QPalette::ColorRole) const (in /home/bbb_developer/Qt/5.6/gcc_64/lib/libQt5Widgets.so.5.6.1)
==9773==  If you believe this happened as a result of a stack
==9773==  overflow in your program's main thread (unlikely but
==9773==  possible), you can try to increase the size of the
==9773==  main thread stack using the --main-stacksize= flag.
==9773==  The main thread stack size used in this run was 8388608.
==9773==
==9773== HEAP SUMMARY:
==9773==     in use at exit: 4,317,169 bytes in 34,765 blocks
==9773==   total heap usage: 139,400 allocs, 104,635 frees, 21,351,017 bytes allocated
==9773==
==9773== LEAK SUMMARY:
==9773==    definitely lost: 2,864 bytes in 11 blocks
==9773==    indirectly lost: 13,196 bytes in 556 blocks
==9773==      possibly lost: 16,507 bytes in 207 blocks
==9773==    still reachable: 3,962,882 bytes in 32,477 blocks
==9773==                       of which reachable via heuristic:
==9773==                         length64           : 7,880 bytes in 116 blocks
==9773==                         newarray           : 2,112 bytes in 52 blocks
==9773==                         multipleinheritance: 152 bytes in 1 blocks
==9773==         suppressed: 0 bytes in 0 blocks
==9773== Rerun with --leak-check=full to see details of leaked memory
==9773==
==9773== For counts of detected and suppressed errors, rerun with: -v
==9773== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
So from what I can tell, it all seems to start with QStyle::drawItemText() and I assume that must have something to do with a panel being drawn. There doesn't seem to be any information that points to where in my code it is happening however. I checked the panel I suspect (although it happens also randomly after this panel is displayed) and the fonts are all SansSerif so there is nothing unusual about the fonts that I can tell.
How can I learn from the valgrind information about where in my code the problem occurs? My ignorance is really a frustration! Thanks for your advice!
-
Try breaking at the segfault and see which widget's being painted in frame #25 (QWidgetPrivate::drawWidget), or try to trace which object is receiving the event (#30 QApplication::notify). Basically, walk down the backtrace; the debugger should provide you with the locals, and with some digging you should be able to find the culprit. Also, you may consider GammaRay to try and see which widget is triggering the problem (apparently it's an expose event).
Btw, you seem to have some holes in that backtrace ...?
-
@mstoth Signals/slots are not multithreaded by default. The QEventLoop is on a single thread and you can definitely starve it if you are doing things and don't give it a chance to process events.
If you are doing TCP/IP stuff I would recommend moving that to its own thread. But that isn't the point of this thread, just wanted to let you know you may starve your event loop and freeze up your GUI with a long TCP/IP delay. :)
As for your problem, yea valgrind's info isn't helping that much. It goes back to my first post on this thread. I think you are deleting a widget (inadvertently) while it still has events on the queue. I feel even stronger about this with your recent statements about signals/slots. Since you thought they were multithreaded that would lead me to believe even more that at some point you deleted something and once your stack bubbled back up to the event loop it tried to process a message on a deleted QWidget.
It's hard enough to find a bug like this when you have the source code; it is extremely difficult in a setting like this where we can't see any code. :/ All I can do is try to guide you based on what I've seen in the past.
I would look for anywhere you delete widgets, either explicitly, i.e. delete myWidget, or indirectly. Indirect examples would be widgets on the stack, widgets that have been reparented to other widgets you may delete, etc. Calls to deleteLater() should be OK though, as the deletion is deferred until control returns to the event loop. Also don't forget any smart pointers you may use. If they reach a 0 ref count for some reason they will auto delete. C++ smart pointers and Qt's smart pointers could both be the culprits here if you use them.
Try following @kshegunov's advice above and see if you can find more info on the crash. That might help you narrow it down in your code.
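To illustrate the failure mode I mean (an event still queued for an object that has already been deleted), here is a small plain C++ analogy. It is only a sketch: Receiver, EventQueue and demoDeletedReceiver are hypothetical stand-ins, not Qt classes, and Qt's real event loop works differently internally. The queue holds weak pointers, so an event for a destroyed receiver is dropped instead of being delivered through a dangling pointer.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Plain C++ stand-ins, not Qt classes: Receiver plays the role of a widget,
// EventQueue the role of the event loop.
struct Receiver {
    int handled = 0;
    void handle(const std::string &) { ++handled; }
};

class EventQueue {
public:
    // Queue an event without extending the receiver's lifetime.
    void post(const std::shared_ptr<Receiver> &target, std::string payload)
    {
        pending_.push(Event{target, std::move(payload)});
    }

    // Deliver everything queued; returns how many events were dropped
    // because their receiver had already been destroyed.
    int processAll()
    {
        int dropped = 0;
        while (!pending_.empty()) {
            Event ev = std::move(pending_.front());
            pending_.pop();
            if (std::shared_ptr<Receiver> alive = ev.target.lock())
                alive->handle(ev.payload);
            else
                ++dropped;  // with a raw pointer this would be a use-after-free
        }
        return dropped;
    }

private:
    struct Event {
        std::weak_ptr<Receiver> target;
        std::string payload;
    };
    std::queue<Event> pending_;
};

// Post two events, destroy one receiver before delivery, then process.
int demoDeletedReceiver()
{
    EventQueue events;
    auto kept = std::make_shared<Receiver>();
    auto doomed = std::make_shared<Receiver>();
    events.post(kept, "paint");
    events.post(doomed, "paint");
    doomed.reset();  // receiver deleted while its event is still queued
    const int dropped = events.processAll();
    assert(kept->handled == 1);
    return dropped;
}
```

If the queue stored raw Receiver pointers instead, the second delivery would touch freed memory, which is the kind of intermittent crash I suspect here.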
-
@mstoth As I continue to work on this problem (very frustrating!) I have gotten to the point of just making a completely new form to replace a form that is giving me the segmentation violation. I found that if I create a blank form, it shows up fine. If I put one button on the form, without even attaching an action to the button, I get the segmentation violation (with a stack trace that looks identical to the one shown above). I hope this may provide a clue to someone with more experience dealing with Qt. Is there anything one can do to a program that would allow you to create and show a blank form but not a form with a button? Desperate here! It still works on the embedded device; however, I really can't do development on this product without it running on my desktop as well.
-
Try setting the QT_NO_FT_CACHE environment variable to something, either 0 or 1. This is done in the kit configuration: select "Run" from the side panel (Qt Creator 4.x) and then open the "Run environment list". I suspect this is a bug either in Qt, which is less likely at this point, or in your redhat's version (or the fontconfig library). As a reference, look at this report (albeit quite old), and at these bits in Qt's source:
http://code.qt.io/cgit/qt/qtbase.git/tree/src/gui/text/qfontengine_ft.cpp?h=5.7#n686
http://code.qt.io/cgit/qt/qtbase.git/tree/src/corelib/global/qglobal.cpp?h=5.7#n3235
PS. Alternatively, try a later Qt version, where qfontengine_ft "magically" disappeared. ;)
-
@ambershark Here are the three files associated with the panel I tried to present. There is really nothing to see here, however: it is just the template from choosing Qt Designer Form Class and adding one label. Without the label, everything is fine. Once the label is there, it crashes. To instantiate it all I do is
DialogVoicePresets *dvp = new DialogVoicePresets(this);
dvp->show();
The source file:
#include "dialogvoicepresets.h"
#include "ui_dialogvoicepresets.h"

DialogVoicePresets::DialogVoicePresets(QWidget *parent) :
    QDialog(parent),
    ui(new Ui::DialogVoicePresets)
{
    ui->setupUi(this);
}

DialogVoicePresets::~DialogVoicePresets()
{
    delete ui;
}
The header:
#ifndef DIALOGVOICEPRESETS_H
#define DIALOGVOICEPRESETS_H

#include <QDialog>

namespace Ui {
class DialogVoicePresets;
}

class DialogVoicePresets : public QDialog
{
    Q_OBJECT

signals:
    void openPresets();

public:
    explicit DialogVoicePresets(QWidget *parent = 0);
    ~DialogVoicePresets();

private:
    Ui::DialogVoicePresets *ui;
};

#endif // DIALOGVOICEPRESETS_H
And the .ui file:
<?xml version="1.0" encoding="UTF-8"?>
<ui version="4.0">
 <class>DialogVoicePresets</class>
 <widget class="QDialog" name="DialogVoicePresets">
  <property name="geometry">
   <rect>
    <x>0</x>
    <y>0</y>
    <width>480</width>
    <height>272</height>
   </rect>
  </property>
  <property name="windowTitle">
   <string>Dialog</string>
  </property>
  <widget class="QLabel" name="label">
   <property name="geometry">
    <rect>
     <x>200</x>
     <y>20</y>
     <width>59</width>
     <height>16</height>
    </rect>
   </property>
   <property name="text">
    <string>TextLabel</string>
   </property>
  </widget>
 </widget>
 <resources/>
 <connections/>
</ui>
-
@kshegunov
I tried setting QT_NO_FT_CACHE to 1. I'm running 4.0.3, but I got to this form from the Debug button on the side panel. On my version of Creator the Run button just runs the program. Once I picked Debug, I had the options of Build and Run, where I could choose the environment variables as shown in this image.
Unfortunately the problem still exists after this. However thank you for the idea! Still looking... I will see about upgrading to a later version.
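For anyone who wants to take the Creator run settings out of the picture, the variable can also be set from inside the program before any text is drawn. Below is a minimal sketch in plain C++, assuming a POSIX system; setNoFtCache and currentNoFtCache are hypothetical helper names of mine, and in a Qt program qputenv() would be the more usual call.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: sets QT_NO_FT_CACHE for the current process using
// POSIX setenv(). Intended to be called at the very start of main(), before
// the application object and any fonts are created.
bool setNoFtCache(const char *value)
{
    assert(value != nullptr);
    return ::setenv("QT_NO_FT_CACHE", value, /*overwrite=*/1) == 0;
}

// Hypothetical helper: reports what the process currently sees, which is a
// quick way to check that the setting really reached the program.
std::string currentNoFtCache()
{
    const char *value = std::getenv("QT_NO_FT_CACHE");
    return value ? std::string(value) : std::string("<unset>");
}
```

Printing currentNoFtCache() once at startup would at least confirm whether the value chosen in the run environment is actually visible to the application.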
-
@mstoth A couple of ideas:
- What happens if you do dvp->exec() to make it modal instead of show()?
- Do you have a custom event loop somewhere?
- What happens if you do not give your dialog a parent? I.e. new DialogVoicePresets();
- If possible, can you test this on another Linux box? Preferably something more modern than redhat? Not necessarily cutting edge like arch/gentoo, but even something simple and newer like ubuntu. That will help show if it's a problem with your system or if it's a problem with your code.
-
@ambershark
First, many thanks to all who provided time and help!
Finally, after several months of desperation, I just rewrote the MainWindow code, copied all the panels and created a new version of the application. Now I do not have the crashing problem. I do not see how it's different, so there must be some nearly invisible artifact that was throwing a monkey wrench into the works. That's my guess anyway.
Again, many thanks to all! As a friend of mine used to say "Your blood's worth bottlin'"