QRegularExpression crashes valgrind
-
I've been trying to use Valgrind to find memory leaks in my program, but Valgrind keeps crashing. I finally narrowed it down to calls to QRegularExpression methods. After a couple of days of head-scratching I found that the documentation contains the following:
Debugging Code that Uses QRegularExpression
QRegularExpression internally uses a just in time compiler (JIT) to optimize the execution of the matching algorithm. The JIT makes extensive usage of self-modifying code, which can lead debugging tools such as Valgrind to crash. You must enable all checks for self-modifying code if you want to debug programs using QRegularExpression (for instance, Valgrind's --smc-check command line option). The downside of enabling such checks is that your program will run considerably slower. To avoid that, the JIT is disabled by default if you compile Qt in debug mode. It is possible to override the default and enable or disable the JIT usage (both in debug or release mode) by setting the QT_ENABLE_REGEXP_JIT environment variable to a non-zero or zero value respectively.
So I tried defining QT_ENABLE_REGEXP_JIT in my .pro file (I'm using qmake with Qt 6), but it made no difference:
DEFINES += QT_ENABLE_REGEXP_JIT=0
How can I use Valgrind and QRegularExpression together? My code seems fine; I just need to change something so that Valgrind can get past the self-modifying code that QRegularExpression uses. When I run Valgrind from Qt Creator I see the option "--smc-check=stack" passed to Valgrind, but evidently that is not enough.
-
@ocgltd said in QRegularExpression crashes valgrind:
When I run Valgrind from QtCreator I see the option "--smc-check=stack" passed to valgrind, but obviously that's not enough/working.
You must enable all checks for self-modifying code if you want to debug programs using QRegularExpression (for instance, Valgrind's --smc-check command line option).
stack != all (see the Valgrind manual)
To avoid that, the JIT is disabled by default if you compile Qt in debug mode.
You can look for the memory leak in a debug version of your program. If you must debug a release version then:
It is possible to override the default and enable or disable the JIT usage (both in debug or release mode) by setting the QT_ENABLE_REGEXP_JIT environment variable to a non-zero or zero value respectively.
This is not a compile-time DEFINE; it is a run-time environment variable. Run your program from a console:
# Linux
export QT_ENABLE_REGEXP_JIT=0
valgrind ./sail_my_leaky_ship
# Windows
set QT_ENABLE_REGEXP_JIT=0
valgrind .\sail_my_leaky_ship
or set that variable in the run settings of the Qt Creator project.
-
QT_ENABLE_REGEXP_JIT is a runtime environment variable, not a preprocessor variable.
https://codebrowser.dev/qt5/qtbase/src/corelib/text/qregularexpression.cpp.html#1120
QByteArray jitEnvironment = qgetenv("QT_ENABLE_REGEXP_JIT");
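To make the run-time nature of this concrete, here is a rough sketch in plain C++ (no Qt) of the decision the documentation describes: a non-zero value enables the JIT, zero disables it, and with the variable unset the default depends on whether Qt was built in debug mode. The names jitEnabled and jitEnabledFromEnvironment are made up for illustration, and the handling of non-numeric values is my assumption; this is not Qt's actual code.

```cpp
#include <cstdlib>

// Hypothetical helper, not Qt API: decides whether the regex JIT would be used.
// `value` is the raw content of QT_ENABLE_REGEXP_JIT (nullptr or "" if unset).
// `debugBuild` stands in for "Qt was compiled in debug mode".
bool jitEnabled(const char *value, bool debugBuild)
{
    if (value != nullptr && *value != '\0') {
        char *end = nullptr;
        const long parsed = std::strtol(value, &end, 10);
        const bool isNumber = (end != value && *end == '\0');
        // Assumption: text that is not a number counts as "enabled".
        return isNumber ? (parsed != 0) : true;
    }
    // Variable unset: the documented default is JIT off in debug builds.
    return !debugBuild;
}

// Reads the variable from the process environment when called, i.e. at run time.
bool jitEnabledFromEnvironment(bool debugBuild)
{
    return jitEnabled(std::getenv("QT_ENABLE_REGEXP_JIT"), debugBuild);
}
```

Because the value comes from the environment of the running process, a DEFINES line in the .pro file never reaches it; only the shell, the Creator run settings, or the program itself can set it.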
-
@ocgltd
This is very useful information. Let us know if this resolves it. But I have used valgrind across most of my programs, and I'm pretty sure that will have included QRegularExpression with no problem. I am surprised we don't hear of this more often; there ought to be plenty of people in the same situation.
I am on Ubuntu 22.04, Qt 5.1.5 from distro. What platform are you on? Did you build Qt yourself? Debug or not? Of course it might not happen every time with QRegularExpression; it could depend on what a particular one does.
-
@JonB I tried compiling on Red Hat 8 and 9, and Ubuntu 23. The problem exists on all platforms. I always install Qt from the Qt website, never from source.
I also use regexes quite often and have only sometimes encountered this problem. I have no idea why; presumably it has something to do with the self-modifying code. I noticed in particular that if I remove the square brackets from the regex, the SMC does not crash valgrind, but adding the square brackets back does. It must be something about how Qt implemented this.
-
@ocgltd said in QRegularExpression crashes valgrind:
if I remove the square brackets from the regex, the SMC does not crash valgrind, but adding the square brackets back does.
This is the sort of thing I meant. So it does depend on the actual regex and what it's matching; it does not happen every time. That would explain why I have never seen it: it happens intermittently.
Well blow me down, I think you have spotted something there! My current program uses half a dozen different QRegularExpressions and runs under valgrind with no problem. I went and picked one expression:
allTokens.at(i).contains(QRegularExpression("^\\d+\\.?$"))
If I change it to
QRegularExpression("^[0-9]+\\.?$")
// or
QRegularExpression("^[0123456789]+\\.?$")
then, as you say, sure enough it "crashes". Actually I don't see any "crash", but the program seems to exit abruptly, and valgrind reports many leaking blocks, doubtless because it stopped in the middle.
I can change it over to, say,
QRegularExpression("^(0|1|2|3|4|5|6|7|8|9)+\\.?$")
and it works fine.
I also confirm that if I start main() with
static char envvar[] = "QT_ENABLE_REGEXP_JIT=0";
putenv(envvar);
then the "crash" never happens.
So it does look as though [], with or without a range inside it, is a killer.
Of course, that does not tell us whether there are other patterns which cause a similar problem.
But [] does not seem to always cause a "crash". A few lines above the previous one I have
this->allTokens = fileContent.split(QRegularExpression("\\s+"), Qt::SkipEmptyParts);
I can change that to
QRegularExpression("[ \\t\\r\\n]+")
and still run under valgrind without the [] causing a "crash" this time.
Finally, I don't think this "problem" is necessarily specific to Qt's QRegularExpression. That uses PCRE/libpcre, "Perl Compatible Regular Expressions", which is a commonly used regular expression implementation. That is what uses the "JIT compiler", so presumably the problem can arise with any other library which uses it, not only the QRegularExpression implementation.
UPDATE
I have also found a way to make it work from Creator's valgrind without having to set QT_ENABLE_REGEXP_JIT=0. In Tools > Options > Analyzer > Valgrind the default for Detect self-modifying code is Only on Stack. If I change that to Everywhere I no longer get valgrind "crashes". This may be preferable to disabling the JIT.
-
@JonB I eventually found a sentence buried in the QRegularExpression documentation saying that QRegularExpression can cause Valgrind to crash!
After setting the --smc-check=all option, valgrind now works fine. So for anyone else who is stumped: just change the setting above and all will work again (same results as JonB shows above with the environment variable).
-
@ocgltd
The --smc-check=all is actually what the last option in my UPDATE does, via an option in Creator's valgrind settings. That is actually rather different from setting the environment variable QT_ENABLE_REGEXP_JIT=0. The former changes the behaviour of valgrind's checks so that it tolerates the self-modifying regular expression code; the latter changes the Qt runtime code so that it does not use the regular expression JIT at all. It seems you can choose one or the other; you don't have to do both.
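For anyone scripting their valgrind runs, the two alternatives could be spelled out like this. This is only an illustrative sketch: Workaround and valgrindCommand are made-up names, the strings are ordinary POSIX shell command lines, and the program path is whatever you pass in.

```cpp
#include <string>

// The two independent workarounds discussed in this thread.
enum class Workaround {
    SmcCheckAll,  // valgrind checks for self-modifying code everywhere
    DisableJit    // Qt does not use the regex JIT at all
};

// Hypothetical helper: builds the shell command line for one workaround.
std::string valgrindCommand(Workaround workaround, const std::string &program)
{
    switch (workaround) {
    case Workaround::SmcCheckAll:
        return "valgrind --smc-check=all " + program;
    case Workaround::DisableJit:
        return "QT_ENABLE_REGEXP_JIT=0 valgrind " + program;
    }
    return std::string();
}
```

The first keeps the JIT but, according to the documentation quoted at the top, makes the program run considerably slower under valgrind; the second leaves valgrind's defaults alone but means the JIT code path is not exercised.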