QRegularExpression construction time



  • After upgrading from 5.11.3 to 5.12.3 performance of my console app dropped dramatically. The app makes heavy use of QRegularExpression, creating and destroying them frequently.

    This sample code runs at least three times faster with Qt 5.11.3 than with Qt 5.12.3.

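    #include <QString>
    #include <QTextStream>
    #include <QElapsedTimer>
    #include <QLocale>
    #include <QRegularExpression>
    
    int main() 
    {
    	QTextStream	out(stdout);	// take over stdout
    	QElapsedTimer timer;
    	timer.start();
    	
    	out << qVersion() << endl;
    	for (int i = 0; i < 100000; i++) 
    	{
    		// Construct a fresh regex on every iteration (the slow case).
    		QRegularExpression re("[A-Z](\\d+)");
    		re.isValid();	// forces the pattern to be compiled
    	}
    
    	// Elapsed wall-clock time, formatted with locale digit grouping.
    	out << QString("%1 ms").arg(QLocale().toString(timer.elapsed()), 4) << endl;
    }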
    

    Is this expected?

    • Ken


  • Lifetime Qt Champion

    I don't see a difference between 5.9 and 5.12 (on Windows).
    I would advise you not to recreate the regular expressions every time when it is not needed, as in your example above.



  • Thanks for the quick response. I'm worried that I somehow have the wrong DLLs.

    Here's the weird part -- when compiled with debugging, the program takes 266 ms. Compiled for release, it takes 2,023 ms.

    I'm running the program from within Microsoft Visual C++ 2019 with the Qt extension. Running from the command line, with QTDIR controlling which DLLs get loaded, I get the same results.



  • This is all under Windows 10.

    Obviously the example is solved by constructing the QRegularExpression outside of the loop, but my program uses many, many QRegularExpression objects, creating them as necessary.

    I'm wondering if the JIT Java compiler is engaged, but I get identical results on a different Windows 10 machine that is not used for development.


  • Lifetime Qt Champion

    Hi,

    What JIT Java compiler?

    If I have checked things correctly, the main differences should be upgrades to PCRE2.

    One commit, e39a9de3309f84be4101da839a0bacf69090706f, did change the JIT-enabled state for Windows and ARM platforms. However, it has been present since 5.10.1, so it's likely unrelated to your issue.

    The latest update to PCRE2 was done in Qt 5.11.3, so again, it should not be that.

    Did you try running your application with a profiler? That should give you some clue about what is taking so long.



  • The time differences depend on QT_ENABLE_REGEXP_JIT.

    The only case that is slow is Qt 5.12 with QT_ENABLE_REGEXP_JIT=1.
    Using the 5.11 DLLs OR setting QT_ENABLE_REGEXP_JIT=0 returns to normal performance.

    I'm not sure what this means, but at least it's a workaround.


  • Lifetime Qt Champion

    @ktwolff If you use your regexp only once, the JIT may indeed be pure overhead. It pays for itself once you reuse the regexp.

    Regards



  • I agree 100%, but 5.11 acts differently than 5.12 unless I turn off QT_ENABLE_REGEXP_JIT.

    My problem is solved, so now I'm just curious.


  • Lifetime Qt Champion

    @ktwolff I think it is all stated in the docs:

    In debug builds of Qt the JIT is disabled by default, to prevent crashes caused by its self-modifying code in tools like valgrind.

    https://doc.qt.io/qt-5/qregularexpression.html#debugging-code-that-uses-qregularexpression

    Or are you talking about the release version?



  • All tests are compiled for release. It seems like Qt 5.11 behaves unexpectedly, because enabling the JIT compiler doesn't slow things down when the regex is constructed inside the loop.

    I vaguely remember something about Qt caching regexes by signature. Does this sound familiar?

    Another interesting thing is that a global QRegularExpression used on multiple threads seems to have an internal mutex to avoid collisions. That is one reason I can't simply construct all the regexes at load time: they defeat the multithreading.



  • QRegExp caches recently used regexes; QRegularExpression does not. Instead, a matching operation does not modify a QRegularExpression, which allows one global object to be used from multiple threads.

    I'll have to retest my assertion that the regexes are blocking across threads.

