QString::indexOf(QRegularExpression) is slower than indexOf(QRegExp)

13 Posts 4 Posters 1.5k Views 5 Watching
  • johnson54 wrote (#3):

    Here are my code and results. The ‘gnss.log’ file is about 12 MB.

    QRegExp  pkg_num =  1229 elapsed =  141 ms
    QRegularExpression  pkg_num =  1229 elapsed =  9168 ms
    
    #include <QCoreApplication>
    #include <QtDebug>
    #include <QFile>
    #include <QRegularExpression>
    #include <QRegExp>
    #include <QElapsedTimer>
    
    constexpr int READ_SIZE = 32*1024*1024;
    
    int TestSplit(int b_use_reg_exp)
    {
        QFile file("E:/gnss.log");
        QString split_str("GGA");
        if (file.open(QIODevice::ReadOnly) == false)
            return -1;
        auto p_buffer = new char[READ_SIZE];
        int finish_flag = 0;
        QString str_buff;
        int last_pos = 0;
        int pkg_num = 0;
        int pos = 0;
        while (1)
        {
            auto file_size = file.read(p_buffer, READ_SIZE);
            if (file_size < READ_SIZE)
                finish_flag = 1;
            str_buff.append(QString(QLatin1String(p_buffer, file_size)));
            if (b_use_reg_exp)
                pos = str_buff.indexOf(QRegExp(split_str));
            else
                pos = str_buff.indexOf(QRegularExpression(split_str));
    
            int offset_pos = -1;
            while (pos != -1)
            {
                offset_pos = pos;
                pkg_num++;
                if (b_use_reg_exp)
                    pos = str_buff.indexOf(QRegExp(split_str), pos+1);
                else
                    pos = str_buff.indexOf(QRegularExpression(split_str), pos+1);
            }
            str_buff.remove(0, offset_pos+1);
            if (offset_pos != -1)
                last_pos += offset_pos + 1;
            if (finish_flag == 1)
                break;
        }
        file.close();
        delete []p_buffer;
        return pkg_num;
    }
    
    int main(int argc, char *argv[])
    {
        QCoreApplication a(argc, argv);
        QElapsedTimer t_time;
        int pkg_num;
        t_time.start();
        pkg_num = TestSplit(true);
        qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
        t_time.start();
        pkg_num = TestSplit(false);
        qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
        return a.exec();
    }
    
    
      JonB wrote (#4):

      @johnson54
      Well, that does seem to be a pretty considerable difference!

      What you did not say, but show now, is that your splitter pattern is not really a regular expression, just a constant string ("GGA"). I am not defending the timing, but it would be interesting to know whether this kind of difference also applies when the regular expression actually has some work to do. I assume you realise that for this splitter you can just use int QString::indexOf(QLatin1String str, int from = 0, Qt::CaseSensitivity cs = Qt::CaseSensitive) const, and you ought to time that too.

      My other thought, and I don't know whether this is a "thing": you construct the regular expression each time as a parameter to indexOf(). Regular expression construction can be expensive. Try taking QRegExp(split_str) and QRegularExpression(split_str) outside the loop and reusing the already-constructed object each time. Any difference? This should be done anyway when the regular expression does not change inside the loop.

      • johnson54 wrote (#5):

        @JonB
        I changed split_str to an actual regular expression (GG.,) and moved the construction of both regular expression objects outside the loop.

        QRegExp  pkg_num =  1229 elapsed =  63 ms
        QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
        
        #include <QCoreApplication>
        #include <QtDebug>
        #include <QFile>
        #include <QRegularExpression>
        #include <QRegExp>
        #include <QElapsedTimer>
        
        constexpr int READ_SIZE = 32*1024*1024;
        
        int TestSplit(int b_use_reg_exp)
        {
            QFile file("E:/gnss.log");
            QString split_str("GG.,");
            if (file.open(QIODevice::ReadOnly) == false)
                return -1;
            auto p_buffer = new char[READ_SIZE];
            int finish_flag = 0;
            QString str_buff;
            int last_pos = 0;
            int pkg_num = 0;
            int pos = 0;
            auto reg_exp = QRegExp(split_str);
            auto regular_expression = QRegularExpression(split_str);
            while (1)
            {
                auto file_size = file.read(p_buffer, READ_SIZE);
                if (file_size < READ_SIZE)
                    finish_flag = 1;
                str_buff.append(QString(QLatin1String(p_buffer, file_size)));
                if (b_use_reg_exp)
                    pos = str_buff.indexOf(reg_exp);
                else
                    pos = str_buff.indexOf(regular_expression);
        
                int offset_pos = -1;
                while (pos != -1)
                {
                    offset_pos = pos;
                    pkg_num++;
                    if (b_use_reg_exp)
                        pos = str_buff.indexOf(reg_exp, pos+1);
                    else
                        pos = str_buff.indexOf(regular_expression, pos+1);
                }
                str_buff.remove(0, offset_pos+1);
                if (offset_pos != -1)
                    last_pos += offset_pos + 1;
                if (finish_flag == 1)
                    break;
            }
            file.close();
            delete []p_buffer;
            return pkg_num;
        }
        
        int main(int argc, char *argv[])
        {
            QCoreApplication a(argc, argv);
            QElapsedTimer t_time;
            int pkg_num;
            t_time.start();
            pkg_num = TestSplit(true);
            qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
            t_time.start();
            pkg_num = TestSplit(false);
            qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
            return a.exec();
        }
        
        
        • johnson54 wrote (#6):

          I used QRegExp rather than QRegularExpression in Qt 5.15. When porting to Qt 6, I found that QString::indexOf(QRegExp) is deprecated, so I switched to QString::indexOf(QRegularExpression), but the performance is terrible. I then tested both in Qt 5.15 and found a considerable difference in performance. However, many people say QRegularExpression performs better, so I want to know whether I am using it wrong.

          • JonB wrote (#7, last edited by JonB):

            @johnson54
            Interesting findings :)

            It's nice to see that taking the regular expression construction out of the loop saves over a second for QRegularExpression (9168 ms down to 7716 ms), isn't it?! QRegExp also dropped, from 141 ms to 63 ms, though the pattern changed at the same time.

            But your difference remains, and now it's an even greater ratio. More than 100x is not good! So much so that I hope one of our Qt experts will care to comment further....

            johnson54 said: "However, many people said QRegularExpression enjoys better performance, I want to know if I use it wrong."

            Not from your example, apparently! You are using it fine.

            • SGaist (Lifetime Qt Champion) wrote (#8):

              Hi,

              Do you see the same performance hit with both Qt 6 and Qt 5?

              Which version of Qt 6 did you try?

              On which platform?

              • jeremy_k wrote (#9):

                Calling QRegularExpression::optimize() prior to entering the loop might help. Using QRegularExpression::globalMatch() instead of QString::indexOf() is also worth investigating.

                • JonB wrote (#10):

                  I note we are told:

                  "The QRegularExpression class introduced in Qt 5 is a big improvement upon QRegExp, in terms of APIs offered, supported pattern syntax and speed of execution"

                  :)

                  @johnson54
                  You have not said: are you running these tests compiled for debug? If so, can you see what the timings are if you compile for release?

                  • johnson54 wrote (#11):

                    @JonB
                    Qt 5.15.2, Windows 10, MSVC 2019 32-bit.
                    Release mode:

                    QRegExp  pkg_num =  1229 elapsed =  44 ms
                    QRegularExpression  pkg_num =  1229 elapsed =  4142 ms
                    

                    Debug Mode:

                    QRegExp  pkg_num =  1229 elapsed =  63 ms
                    QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
                    
                      johnson54 wrote (#12):

                      @SGaist
                      In Qt 6, QRegularExpression performs about the same as in Qt 5.15.2 (Windows 10, MSVC 2019 32-bit).
                      With Qt 6.2, Windows 10, MSVC 2019 64-bit, release mode, the result is:

                      QRegularExpression  pkg_num =  1229 elapsed =  4385 ms
                      

                      Although I added "greaterThan(QT_MAJOR_VERSION, 5): QT += core5compat" to the .pro file,
                      I cannot use QString::indexOf(QRegExp) in Qt 6.2.

                        johnson54 wrote (#13):

                        @jeremy_k I put 'regular_expression.optimize();' before the whole loop, and it did not help.
