QString::indexOf(QRegularExpression) is slower than indexOf(QRegExp)

  johnson54 wrote (#1):

    In Qt 5.15.2, "str" is a QString whose length is commonly 32 KBytes. When I call str.indexOf, I find that
    str.indexOf(QRegularExpression(splitStr)) is much slower than str.indexOf(QRegExp(splitStr)). Is something wrong?

    JonB wrote (#2):

    @johnson54
    Can you give your comparative timings so we understand the difference? And you do understand that this means timing many repeats, not just a one-off?

      johnson54 wrote (#3):

      Here is my code and result. The ‘gnss.log’ file is about 12 MB.

      QRegExp  pkg_num =  1229 elapsed =  141 ms
      QRegularExpression  pkg_num =  1229 elapsed =  9168 ms
      
      #include <QCoreApplication>
      #include <QtDebug>
      #include <QFile>
      #include <QRegularExpression>
      #include <QRegExp>
      #include <QElapsedTimer>
      
      constexpr int READ_SIZE = 32*1024*1024;  // 32 MiB per read, so the 12 MB file fits in one chunk
      
      int TestSplit(int b_use_reg_exp)
      {
          QFile file("E:/gnss.log");
          QString split_str("GGA");
          if (file.open(QIODevice::ReadOnly) == false)
              return -1;
          auto p_buffer = new char[READ_SIZE];
          int finish_flag = 0;
          QString str_buff;
          int last_pos = 0;
          int pkg_num = 0;
          int pos = 0;
          while (1)
          {
              auto file_size = file.read(p_buffer, READ_SIZE);  // bytes read, or -1 on error
              if (file_size < READ_SIZE)
                  finish_flag = 1;  // short read: this is the last chunk
              str_buff.append(QString(QLatin1String(p_buffer, file_size)));
              if (b_use_reg_exp)
                  pos = str_buff.indexOf(QRegExp(split_str));
              else
                  pos = str_buff.indexOf(QRegularExpression(split_str));
      
              int offset_pos = -1;
              while (pos != -1)
              {
                  offset_pos = pos;
                  pkg_num++;
                  if (b_use_reg_exp)
                      pos = str_buff.indexOf(QRegExp(split_str), pos+1);
                  else
                      pos = str_buff.indexOf(QRegularExpression(split_str), pos+1);
              }
              str_buff.remove(0, offset_pos+1);
              if (offset_pos != -1)
                  last_pos += offset_pos + 1;
              if (finish_flag == 1)
                  break;
          }
          file.close();
          delete []p_buffer;
          return pkg_num;
      }
      
      int main(int argc, char *argv[])
      {
          QCoreApplication a(argc, argv);
          QElapsedTimer t_time;
          int pkg_num;
          t_time.start();
          pkg_num = TestSplit(true);
          qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
          t_time.start();
          pkg_num = TestSplit(false);
          qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
          return a.exec();
      }
      
      
        JonB wrote (#4):

        @johnson54
        Well, that does seem to be a pretty considerable difference!

        What you did not say, but show now, is that your splitter regular expression is not really a regular expression, just a constant string ("GGA"). I am not defending the timing, but it would be interesting to know whether this kind of difference also applies when the regular expression actually has some work to do. I assume you realise that for the splitter you have, you can just use int QString::indexOf(QLatin1String str, int from = 0, Qt::CaseSensitivity cs = Qt::CaseSensitive) const, and you ought to time that too.

        The other thought I have, and I don't know if this is a "thing": you construct the regular expression each time as a parameter to indexOf(). Regular expression construction can be expensive. Try taking the QRegExp(split_str) and QRegularExpression(split_str) outside the loop and using the already-constructed reg exp each time. Any difference? This should be done anyway when the reg exp does not change in the loop.

          johnson54 wrote (#5):

          @JonB
          I changed split_str to make it a regular expression (GG.,) and moved the construction of the regular expressions outside the loop.

          QRegExp  pkg_num =  1229 elapsed =  63 ms
          QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
          
          #include <QCoreApplication>
          #include <QtDebug>
          #include <QFile>
          #include <QRegularExpression>
          #include <QRegExp>
          #include <QElapsedTimer>
          
          constexpr int READ_SIZE = 32*1024*1024;
          
          int TestSplit(int b_use_reg_exp)
          {
              QFile file("E:/gnss.log");
              QString split_str("GG.,");
              if (file.open(QIODevice::ReadOnly) == false)
                  return -1;
              auto p_buffer = new char[READ_SIZE];
              int finish_flag = 0;
              QString str_buff;
              int last_pos = 0;
              int pkg_num = 0;
              int pos = 0;
              // Both patterns are now constructed once, before the read loop.
              auto reg_exp = QRegExp(split_str);
              auto regular_expression = QRegularExpression(split_str);
              while (1)
              {
                  auto file_size = file.read(p_buffer, READ_SIZE);
                  if (file_size < READ_SIZE)
                      finish_flag = 1;
                  str_buff.append(QString(QLatin1String(p_buffer, file_size)));
                  if (b_use_reg_exp)
                      pos = str_buff.indexOf(reg_exp);
                  else
                      pos = str_buff.indexOf(regular_expression);
          
                  int offset_pos = -1;
                  while (pos != -1)
                  {
                      offset_pos = pos;
                      pkg_num++;
                      if (b_use_reg_exp)
                          pos = str_buff.indexOf(reg_exp, pos+1);
                      else
                          pos = str_buff.indexOf(regular_expression, pos+1);
                  }
                  str_buff.remove(0, offset_pos+1);
                  if (offset_pos != -1)
                      last_pos += offset_pos + 1;
                  if (finish_flag == 1)
                      break;
              }
              file.close();
              delete []p_buffer;
              return pkg_num;
          }
          
          int main(int argc, char *argv[])
          {
              QCoreApplication a(argc, argv);
              QElapsedTimer t_time;
              int pkg_num;
              t_time.start();
              pkg_num = TestSplit(true);
              qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
              t_time.start();
              pkg_num = TestSplit(false);
              qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
              return a.exec();
          }
          
          
            johnson54 wrote (#6):

            I used QRegExp instead of QRegularExpression in Qt 5.15. When porting to Qt 6, I found that QString::indexOf(QRegExp) is deprecated, so I changed to QString::indexOf(QRegularExpression), but the performance is terrible. I then tested both in Qt 5.15 and found a considerable difference in performance. However, many people say QRegularExpression performs better, so I want to know whether I am using it wrong.

              JonB wrote (#7):

              @johnson54
              Interesting findings :)

              It's nice to see that taking the regular expression construction out of the loop saves time in both cases (over a second for QRegularExpression), isn't it?!

              But your difference remains, and now the ratio is even greater. More than 100x is not good! So much so that I hope one of our Qt experts will care to comment further....

              You wrote: "However, many people said QRegularExpression enjoys better performance, I want to know if I use it wrong."

              Not from your example, apparently! You are using it fine.

                SGaist (Lifetime Qt Champion) wrote (#8):

                Hi,

                Do you have the same performance hit with both Qt 6 and Qt 5?

                Which version of Qt 6 did you try?

                On which platform?


                  jeremy_k wrote (#9):

                  Calling QRegularExpression::optimize() prior to entering the loop might help. Using QRegularExpression::globalMatch() instead of QString::indexOf() is also worth investigating.


                    JonB wrote (#10):

                    I note we are told:

                    "The QRegularExpression class introduced in Qt 5 is a big improvement upon QRegExp, in terms of APIs offered, supported pattern syntax and speed of execution"

                    :)

                    @johnson54
                    You have not said: are you running these tests compiled for debug? If so, can you see what the timings are if you compile for release?

                      johnson54 wrote (#11):

                      @JonB
                      With Qt 5.15.2 on Windows 10, MSVC 2019 32-bit:
                      Release mode:

                      QRegExp  pkg_num =  1229 elapsed =  44 ms
                      QRegularExpression  pkg_num =  1229 elapsed =  4142 ms
                      

                      Debug Mode:

                      QRegExp  pkg_num =  1229 elapsed =  63 ms
                      QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
                      
                        johnson54 wrote (#12):

                        @SGaist
                        In Qt 6, QRegularExpression has about the same performance as in Qt 5.15.2 (Windows 10, MSVC 2019 32-bit).
                        With Qt 6.2 on Windows 10, MSVC 2019 64-bit, Release mode, the result is:

                        QRegularExpression  pkg_num =  1229 elapsed =  4385 ms
                        

                        Although I added "greaterThan(QT_MAJOR_VERSION, 5): QT += core5compat" to the .pro file,
                        I cannot use QString::indexOf(QRegExp) in Qt 6.2.

                          johnson54 wrote (#13):

                          @jeremy_k I put 'regular_expression.optimize();' before the whole loop, and it did not help.
