QString::indexOf(QRegularExpression) is slower than indexOf(QRegExp)

    johnson54 wrote (#1):

    In Qt 5.15.2, "str" is a QString whose length is typically about 32 KB. When I call str.indexOf, I find that
    str.indexOf(QRegularExpression(splitStr)) is much slower than str.indexOf(QRegExp(splitStr)). Is something wrong?

      JonB wrote (#2):

      @johnson54
      Can you give your comparative timings so we understand the difference? And you do understand that this means timing many repetitions, not just a one-off?

        johnson54 wrote (#3):

        Here is my code and the result. The 'gnss.log' file is about 12 MB.

        QRegExp  pkg_num =  1229 elapsed =  141 ms
        QRegularExpression  pkg_num =  1229 elapsed =  9168 ms
        
        #include <QCoreApplication>
        #include <QtDebug>
        #include <QFile>
        #include <QRegularExpression>
        #include <QRegExp>
        #include <QElapsedTimer>
        
        constexpr int READ_SIZE = 32*1024*1024;
        
        int TestSplit(int b_use_reg_exp)
        {
            QFile file("E:/gnss.log");
            QString split_str("GGA");
            if (file.open(QIODevice::ReadOnly) == false)
                return -1;
            auto p_buffer = new char[READ_SIZE];
            int finish_flag = 0;
            QString str_buff;
            int last_pos = 0;
            int pkg_num = 0;
            int pos = 0;
            while (1)
            {
                auto file_size = file.read(p_buffer, READ_SIZE);
                if (file_size < READ_SIZE)
                    finish_flag = 1;
                str_buff.append(QString(QLatin1String(p_buffer, file_size)));
                if (b_use_reg_exp)
                    pos = str_buff.indexOf(QRegExp(split_str));
                else
                    pos = str_buff.indexOf(QRegularExpression(split_str));
        
                int offset_pos = -1;
                while (pos != -1)
                {
                    offset_pos = pos;
                    pkg_num++;
                    if (b_use_reg_exp)
                        pos = str_buff.indexOf(QRegExp(split_str), pos+1);
                    else
                        pos = str_buff.indexOf(QRegularExpression(split_str), pos+1);
                }
                str_buff.remove(0, offset_pos+1);
                if (offset_pos != -1)
                    last_pos += offset_pos + 1;
                if (finish_flag == 1)
                    break;
            }
            file.close();
            delete []p_buffer;
            return pkg_num;
        }
        
        int main(int argc, char *argv[])
        {
            QCoreApplication a(argc, argv);
            QElapsedTimer t_time;
            int pkg_num;
            t_time.start();
            pkg_num = TestSplit(true);
            qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
            t_time.start();
            pkg_num = TestSplit(false);
            qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
            return a.exec();
        }
        
        
          JonB wrote (#4):

          @johnson54
          Well that does seem to be a pretty considerable difference!

          What you did not say, but show now, is that your splitter regular expression is not a regular expression, just a constant string ("GGA"). I am not defending the timing, but it might be interesting to know whether this kind of difference also applies when the regular expression actually has some work to do. I assume you realise that for the splitter you have, you can just use int QString::indexOf(QLatin1String str, int from = 0, Qt::CaseSensitivity cs = Qt::CaseSensitive) const, and you ought to time that too.

          The other thought I have, and I don't know if this is a "thing". You construct the regular expression each time as a parameter to indexOf(). Regular expression construction can be expensive. Try taking the QRegExp(split_str) & QRegularExpression(split_str) outside the loop and use the already-constructed reg exp each time. Any difference? This should be done anyway when the reg exp does not change in the loop.

            johnson54 wrote (#5):

            @JonB
            I changed split_str to an actual regular expression (GG.,) and moved the construction of both regular expression objects outside the loop.

            QRegExp  pkg_num =  1229 elapsed =  63 ms
            QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
            
            #include <QCoreApplication>
            #include <QtDebug>
            #include <QFile>
            #include <QRegularExpression>
            #include <QRegExp>
            #include <QElapsedTimer>
            
            constexpr int READ_SIZE = 32*1024*1024;
            
            int TestSplit(int b_use_reg_exp)
            {
                QFile file("E:/gnss.log");
                QString split_str("GG.,");
                if (file.open(QIODevice::ReadOnly) == false)
                    return -1;
                auto p_buffer = new char[READ_SIZE];
                int finish_flag = 0;
                QString str_buff;
                int last_pos = 0;
                int pkg_num = 0;
                int pos = 0;
                auto reg_exp = QRegExp(split_str);
                auto regular_expression = QRegularExpression(split_str);
                while (1)
                {
                    auto file_size = file.read(p_buffer, READ_SIZE);
                    if (file_size < READ_SIZE)
                        finish_flag = 1;
                    str_buff.append(QString(QLatin1String(p_buffer, file_size)));
                    if (b_use_reg_exp)
                        pos = str_buff.indexOf(reg_exp);
                    else
                        pos = str_buff.indexOf(regular_expression);
            
                    int offset_pos = -1;
                    while (pos != -1)
                    {
                        offset_pos = pos;
                        pkg_num++;
                        if (b_use_reg_exp)
                            pos = str_buff.indexOf(reg_exp, pos+1);
                        else
                            pos = str_buff.indexOf(regular_expression, pos+1);
                    }
                    str_buff.remove(0, offset_pos+1);
                    if (offset_pos != -1)
                        last_pos += offset_pos + 1;
                    if (finish_flag == 1)
                        break;
                }
                file.close();
                delete []p_buffer;
                return pkg_num;
            }
            
            int main(int argc, char *argv[])
            {
                QCoreApplication a(argc, argv);
                QElapsedTimer t_time;
                int pkg_num;
                t_time.start();
                pkg_num = TestSplit(true);
                qDebug() << "QRegExp " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
                t_time.start();
                pkg_num = TestSplit(false);
                qDebug() << "QRegularExpression " << "pkg_num = " << pkg_num << "elapsed = " << t_time.elapsed() << "ms";
                return a.exec();
            }
            
            
              johnson54 wrote (#6):

              I used QRegExp rather than QRegularExpression in Qt 5.15. When porting to Qt 6, I found that QString::indexOf(QRegExp) is deprecated, so I switched to QString::indexOf(QRegularExpression), but the performance is terrible. I then tested both in Qt 5.15 and found a considerable difference in performance. However, many people say QRegularExpression performs better, so I want to know whether I am using it wrong.

                JonB wrote (#7):

                @johnson54
                Interesting findings :)

                It's nice to see that taking the regular expression construction out of the loop saves time in both cases (141 ms down to 63 ms for QRegExp, and roughly 1.5 seconds for QRegularExpression), isn't it?!

                But your difference remains, and now it's an even greater percentage. More than 100x is not good! So much so that I hope one of our Qt experts will care to comment further....

                "However, many people said QRegularExpression enjoys better performance, I want to know if I use it wrong."

                Not from your example, apparently! You are using it fine.

                  SGaist (Lifetime Qt Champion) wrote (#8):

                  Hi,

                  Do you have the same performance hit with both Qt 6 and Qt 5?

                  Which version of Qt 6 did you try?

                  On which platform?

                    jeremy_k wrote (#9):

                    Calling QRegularExpression::optimize() prior to entering the loop might help. Using QRegularExpression::globalMatch() instead of QString::indexOf() is also worth investigating.

                      JonB wrote (#10):

                      I note we are told:

                      The QRegularExpression class introduced in Qt 5 is a big improvement upon QRegExp, in terms of APIs offered, supported pattern syntax and speed of execution

                      :)

                      @johnson54
                      You have not said: are you running these tests compiled for debug? If so, can you see what the timings are if you compile for release?

                        johnson54 wrote (#11):

                        @JonB
                        In Qt 5.15.2, Windows 10, MSVC 2019-32bit
                        Release Mode:

                        QRegExp  pkg_num =  1229 elapsed =  44 ms
                        QRegularExpression  pkg_num =  1229 elapsed =  4142 ms
                        

                        Debug Mode:

                        QRegExp  pkg_num =  1229 elapsed =  63 ms
                        QRegularExpression  pkg_num =  1229 elapsed =  7716 ms
                        
                          johnson54 wrote (#12):

                          @SGaist
                          In Qt 6, QRegularExpression shows the same performance as in Qt 5.15.2 (Windows 10, MSVC 2019 32-bit).
                          With Qt 6.2, Windows 10, MSVC 2019 64-bit, Release mode, the result is:

                          QRegularExpression  pkg_num =  1229 elapsed =  4385 ms
                          

                          Although I added "greaterThan(QT_MAJOR_VERSION, 5): QT += core5compat" to the .pro file,
                          I cannot use QString::indexOf(QRegExp) in Qt 6.2.

                            johnson54 wrote (#13):

                            @jeremy_k I put 'regular_expression.optimize();' before the loop, and it did not help.
