Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB

64 Posts 7 Posters 11.5k Views 2 Watching
  • Christian EhrlicherC Christian Ehrlicher

    @DerReisende said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

    const auto result = line.split(semicolon);

    Please read my last post about QStringView

    DerReisende
    wrote on last edited by
    #43

    @Christian-Ehrlicher It is just a port of @J-Hilk code to run with 6.4. I do not intend to rework it for QStringView - and my first attempt was incorrect anyways.

      Christian Ehrlicher
      Lifetime Qt Champion
      wrote on last edited by
      #44

      @DerReisende said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

      It is just a port of @J-Hilk code to run with 6.4.

      But the port is wrong - you're now creating useless QString objects ...

      Qt Online Installer direct download: https://download.qt.io/official_releases/online_installers/
      Visit the Qt Academy at https://academy.qt.io/catalog

        DerReisende
        wrote on last edited by
        #45

        @Christian-Ehrlicher Maybe I don't get the problem:
        .splitRef() does not exist anymore in 6.4. .split() returns a QStringList, which is implicitly shared according to the docs. Where is the mistake, and how would you get to a QStringRef/QStringView version on Qt 6.4?

          Christian Ehrlicher
          Lifetime Qt Champion
          wrote on last edited by
          #46

          @DerReisende said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

          Where is the mistake and how would you come to a QStringRef/QStringView version on Qt 6.4?

          QString::splitRef() - creates a QList/QVector of references to parts of a QString; no QString objects are created, nothing gets copied
          QString::split() - creates a QList/QVector of newly created QString objects

          To avoid creating the QString objects and get the same behavior as before, use QStringView::split()
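
The difference described above can be sketched without Qt. This is only an analogy, assuming C++17: `std::string` stands in for QString and `std::string_view` for QStringView, and `splitCopy`/`splitView` are hypothetical helper names, not Qt API. The copying version builds a new string object per field, while the view version only records where each field sits inside the original line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Copying split: every field becomes its own std::string object,
// roughly what QString::split() does with QString.
std::vector<std::string> splitCopy(const std::string &line, char sep) {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = line.find(sep, start);
        if (pos == std::string::npos) {
            out.emplace_back(line.substr(start));
            return out;
        }
        out.emplace_back(line.substr(start, pos - start));
        start = pos + 1;
    }
}

// View split: every field is a (pointer, length) pair into `line`.
// No characters are copied, so `line` must outlive the returned views.
std::vector<std::string_view> splitView(std::string_view line, char sep) {
    std::vector<std::string_view> out;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = line.find(sep, start);
        if (pos == std::string_view::npos) {
            out.push_back(line.substr(start));
            return out;
        }
        out.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
}
```

The views returned by `splitView` point into the caller's buffer, which is why the result must not be kept after the line buffer is overwritten by the next read.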

          • D DerReisende

            Here is a version of @J-Hilk 's program modified to run with Qt 6.4:

            int main(int argc, char **argv) {
                QFile *testFile = new QFile(R"(C:\Users\aea_t\Downloads\TestFile.txt)");
            
            
                testFile->open(QFile::ReadOnly);
                Instrument instr;
                Tick t;
            
                QElapsedTimer *parseTimer2 = new QElapsedTimer();
                parseTimer2->start();
                testFile->reset();
                instr.tickList.reserve(1000000);
            
                auto DateTimeParser = [](const QStringView string) -> QDateTime {
                    const QDate date(string.left(4).toInt(), string.mid(4, 2).toInt(), string.mid(6, 2).toInt());
                    const QTime time(string.mid(9, 2).toInt(), string.mid(11, 2).toInt(), string.mid(13, 2).toInt(),
                                     string.mid(15, 3).toInt());
                    QDateTime dt(date, time);
                    return dt;
                };
            
                QTextStream readFile(testFile);
                QString line;
                line.reserve(100);
                const QChar semicolon(';');
                while (!readFile.atEnd()) {
                    if (!readFile.readLineInto(&line, 100)) {
                        break;
                    }
                    const auto result = line.split(semicolon);
                    instr.tickList.append(
                            {DateTimeParser(result.at(0)), result.at(1).toFloat(), result.at(2).toFloat(), result.at(3).toFloat(),
                             result.at(4).toUInt()});  // fifth field; the posted code read index 3 twice
                }
                qDebug().noquote() << QString("Qt parse time: %1ms")
                        .arg(parseTimer2->elapsed());
            }
            

            Before we continue complaining about Qt, it would be great to know which OS @TheLumbee is running the code on.
            All solutions seem to run much slower on Windows than on Linux and macOS.
            I tried the above version on my Win 11 machine with VS2022 and it takes around 32 seconds to complete (compared to his 1.4 seconds on macOS). I don't have a MinGW installation for comparison, unfortunately :(

            TheLumbee
            wrote on last edited by TheLumbee
            #47

            @DerReisende I'm running on Ubuntu 22.04 64-bit. I'm currently creating the Rust lib. Got the shared lib and C-header file generated, but having an issue linking it all. So hopefully that will be resolved soon.

            • JonBJ JonB

              @TheLumbee
              From where you are now. If you comment out the datetime handling in the C++ (and the Rust if you like), is your performance timing for the Qt/C++ acceptable compared to the Rust? So it is only the QDateTime parsing which is the issue for you?

              TheLumbee
              wrote on last edited by
              #48

              @JonB Unfortunately, no. The solutions provided are by far leaps and bounds better than before, but the file sizes I'm expecting will definitely cause an issue if it's all done in Qt. I'm currently writing a Rust lib to resolve the problem for now, but even a 2-3x increase with Rust will make a monumental difference.

                DerReisende
                wrote on last edited by
                #49

                @Christian-Ehrlicher
                Ok I modified it with the following:

                        //const auto result = line.split(semicolon);
                        const QStringView sv{line};
                        const auto result = sv.split(semicolon);
                

                Doesn't make a difference in runtime. And looking at the QString split code I am almost sure that line.split already does this optimization through implicit sharing through QStringList.

                  Christian Ehrlicher
                  Lifetime Qt Champion
                  wrote on last edited by
                  #50

                  @DerReisende said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                  I am almost sure that line.split already does this optimization through implicit sharing through QStringList.

                  No, implicit sharing of a container has nothing to do with creating new QString objects - QStringList is a list of string objects, not a list of QStringViews ...

                    JoeCFD
                    wrote on last edited by JoeCFD
                    #51

                    @TheLumbee That is what I did with Java code before. When I got any bottleneck in my Java apps, I tried to use C/C++ code to do the jobs. Good luck!

                      TheLumbee
                      wrote on last edited by TheLumbee
                      #52

                      Update

                      I successfully created the Rust lib and it's passing all the data to Qt. Rust converts the dateTime portion to mSecsSinceEpoch and then it passes that to Qt to create a DateTime object. Much faster that way.

                      This wasn't the preferred choice, but it does the job faster than I've ever seen. I do appreciate everyone's input and help. Hopefully in the future Qt would be able to match the performance of Rust with this particular issue.

                      Also, I'm going to run some tests with this Rust lib on Windows to see if it's still glacial compared to Unix.

                        TheLumbee
                        wrote on last edited by
                        #53

                        Windows Update

                        After testing on Windows, I'm actually getting the same performance as Linux. So, it must be a C++ issue with Windows. I've tested with MinGW and MSVC with parsing and it's nearly unusable.

                          Christian Ehrlicher
                          Lifetime Qt Champion
                          wrote on last edited by
                          #54

                          @TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                          So, it must be a C++ issue with Windows

                          Again - use plain C++ and not Qt - the QDateTime parsing is painfully slow...
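
One way to read "use plain C++" is to compute milliseconds since the epoch by hand and only then build the date-time object, much like the `QDateTime::fromMSecsSinceEpoch()` call in post #56. The following is a minimal sketch, not code from the thread. It assumes the fixed `yyyyMMdd HHmmsszzz` layout implied by the offsets in the `DateTimeParser` lambda quoted earlier, treats the timestamp as UTC with no time zone or daylight handling, and only range-checks the fields. `tickToMSecsSinceEpoch`, `digits` and `daysFromCivil` are hypothetical names. The day count uses Howard Hinnant's well-known days-from-civil algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Days between 1970-01-01 and y-m-d in the proleptic Gregorian calendar.
constexpr std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d) {
    y -= m <= 2;  // treat January and February as part of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);            // [0, 399]
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1; // [0, 365]
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Reads `count` ASCII digits starting at `pos`; returns -1 on any non-digit.
// The caller guarantees that pos + count <= s.size().
constexpr int digits(std::string_view s, std::size_t pos, std::size_t count) {
    int value = 0;
    for (std::size_t i = pos; i < pos + count; ++i) {
        if (s[i] < '0' || s[i] > '9') return -1;
        value = value * 10 + (s[i] - '0');
    }
    return value;
}

// Parses "yyyyMMdd HHmmsszzz" (index 8 is a separator that is not inspected)
// into milliseconds since 1970-01-01T00:00:00 UTC, or nullopt on bad input.
std::optional<std::int64_t> tickToMSecsSinceEpoch(std::string_view s) {
    if (s.size() < 18) return std::nullopt;
    const int year = digits(s, 0, 4), month = digits(s, 4, 2), day = digits(s, 6, 2);
    const int hour = digits(s, 9, 2), minute = digits(s, 11, 2), second = digits(s, 13, 2);
    const int msec = digits(s, 15, 3);
    if (year < 0 || month < 1 || month > 12 || day < 1 || day > 31 ||
        hour < 0 || hour > 23 || minute < 0 || minute > 59 ||
        second < 0 || second > 59 || msec < 0) {
        return std::nullopt;
    }
    const std::int64_t days = daysFromCivil(year, static_cast<unsigned>(month),
                                            static_cast<unsigned>(day));
    // Minutes since the epoch scaled to milliseconds, plus seconds and msecs.
    return ((days * 24 + hour) * 60 + minute) * 60 * 1000 + second * 1000 + msec;
}
```

Because the parser works on a `std::string_view`, it can be fed a slice of the line buffer directly, without creating an intermediate string per timestamp.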

                          • T TheLumbee

                            @JonB @J-Hilk @DerReisende Thanks for all the responses! Didn't actually expect much here. I apologize for not providing more details. I've been dealing with this file parsing issue in C++ for years. The same code on Windows takes >100x longer to complete than on Linux for some odd reason, which I posted about in C++ forums prior to using Qt, but what you've provided is actually the first significant improvement I've ever seen.

                            So thank you for that!

                            I've tested this with versions 5.12, 5.15, 6.0, 6.2.4, and 6.4. Never noticed a major difference between them regarding this issue. Current machine: i7-6700 with 32GB RAM. So not sure what y'all are working with but the results seem promising.

                            I was previously streaming into a QTextStream then reading line-by-line but came across this post: https://forum.qt.io/topic/98282/parsing-large-big-text-files-quickly and a couple of others that suggested that is more expensive than using a QByteArray. I didn't notice much difference to be quite honest.

                            I did comment out the QDateTime parsing just to check and it was a significant improvement. Not quite like Rust but I'll attribute that to @JonB comment:

                            That "naive" means it does not do any local time/daylight etc, conversions.
                            

                            If any of you are interested, I'll test each of your solutions and provide an update. But this actually woke me up and got me excited to start my day so thank you.

                            JonB
                            wrote on last edited by
                            #55

                            @TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                            I've tested this with versions 5.12, 5.15, 6.0, 6.2.4, and 6.4. Never noticed a major difference between them regarding this issue.

                            Do you have access to both Qt 6.4 and C++20? If so, can you try combining
                            https://doc.qt.io/qt-6/qdatetime.html#fromStdTimePoint-1 (QDateTime QDateTime::fromStdTimePoint(const std::chrono::local_time<std::chrono::milliseconds> &time))
                            https://en.cppreference.com/w/cpp/chrono/parse (std::chrono::parse())
                            to see whether the datetime conversion part now matches Rust's?

                            For right or for wrong, I have appended a post to https://bugreports.qt.io/browse/QTBUG-97489 about this whole issue of parsing datetimes into QDateTime, as I am concerned it is a "show-stopper" if you have a large amount of string datetime data you need to get into Qt's QDateTime.

                              Christian Ehrlicher
                              Lifetime Qt Champion
                              wrote on last edited by Christian Ehrlicher
                              #56

                              OK, some C++20 magic to use C++ instead of C, but not really optimized:

                              testFile.open(QFile::ReadOnly);
                              instr.tickList.clear();
                              instr.tickList.reserve(1000000);
                              QElapsedTimer parseTimer1;
                              parseTimer1.start();
                              const QByteArrayList allData = testFile.readAll().split('\n');
                              for (const auto &line : allData)
                              {
                                  const QByteArrayList data = line.split(';');
                                  std::string str = data.at(0).data();   // TODO: use data.at(0).toStdString()
                                  str[15] = '.';                         // sadly needed for correct msec parsing
                                  std::istringstream stream(str);
                                  std::chrono::sys_time<std::chrono::milliseconds> tTimePoint;
                                  std::chrono::from_stream(stream, "%Y%m%d %H%M%S", tTimePoint);
                                  instr.tickList.push_back({
                                      QDateTime::fromMSecsSinceEpoch(tTimePoint.time_since_epoch().count()),
                                      data.at(1).toDouble(),
                                      data.at(2).toDouble(),
                                      data.at(3).toDouble(),
                                      data.at(4).toInt()
                                  });
                              }
                              qDebug().noquote() << QString("Qt parse time: %1ms").arg(parseTimer1.elapsed());
                              
                              compared to @J-Hilk 's version:
                              
                              compiled with debug:
                              Qt parse time: 32125ms
                              Qt parse time: 154511ms
                              
                              compiled with release:
                              Qt parse time: 17926ms
                              Qt parse time: 157748ms
                              
                              Attention: Qt debug libs, the emplace_back() doesn't help much.

                                TheLumbee
                                wrote on last edited by
                                #57

                                @Christian-Ehrlicher Before I ever used Qt, I was facing the same issue with C++. I initially believed it was a filesystem difference and posted in a forum here: https://cplusplus.com/forum/general/254030/

                                Near the end of the thread you'll see that others noticed the same issue with Windows. I just find it odd that this blatant difference has never been noticed, at least in a major way.

                                  JonB
                                  wrote on last edited by
                                  #58

                                  @TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                                  Before I ever used Qt, I was facing the same issue with C++.

                                  I do think this thread is getting confused. You certainly seem to talk in this thread about various different aspects of your speed with Rust/C++/Qt/Windows/file I/O all mixed into one. One has to deal with these separately. The issue @Christian-Ehrlicher and I, at least, are discussing now is specifically what to do about QDateTime::fromString(), which is by far the major contributor to your slowdown compared to Rust; other items are minor. The proposal is that if one has Qt 6.4+ and C++20, then std::chrono can be used to parse the string input to a "naive datetime" (and I have a hunch that is what Rust uses) and that converted to a QDateTime in considerably better time than QDateTime::fromString().

                                  This is quite distinct from e.g. the time taken to read the large file under Windows.

                                    TheLumbee
                                    wrote on last edited by
                                    #59

                                    @JonB said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                                    The issue @Christian-Ehrlicher and I, at least, are discussing now is specifically what to do about QDateTime::fromString(), which is by far the major contributor to your slowdown compared to Rust; other items are minor. The proposal is that if one has Qt 6.4+ and C++20, then std::chrono can be used to parse the string input to a "naive datetime" (and I have a hunch that is what Rust uses) and that converted to a QDateTime in considerably better time than QDateTime::fromString().

                                    Apologies. I don't have C++20, but I can set up an environment to test it. But even without the DateTime, Rust is parsing the file 2-3x faster than C++, with just floats and ints. Maybe C++20 has some improvements in that domain, but I'll set up an environment to test this.

                                      JonB
                                      wrote on last edited by
                                      #60

                                      @TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:

                                      But even without the DateTime, Rust is parsing the file 2-3x faster than C++, with just floats and ints.

                                      I do understand this. But I suggest this is a separate issue from the QDateTime. You started with 40x faster. Dealing with QDateTime is the first priority. File reading or parsing ints and floats is a separate issue requiring its own solution.
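
For the separate question of parsing the int and float fields, one plain C++17 option is `std::from_chars`, which is locale-independent and does not allocate. This is a sketch under assumptions rather than a measured fix: nobody in the thread benchmarked it, `parseNumber` is a hypothetical helper name, and the floating-point overloads need a reasonably recent standard library.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the whole field as a number of type T (an integer or floating-point
// type). Returns nullopt if the field is empty, malformed, out of range, or
// has trailing characters after the number.
template <typename T>
std::optional<T> parseNumber(std::string_view field) {
    T value{};
    const char *first = field.data();
    const char *last = first + field.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

A field taken from a split line could then be converted with, for example, `parseNumber<float>(field)` or `parseNumber<unsigned>(field)`, with the caller deciding what to do on `nullopt`.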

                                        TheLumbee
                                        wrote on last edited by
                                        #61

                                        @JonB I agree. I'll do some testing with this today and provide an update.

                                        Thanks again for the help.

                                          TheLumbee
                                          wrote on last edited by
                                          #62

                                          To provide an update, the solution provided by @JonB 's "final offering" in this post parses files a little quicker. Although better than my initial approach, still not nearly as fast as the Rust solution. For now, I'm sticking with the Rust lib I wrote but I do think this points out some performance enhancements that can be made on the C++ side of things.
