Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
So, it must be a C++ issue with Windows
Again - use plain C++ and not Qt - the QDateTime parsing is painfully slow...
@Christian-Ehrlicher Before I ever used Qt, I was facing the same issue with C++. I initially believed it was a filesystem difference and posted in a forum here: https://cplusplus.com/forum/general/254030/
Near the end of the thread you'll see that others noticed the same issue with Windows. I just find it odd that this blatant difference has never been noticed, at least in a major way.
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Before I ever used Qt, I was facing the same issue with C++.
I do think this thread is getting confused. You seem to be discussing several different aspects of your speed problem (Rust, C++, Qt, Windows, file I/O) all mixed into one. One has to deal with these separately. The issue @Christian-Ehrlicher and I, at least, are discussing now is specifically what to do about QDateTime::fromString(), which is by far the major contributor to your slowness compared to Rust; other items are minor. The proposal is that, if one has Qt 6.4+ and C++20, std::chrono can be used to parse the string input to a "naive datetime" (and I have a hunch that is what Rust uses), which is then converted to a QDateTime in considerably better time than QDateTime::fromString(). This is quite distinct from e.g. the time taken to read the large file under Windows.
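To make the "naive datetime" idea concrete, here is a minimal sketch in plain C++17. It does not use std::chrono parsing (toolchain support for that varies); instead it swaps in hand-written fixed-width parsing with std::from_chars. It assumes the "yyyyMMdd HHmmss zzz" layout that appears later in this thread and treats the timestamp as UTC. The names fixedInt, daysFromCivil and parseTimestampMs are made up for illustration. The resulting millisecond count could presumably be handed to QDateTime::fromMSecsSinceEpoch() with a UTC time spec, but that step is not shown or measured here.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses the `width` characters at `pos` as one integer; nullopt if they are not all consumed.
inline std::optional<int> fixedInt(std::string_view s, std::size_t pos, std::size_t width) {
    if (pos + width > s.size()) return std::nullopt;
    const char* first = s.data() + pos;
    const char* last = first + width;
    int value = 0;
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// Days since 1970-01-01 in the proleptic Gregorian calendar
// (the well-known days_from_civil algorithm by Howard Hinnant).
inline std::int64_t daysFromCivil(std::int64_t y, unsigned m, unsigned d) {
    assert(m >= 1 && m <= 12);
    y -= (m <= 2) ? 1 : 0;                               // treat Jan/Feb as months of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned mp = (m > 2) ? m - 3 : m + 9;         // March = 0 ... February = 11
    const unsigned doy = (153 * mp + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

// Assumed layout: "yyyyMMdd HHmmss zzz" (anything after the milliseconds is ignored).
// Returns milliseconds since the Unix epoch, interpreting the fields as UTC.
// Only coarse range checks are done; e.g. "February 31" is not rejected.
inline std::optional<std::int64_t> parseTimestampMs(std::string_view s) {
    const auto year = fixedInt(s, 0, 4), month = fixedInt(s, 4, 2), day = fixedInt(s, 6, 2);
    const auto hour = fixedInt(s, 9, 2), minute = fixedInt(s, 11, 2), second = fixedInt(s, 13, 2);
    const auto msec = fixedInt(s, 16, 3);
    if (!year || !month || !day || !hour || !minute || !second || !msec) return std::nullopt;
    if (*month < 1 || *month > 12 || *day < 1 || *day > 31) return std::nullopt;
    if (*hour < 0 || *hour > 23 || *minute < 0 || *minute > 59) return std::nullopt;
    if (*second < 0 || *second > 59 || *msec < 0) return std::nullopt;
    const std::int64_t days =
        daysFromCivil(*year, static_cast<unsigned>(*month), static_cast<unsigned>(*day));
    const std::int64_t minutes = (days * 24 + *hour) * 60 + *minute;
    return minutes * 60 * 1000 + *second * 1000 + *msec;
}
```

The point of the sketch is only that a fixed layout needs no format-string interpretation per call; whether it closes the gap to Rust would have to be measured on the real data.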
-
@JonB said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
The issue @Christian-Ehrlicher and I, at least, are discussing now is specifically what to do about QDateTime::fromString(), which is by far the major contributor to your slowness compared to Rust; other items are minor. The proposal is that, if one has Qt 6.4+ and C++20, std::chrono can be used to parse the string input to a "naive datetime" (and I have a hunch that is what Rust uses), which is then converted to a QDateTime in considerably better time than QDateTime::fromString().
Apologies. I don't have C++20, but I can set up an environment to test it. But even without the DateTime, Rust is parsing the file 2-3x faster than C++, with just floats and ints. Maybe C++20 has some improvements in that domain, which I can check once the environment is set up.
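For the floats-and-ints part, a minimal sketch of parsing one line without Qt containers might look like the following. It assumes the semicolon-separated layout "timestamp;last;bid;ask;volume" suggested by the loop posted later in this thread, and that the line has already had any trailing newline stripped. Tick and parseTick are hypothetical names. std::strtod is used for the doubles because it is widely available; note that it depends on the C locale for the decimal point. No timing claims are made for this code.

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical record for the numeric part of one line.
struct Tick {
    double last = 0.0, bid = 0.0, ask = 0.0;
    int volume = 0;
};

// Parses "timestamp;last;bid;ask;volume"; the timestamp field is skipped here.
// Takes std::string so the buffer is guaranteed to be null-terminated for strtod.
inline std::optional<Tick> parseTick(const std::string& line) {
    const char* p = line.c_str();
    const char* const end = p + line.size();

    // Skip the first field (the timestamp).
    const char* sep = std::find(p, end, ';');
    if (sep == end) return std::nullopt;
    p = sep + 1;

    Tick t;
    double* const prices[] = {&t.last, &t.bid, &t.ask};
    for (double* price : prices) {
        char* stop = nullptr;
        *price = std::strtod(p, &stop);                          // stops at the first non-numeric character
        if (stop == p || *stop != ';') return std::nullopt;      // nothing parsed, or field not followed by ';'
        p = stop + 1;
    }

    // The last field is an integer and must run to the end of the line.
    const auto [ptr, ec] = std::from_chars(p, end, t.volume);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    return t;
}
```

Walking the buffer with pointers avoids creating a list of temporary substrings per line, which is one plausible (but here unverified) source of overhead in a split-based loop.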
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
But even without the DateTime, Rust is parsing the file 2-3x faster than C++, with just floats and ints.
I do understand this. But I suggest this is a separate issue from the QDateTime. You started with 40x faster. Dealing with QDateTime is the first priority. File reading or parsing ints and floats is a separate issue requiring its own solution.
-
To provide an update: the solution in @JonB's "final offering" in this post parses files a little quicker. Although it is better than my initial approach, it is still not nearly as fast as the Rust solution. For now, I'm sticking with the Rust lib I wrote, but I do think this points to performance improvements that could be made on the C++ side of things.
-
@TheLumbee said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
Although
For MSVC++ you should write a bug report for them with test data. There seems to be a major performance issue with the MS C++ library, as it is much slower than other solutions. Maybe they can provide an update in a future version or suggest workarounds to improve performance on Windows.
-
Hello. I know this is an old topic.
I couldn't test as the test file doesn't seem accessible anymore.
Assuming you were testing on Linux with glibc, could you try setting the TZ environment variable? e.g.:
export TZ=":/etc/localtime"
See also:
https://sourceware.org/bugzilla/show_bug.cgi?id=24004
https://bugreports.qt.io/browse/QTBUG-77948
-
I just wanted to give this thread kudos. I copied the following function from this thread into my code:
auto DateTimeParser = [](const QStringRef &string) -> QDateTime {
    // Fixed offsets: yyyyMMdd at 0, HHmmss at 9, milliseconds at 15.
    const QDate date(string.left(4).toInt(), string.mid(4, 2).toInt(), string.mid(6, 2).toInt());
    const QTime time(string.mid(9, 2).toInt(), string.mid(11, 2).toInt(), string.mid(13, 2).toInt(), string.mid(15, 3).toInt());
    QDateTime dt(date, time);
    return dt;
};
and the performance improved dramatically. I was parsing approximately 1 million datetimes and it took 90-100 seconds using QDateTime::fromString(). Using this function it now takes 23.5 seconds, roughly a 4x increase in performance.
Thank you!!
-
Necromancy, sorry, but there was a point made a while back that kind of zipped by without comment, and I wanted to circle back to it:
@Christian-Ehrlicher said in Rust file parsing significantly faster than Qt/C++ file parsing. Solutions for Qt implementation wanted. File size: 68.5 MB:
The problem with QDateTime is that for every call the internal format parser (QDateTimeParser, private class) is re-created and needs to re-evaluate the string. This takes a lot of time.
Given how expensive it is to re-create that parser for every call, I wonder if Qt would consider making the parser public, instantiable, and reusable, so that when parsing a lot of same-formatted date strings, it could be created just once and applied to all of them? Kind of like Python's re.compile() for regular expressions.
e.g. something like:
// Hypothetical reusable parser class (not an existing Qt class)
QDateTimeStringParser* dateParser = new QDateTimeStringParser( "yyyyMMdd HHmmss zzz0000" );
QElapsedTimer* parseTimer1 = new QElapsedTimer();
parseTimer1->start();
for (int ii = 0; ii < allData.size() - 1; ii++) {
    QByteArrayList data = allData.at(ii).split(';');
    t.dt = dateParser->parse(data.at(0));
    t.last = data.at(1).toDouble();
    t.bid = data.at(2).toDouble();
    t.ask = data.at(3).toDouble();
    t.volume = data.at(4).toInt();
    instr.tickList.append(t);
}
qDebug().noquote() << QString("Qt parse time: %1ms").arg(parseTimer1->elapsed());
dateParser->deleteLater();
I wonder if that could make it significantly faster? parse() would have to be reentrant and stateless, of course (beyond the stored, immutable format string), which could still be tricky for a complex parser.
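To illustrate the "compile once, parse many" shape in isolation, here is a small sketch in plain C++17 with no Qt involved. CompiledDateTimeFormat and DateTimeFields are hypothetical names, not Qt API, and the sketch makes strong assumptions: it only understands fixed-width numeric fields written with the Qt-style letters y, M, d, H, m, s and z, treats every other format character as a literal that is skipped, and does no range validation. The constructor does the format analysis once; parse() is const and touches no mutable state.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical result type: broken-down fields, no time zone attached.
struct DateTimeFields {
    int year = 0, month = 0, day = 0;
    int hour = 0, minute = 0, second = 0, msec = 0;
};

// Hypothetical "compile once, parse many" helper; not part of Qt.
class CompiledDateTimeFormat {
public:
    // Scans the format a single time and records where each numeric field lives.
    explicit CompiledDateTimeFormat(std::string_view format) : length_(format.size()) {
        std::size_t i = 0;
        while (i < format.size()) {
            std::size_t j = i;
            while (j < format.size() && format[j] == format[i]) ++j;  // run of one repeated letter
            if (auto member = memberFor(format[i]))
                slots_.push_back(Slot{i, j - i, member});
            i = j;  // any other character is a literal and is skipped when parsing
        }
    }

    // const and free of mutable state, so one instance can serve many calls.
    std::optional<DateTimeFields> parse(std::string_view text) const {
        if (text.size() < length_) return std::nullopt;
        DateTimeFields out;
        for (const Slot& slot : slots_) {
            const char* first = text.data() + slot.offset;
            const char* last = first + slot.width;
            int value = 0;
            const auto [ptr, ec] = std::from_chars(first, last, value);
            if (ec != std::errc{} || ptr != last) return std::nullopt;
            out.*(slot.member) = value;
        }
        return out;
    }

private:
    using Member = int DateTimeFields::*;
    struct Slot {
        std::size_t offset;
        std::size_t width;
        Member member;
    };

    // Maps a format letter to the field it fills; nullptr means "literal".
    static Member memberFor(char c) {
        switch (c) {
        case 'y': return &DateTimeFields::year;
        case 'M': return &DateTimeFields::month;
        case 'd': return &DateTimeFields::day;
        case 'H': return &DateTimeFields::hour;
        case 'm': return &DateTimeFields::minute;
        case 's': return &DateTimeFields::second;
        case 'z': return &DateTimeFields::msec;
        default:  return nullptr;
        }
    }

    std::size_t length_;
    std::vector<Slot> slots_;
};
```

In a loop like the one above, the object would be constructed once before the loop and parse() called per line. A real QDateTime format has many more cases (text month names, AM/PM, time zones), which is where the "tricky for a complex parser" caveat applies.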